Selecting a row of pandas Series/DataFrame by integer index - Python 实用宝典

Question: Selecting a row of pandas Series/DataFrame by integer index

I am curious as to why df[2] is not supported, while df.ix[2] and df[2:3] both work.

In [26]: df.ix[2]
Out[26]: 
A    1.027680
B    1.514210
C   -1.466963
D   -0.162339
Name: 2000-01-03 00:00:00

In [27]: df[2:3]
Out[27]: 
                  A        B         C         D
2000-01-03  1.02768  1.51421 -1.466963 -0.162339

I would expect df[2] to work the same way as df[2:3] to be consistent with Python indexing convention. Is there a design reason for not supporting indexing a row by a single integer?

Answer 0

echoing @HYRY, see the new docs in 0.11

http://pandas.pydata.org/pandas-docs/stable/indexing.html

Here we have new operators: .iloc to explicitly support only integer indexing, and .loc to explicitly support only label indexing.

e.g. imagine this scenario

In [1]: df = pd.DataFrame(np.random.randn(5,2),index=range(0,10,2),columns=list('AB'))

In [2]: df
Out[2]: 
          A         B
0  1.068932 -0.794307
2 -0.470056  1.192211
4 -0.284561  0.756029
6  1.037563 -0.267820
8 -0.538478 -0.800654

In [5]: df.iloc[[2]]
Out[5]: 
          A         B
4 -0.284561  0.756029

In [6]: df.loc[[2]]
Out[6]: 
          A         B
2 -0.470056  1.192211

[] slices the rows (by label location) only
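To make the difference concrete, here is a minimal sketch with the same index layout as above (labels 0, 2, 4, 6, 8). The values come from np.arange rather than random numbers so the result is reproducible; that choice, and the variable names, are mine and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Index labels are 0, 2, 4, 6, 8, so label and position disagree after the first row.
df = pd.DataFrame(
    np.arange(10).reshape(5, 2),
    index=range(0, 10, 2),
    columns=list("AB"),
)

row_by_position = df.iloc[2]  # third row (position 2), whose label is 4
row_by_label = df.loc[2]      # the row whose label is 2, i.e. the second row

# Passing a list keeps the result as a one-row DataFrame instead of a Series.
one_row_frame = df.iloc[[2]]

print(row_by_position.name, row_by_label.name)
```

With a single integer, both accessors return a Series; wrapping the integer in a list, as in the session above, returns a DataFrame.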

Answer 1

The primary purpose of the DataFrame indexing operator, `[]`, is to select columns.

When the indexing operator is passed a string or integer, it attempts to find a column with that particular name and return it as a Series.

So, in the question above: df[2] searches for a column name matching the integer value 2. This column does not exist and a KeyError is raised.
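A small sketch of that lookup, using made-up frames (the column names and values here are illustrative assumptions, not taken from the question):

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

col = df["A"]  # a column name that exists: returns that column as a Series

# No column is named 2, so the lookup fails even though row position 2 exists.
try:
    df[2]
    raised = False
except KeyError:
    raised = True

# If a column really is named 2, the same expression returns that column.
df_int_cols = pd.DataFrame({2: [10, 20], 3: [30, 40]})
col_two = df_int_cols[2]

print(raised)
```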

The DataFrame indexing operator completely changes behavior to select rows when slice notation is used

Strangely, when given a slice, the DataFrame indexing operator selects rows and can do so by integer location or by index label.

df[2:3]

This will slice beginning from the row with integer location 2 up to 3, exclusive of the last element. So, just a single row. The following selects rows beginning at integer location 6 up to but not including 20 by every third row.

df[6:20:3]

You can also use slices consisting of string labels if your DataFrame index has strings in it. For more details, see this solution on .iloc vs .loc.
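Roughly, the slicing cases described above look like this. The frame, its string index, and the variable names are assumptions made for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"x": range(6)}, index=list("abcdef"))

by_position = df[2:3]   # integer slice: the row at position 2 only, end excluded
stepped = df[0:6:2]     # positions 0, 2, 4
by_label = df["b":"d"]  # label slice: both endpoints are included

# The explicit accessors give the same rows and say which rule is in use.
same_position = by_position.equals(df.iloc[2:3])
same_label = by_label.equals(df.loc["b":"d"])

print(list(by_position.index), list(stepped.index), list(by_label.index))
```

Note that the label slice includes its end point while the integer slice does not, which is one more reason to prefer the explicit .loc/.iloc forms.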

I almost never use this slice notation with the indexing operator, as it's not explicit and hardly ever used. When slicing by rows, stick with .loc/.iloc.

Answer 2

You can think of a DataFrame as a dict of Series. df[key] tries to select the column whose name is key and returns a Series object.

However, slicing inside [] slices the rows, because it's a very common operation.

You can read the documentation for details:

http://pandas.pydata.org/pandas-docs/stable/indexing.html#basics

Answer 3

For position-based access to a pandas table, one can also convert the table to a NumPy array. Older pandas versions used df.as_matrix() for this; it was removed in pandas 1.0, and df.to_numpy() is the current equivalent:

np_df = df.to_numpy()

and then

np_df[i]

would work.
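One thing to keep in mind with the NumPy route is that the row and column labels are dropped. Here is a minimal sketch of the trade-off, assuming a pandas version that provides DataFrame.to_numpy() (0.24 or later); the frame and the names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1.5, 2.5, 3.5], "B": [10.0, 20.0, 30.0]},
    index=["x", "y", "z"],
)

arr = df.to_numpy()         # plain 2-D ndarray; index and column labels are gone
row_as_array = arr[1]       # values of the second row only

row_as_series = df.iloc[1]  # same values, but still labelled by column, with name "y"

print(row_as_array, row_as_series.name)
```

If the labels matter later on, df.iloc[i] is probably the safer choice.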

Answer 4

You can take a look at the source code.

DataFrame has a private method, _slice(), that slices the DataFrame and takes an axis parameter to determine which axis to slice. DataFrame.__getitem__() does not pass an axis when it calls _slice(), so _slice() falls back to its default, axis 0 (rows).

You can try a simple experiment that might help:

print(df._slice(slice(0, 2)))
print(df._slice(slice(0, 2), 0))
print(df._slice(slice(0, 2), 1))

Answer 5

You can loop through the rows of the data frame like this.

for ad in range(len(dataframe_c)):
    print(dataframe_c.values[ad])
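As a hedged alternative sketch, the same row-by-row walk can stay inside pandas, either by position with .iloc or with itertuples(). The contents of dataframe_c below are invented for the example.

```python
import pandas as pd

dataframe_c = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# len() of a DataFrame is its number of rows, so this visits every row once.
rows = []
for i in range(len(dataframe_c)):
    rows.append(dataframe_c.iloc[i].tolist())

# itertuples() yields one namedtuple per row and avoids building a Series each time.
pairs = [(row.A, row.B) for row in dataframe_c.itertuples(index=False)]

print(rows)
print(pairs)
```

Note that .size is rows times columns, so it is not a safe upper bound for a loop over rows.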
