Select pandas rows based on list index - Python 实用宝典

Question: Select pandas rows based on list index

I have a DataFrame df:

   20060930  10.103       NaN     10.103   7.981
   20061231  15.915       NaN     15.915  12.686
   20070331   3.196       NaN      3.196   2.710
   20070630   7.907       NaN      7.907   6.459

Then I want to select the rows whose positions are given in a list, say [1, 3], so that only these remain:

   20061231  15.915       NaN     15.915  12.686
   20070630   7.907       NaN      7.907   6.459

How can I do that, and which function should I use?

Answer 0

ind_list = [1, 3]
df.ix[ind_list]

should do the trick! When I index data frames I always use the .ix indexer. It's so much easier and more flexible…

UPDATE: This is no longer the accepted way to index. The .ix indexer was deprecated in pandas 0.20.0 and removed in pandas 1.0. Use .iloc for integer-position-based indexing and .loc for label-based indexing.
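As a rough illustration of the update above, here is a minimal sketch of the .iloc and .loc replacements. It assumes the first column of the question's data is the index and invents the column names a to d, since the original frame shows no header.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame: the date-like first
# column is taken to be the index, and the column names are made up.
df = pd.DataFrame(
    {
        "a": [10.103, 15.915, 3.196, 7.907],
        "b": [np.nan] * 4,
        "c": [10.103, 15.915, 3.196, 7.907],
        "d": [7.981, 12.686, 2.710, 6.459],
    },
    index=[20060930, 20061231, 20070331, 20070630],
)

ind_list = [1, 3]

# Positional selection: rows at positions 1 and 3, whatever their labels.
by_position = df.iloc[ind_list]

# Label-based selection: the same two rows, addressed by index label.
by_label = df.loc[[20061231, 20070630]]

print(by_position)
```

With this index, df.loc[ind_list] would raise a KeyError, because 1 and 3 are positions here, not labels.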

Answer 1

You can also use iloc:

df.iloc[[1,3],:]

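Note that iloc selects by position, so it will not return the rows labeled 1 and 3 if the index labels no longer match the order of the rows (for example after sorting or filtering in earlier computations). To select by label in that case, use: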

df[df.index.isin([1,3])]

… as suggested in other responses.
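To make the difference concrete, here is a small sketch on a hypothetical one-column frame (the column name value is an assumption) whose rows have been reordered, so that positions and labels no longer agree.

```python
import pandas as pd

# Hypothetical frame with the default integer labels 0..3.
df = pd.DataFrame({"value": [10.103, 15.915, 3.196, 7.907]})

# Sorting reorders the rows but keeps the original labels: 1, 0, 3, 2.
df_sorted = df.sort_values("value", ascending=False)

# Positional: the second and fourth rows of the reordered frame.
positional = df_sorted.iloc[[1, 3]]

# Label-based: the rows whose index labels are 1 and 3.
labelled = df_sorted[df_sorted.index.isin([1, 3])]

print(positional)
print(labelled)
```

Here iloc returns the rows labeled 0 and 2, while the isin mask returns the rows labeled 1 and 3, kept in the frame's current order.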

Answer 2

Another way. The code is longer, but it is faster than the approaches above; check it yourself with the %timeit magic:

df[df.index.isin([1,3])]

PS: I'll leave it to you to figure out the reason.
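Outside IPython, a comparison along these lines can be sketched with the standard timeit module. The frame size and repeat count below are arbitrary assumptions, and which approach comes out faster depends on your data and pandas version, so treat the numbers as something to measure and not as a given.

```python
import timeit

import numpy as np
import pandas as pd

# Arbitrary test frame with the default integer index (an assumption).
df = pd.DataFrame({"value": np.arange(100_000, dtype=float)})
rows = [1, 3]

# Both approaches select the same rows when labels equal positions.
via_iloc = df.iloc[rows]
via_isin = df[df.index.isin(rows)]

# Time each approach over the same number of repetitions.
t_iloc = timeit.timeit(lambda: df.iloc[rows], number=200)
t_isin = timeit.timeit(lambda: df[df.index.isin(rows)], number=200)

print(f"iloc: {t_iloc:.4f}s  isin mask: {t_isin:.4f}s")
```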

Answer 3

For large datasets, it is memory efficient to read only selected rows via the skiprows parameter.

Example

pred = lambda x: x not in [1, 3]
pd.read_csv("data.csv", skiprows=pred, index_col=0, names=...)

This returns a DataFrame built from the file, skipping every row except those at row indices 1 and 3 (counted from 0).

Details

From the docs:

skiprows : list-like or integer or callable, default None

…

If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2]

This feature is available in pandas 0.20.0 and later. See also the corresponding issue and a related post.
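A self-contained sketch of the same idea follows, reading from an in-memory string so it runs without a file. The CSV layout and the column names are assumptions made up for the example; the original answer leaves names elided.

```python
import io

import pandas as pd

# Assumed file contents: the question's four rows, with no header line.
csv_text = """20060930,10.103,,10.103,7.981
20061231,15.915,,15.915,12.686
20070331,3.196,,3.196,2.710
20070630,7.907,,7.907,6.459
"""

wanted = {1, 3}  # zero-based row indices to keep

df = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=lambda i: i not in wanted,  # True means "skip this row"
    index_col=0,
    names=["date", "a", "b", "c", "d"],  # hypothetical column names
)

print(df)
```

Because the predicate is applied while parsing, the skipped rows are never materialized in the DataFrame. If the file had a header line, it would count as row 0 and the wanted indices would need to shift accordingly.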
