How to sort a pandas DataFrame by one column - Python 实用宝典

Question: How to sort a pandas DataFrame by one column

I have a data frame like this:

print(df)

        0          1     2
0   354.7      April   4.0
1    55.4     August   8.0
2   176.5   December  12.0
3    95.5   February   2.0
4    85.6    January   1.0
5     152       July   7.0
6   238.7       June   6.0
7   104.8      March   3.0
8   283.5        May   5.0
9   278.8   November  11.0
10  249.6    October  10.0
11  212.7  September   9.0

As you can see, months are not in calendar order. So I created a second column to get the month number corresponding to each month (1-12). From there, how can I sort this data frame according to calendar months’ order?

Answer 0

Use sort_values to sort the df by a specific column’s values:

In [18]:
df.sort_values('2')

Out[18]:
        0          1     2
4    85.6    January   1.0
3    95.5   February   2.0
7   104.8      March   3.0
0   354.7      April   4.0
8   283.5        May   5.0
6   238.7       June   6.0
5   152.0       July   7.0
1    55.4     August   8.0
11  212.7  September   9.0
10  249.6    October  10.0
9   278.8   November  11.0
2   176.5   December  12.0

If you want to sort by two columns, pass a list of column labels to sort_values with the column labels ordered according to sort priority. If you use df.sort_values(['2', '0']), the result would be sorted by column 2 then column 0. Granted, this does not really make sense for this example because each value in df['2'] is unique.
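
A minimal, self-contained sketch of both calls. It assumes the column labels are the strings '0', '1' and '2', as the code in this answer implies; if your labels are integers, pass 2 instead of '2'. The small frame and its repeated month number are made up purely so that the second sort key has a visible effect.

```python
import pandas as pd

# Hypothetical data: column labels are assumed to be strings.
df = pd.DataFrame({
    '0': [354.7, 55.4, 85.6, 40.0],
    '1': ['April', 'August', 'January', 'January'],
    '2': [4.0, 8.0, 1.0, 1.0],
})

# Sort by the month-number column only.
by_month = df.sort_values('2')

# Sort by month number first, then break ties with column '0'.
by_month_then_value = df.sort_values(['2', '0'])

print(by_month_then_value)
```

In the second result, the two January rows are ordered by their value in column '0', which the single-column sort does not guarantee.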

Answer 1

I tried the solutions above without success, so I found a different one that works for me. Passing ascending=False sorts the data frame in descending order; the default is True. I am using Python 3.6.6 and pandas 0.23.4.

final_df = df.sort_values(by=['2'], ascending=False)

You can find more details in the pandas documentation for sort_values.
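
One possible reason a sort appears to have no effect (an assumption, since this answer does not say what went wrong) is that sort_values returns a new data frame and leaves the original untouched. The sketch below, on made-up data with string column labels, shows that the result has to be assigned, as final_df is in the line above.

```python
import pandas as pd

# Hypothetical data with string column labels.
df = pd.DataFrame({
    '1': ['April', 'January', 'December'],
    '2': [4.0, 1.0, 12.0],
})

# Calling sort_values without using the result leaves df unchanged.
df.sort_values(by=['2'], ascending=False)
unchanged_order = df['1'].tolist()

# Assign the result to keep the sorted frame (descending month number).
final_df = df.sort_values(by=['2'], ascending=False)
sorted_order = final_df['1'].tolist()

print(unchanged_order)
print(sorted_order)
```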

Answer 2

Just adding one more operation on the data. Suppose we have a data frame df like the one below; we can chain several operations to get the desired output:

ID         cost      tax    label
1       216590      1600    test      
2       523213      1800    test 
3          250      1500    experiment

(df['label'].value_counts().to_frame().reset_index()).sort_values('label', ascending=False)

This gives the label counts, sorted, as a data frame:

    index   label
0   test        2
1   experiment  1
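
The column names produced by value_counts().to_frame().reset_index() differ between pandas versions: pandas 2.0 and later name them label and count rather than index and label, so sorting by 'label' there sorts by the label text instead of the counts. A sketch that names both columns explicitly, using the sample data above; the column name count is my own choice, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'cost': [216590, 523213, 250],
    'tax': [1600, 1800, 1500],
    'label': ['test', 'test', 'experiment'],
})

label_counts = (
    df['label']
    .value_counts()
    .rename_axis('label')        # name the index so reset_index yields a 'label' column
    .reset_index(name='count')   # name the counts column explicitly
    .sort_values('count', ascending=False)
)

print(label_counts)
```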

Answer 3

Just as another solution:

Instead of creating the second column, you can convert your string data (the month names) to an ordered categorical and sort by it, like this:

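import pandas as pd
# Give the month-name column a readable label (here the label is the integer 1).
df.rename(columns={1: 'month'}, inplace=True)
# Categories are listed in reverse calendar order, so the descending sort below puts January first.
df['month'] = pd.Categorical(df['month'], categories=['December', 'November', 'October', 'September', 'August', 'July', 'June', 'May', 'April', 'March', 'February', 'January'], ordered=True)
df = df.sort_values('month', ascending=False)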

This sorts the data by month name, in the order you specified when creating the Categorical object.
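
A variant of the same idea, sketched on a few rows from the question: listing the categories in calendar order lets the default ascending sort put January first, so ascending=False is not needed. The column names value and month and the constant MONTHS are illustrative choices.

```python
import pandas as pd

MONTHS = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']

df = pd.DataFrame({
    'value': [354.7, 55.4, 176.5, 95.5, 85.6],
    'month': ['April', 'August', 'December', 'February', 'January'],
})

# An ordered categorical sorts by category position, not alphabetically.
df['month'] = pd.Categorical(df['month'], categories=MONTHS, ordered=True)
df = df.sort_values('month')

print(df['month'].tolist())
```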
