问题:将Pandas Multi-Index转换为专栏

我有一个具有2个索引级别的数据框:

                         value
Trial    measurement
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

我想变成这样:

Trial    measurement       value

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

我怎样才能最好地做到这一点?

我需要这样做是因为我想按照此处的指示汇总数据,但是如果将它们用作索引,则无法选择这样的列。

I have a dataframe with 2 index levels:

                         value
Trial    measurement
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

Which I want to turn into this:

Trial    measurement       value

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

How can I best do this?

I need this because I want to aggregate the data as instructed here, but I can’t select my columns like that if they are in use as indices.


回答 0

所述reset_index()是一个数据帧熊猫方法,将索引值转移到数据帧为列。参数的默认设置为drop = False(将索引值保留为列)。

您只需.reset_index(inplace=True)在DataFrame名称后添加:

df.reset_index(inplace=True)  

The reset_index() is a pandas DataFrame method that will transfer index values into the DataFrame as columns. The default setting for the parameter is drop=False (which will keep the index values as columns).

All you have to do add .reset_index(inplace=True) after the name of the DataFrame:

df.reset_index(inplace=True)  

回答 1

这并不是真的适用于您的情况,但可能有助于其他人(例如5分钟前的我自己)知道。如果一个人的多重数具有如下相同的名称:

                         value
Trial        Trial
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

df.reset_index(inplace=True) 将会失败,因为创建的列不能具有相同的名称。

因此,您需要将multindex重命名为df.index = df.index.set_names(['Trial', 'measurement'])

                           value
Trial    measurement       

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

然后df.reset_index(inplace=True)将像魅力一样工作。

在按年和月对名为datetime的列(不是索引)进行分组之后,我遇到了这个问题live_date,这意味着年和月都被命名了live_date

This doesn’t really apply to your case but could be helpful for others (like myself 5 minutes ago) to know. If one’s multindex have the same name like this:

                         value
Trial        Trial
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

df.reset_index(inplace=True) will fail, cause the columns that are created cannot have the same names.

So then you need to rename the multindex with df.index = df.index.set_names(['Trial', 'measurement']) to get:

                           value
Trial    measurement       

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

And then df.reset_index(inplace=True) will work like a charm.

I encountered this problem after grouping by year and month on a datetime-column(not index) called live_date, which meant that both year and month were named live_date.


回答 2

正如@ cs95在评论中提到的,要仅降低一个级别,请使用:

df.reset_index(level=[...])

这样可以避免在重置后必须重新定义所需的索引。

As @cs95 mentioned in a comment, to drop only one level, use:

df.reset_index(level=[...])

This avoids having to redefine your desired index after reset.


声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。