问题:重命名熊猫DataFrame索引

我有一个没有标头的csv文件,带有DateTime索引。我想重命名索引和列名,但是使用df.rename()仅重命名了列名。虫子?我正在使用0.12.0版本

In [2]: df = pd.read_csv(r'D:\Data\DataTimeSeries_csv//seriesSM.csv', header=None, parse_dates=[[0]], index_col=[0] )

In [3]: df.head()
Out[3]: 
                   1
0                   
2002-06-18  0.112000
2002-06-22  0.190333
2002-06-26  0.134000
2002-06-30  0.093000
2002-07-04  0.098667

In [4]: df.rename(index={0:'Date'}, columns={1:'SM'}, inplace=True)

In [5]: df.head()
Out[5]: 
                  SM
0                   
2002-06-18  0.112000
2002-06-22  0.190333
2002-06-26  0.134000
2002-06-30  0.093000
2002-07-04  0.098667

I’ve a csv file without header, with a DateTime index. I want to rename the index and column name, but with df.rename() only the column name is renamed. Bug? I’m on version 0.12.0

In [2]: df = pd.read_csv(r'D:\Data\DataTimeSeries_csv//seriesSM.csv', header=None, parse_dates=[[0]], index_col=[0] )

In [3]: df.head()
Out[3]: 
                   1
0                   
2002-06-18  0.112000
2002-06-22  0.190333
2002-06-26  0.134000
2002-06-30  0.093000
2002-07-04  0.098667

In [4]: df.rename(index={0:'Date'}, columns={1:'SM'}, inplace=True)

In [5]: df.head()
Out[5]: 
                  SM
0                   
2002-06-18  0.112000
2002-06-22  0.190333
2002-06-26  0.134000
2002-06-30  0.093000
2002-07-04  0.098667

回答 0

rename方法采用适用于索引的索引字典。
您想重命名为索引级别的名称:

df.index.names = ['Date']

考虑这一点的一种好方法是,列和索引是同一类型的对象(IndexMultiIndex),您可以通过转置来互换二者。

这有点令人困惑,因为索引名称与列具有相似的含义,因此这里有更多示例:

In [1]: df = pd.DataFrame([[1, 2, 3], [4, 5 ,6]], columns=list('ABC'))

In [2]: df
Out[2]: 
   A  B  C
0  1  2  3
1  4  5  6

In [3]: df1 = df.set_index('A')

In [4]: df1
Out[4]: 
   B  C
A      
1  2  3
4  5  6

您可以在索引上看到重命名,它可以更改 1:

In [5]: df1.rename(index={1: 'a'})
Out[5]: 
   B  C
A      
a  2  3
4  5  6

In [6]: df1.rename(columns={'B': 'BB'})
Out[6]: 
   BB  C
A       
1   2  3
4   5  6

重命名级别名称时:

In [7]: df1.index.names = ['index']
        df1.columns.names = ['column']

注意:此属性只是一个列表,您可以将其重命名为列表理解/映射。

In [8]: df1
Out[8]: 
column  B  C
index       
1       2  3
4       5  6

The rename method takes a dictionary for the index which applies to index values.
You want to rename to index level’s name:

df.index.names = ['Date']

A good way to think about this is that columns and index are the same type of object (Index or MultiIndex), and you can interchange the two via transpose.

This is a little bit confusing since the index names have a similar meaning to columns, so here are some more examples:

In [1]: df = pd.DataFrame([[1, 2, 3], [4, 5 ,6]], columns=list('ABC'))

In [2]: df
Out[2]: 
   A  B  C
0  1  2  3
1  4  5  6

In [3]: df1 = df.set_index('A')

In [4]: df1
Out[4]: 
   B  C
A      
1  2  3
4  5  6

You can see the rename on the index, which can change the value 1:

In [5]: df1.rename(index={1: 'a'})
Out[5]: 
   B  C
A      
a  2  3
4  5  6

In [6]: df1.rename(columns={'B': 'BB'})
Out[6]: 
   BB  C
A       
1   2  3
4   5  6

Whilst renaming the level names:

In [7]: df1.index.names = ['index']
        df1.columns.names = ['column']

Note: this attribute is just a list, and you could do the renaming as a list comprehension/map.

In [8]: df1
Out[8]: 
column  B  C
index       
1       2  3
4       5  6

回答 1

当前选择的答案未提及rename_axis可用于重命名索引和列级别的方法。


重命名索引级别时,Pandas具有一些古怪之处。还有一个新的DataFrame方法rename_axis可用于更改索引级别名称。

让我们看一下DataFrame

df = pd.DataFrame({'age':[30, 2, 12],
                       'color':['blue', 'green', 'red'],
                       'food':['Steak', 'Lamb', 'Mango'],
                       'height':[165, 70, 120],
                       'score':[4.6, 8.3, 9.0],
                       'state':['NY', 'TX', 'FL']},
                       index = ['Jane', 'Nick', 'Aaron'])

在此处输入图片说明

对于行索引和列索引,此DataFrame都有一个级别。行索引和列索引都没有名称。让我们将行索引级别的名称更改为“名称”。

df.rename_axis('names')

在此处输入图片说明

rename_axis方法还可以通过更改axis参数来更改列级别名称:

df.rename_axis('names').rename_axis('attributes', axis='columns')

在此处输入图片说明

如果使用某些列设置索引,则列名称将成为新的索引级别名称。让我们将索引级别附加到原始DataFrame上:

df1 = df.set_index(['state', 'color'], append=True)
df1

在此处输入图片说明

注意原始索引是如何没有名称的。我们仍然可以使用,rename_axis但需要向其传递与索引级别数相同长度的列表。

df1.rename_axis(['names', None, 'Colors'])

在此处输入图片说明

您可以None用来有效地删除索引级别名称。


系列工作类似,但有所不同

让我们创建一个具有三个索引级别的系列

s = df.set_index(['state', 'color'], append=True)['food']
s

       state  color
Jane   NY     blue     Steak
Nick   TX     green     Lamb
Aaron  FL     red      Mango
Name: food, dtype: object

我们可以rename_axis像使用DataFrames一样使用

s.rename_axis(['Names','States','Colors'])

Names  States  Colors
Jane   NY      blue      Steak
Nick   TX      green      Lamb
Aaron  FL      red       Mango
Name: food, dtype: object

请注意,该系列下面还有一个元数据,称为 Name。从DataFrame创建系列时,此属性设置为列名。

我们可以将字符串名称传递给rename方法以进行更改

s.rename('FOOOOOD')

       state  color
Jane   NY     blue     Steak
Nick   TX     green     Lamb
Aaron  FL     red      Mango
Name: FOOOOOD, dtype: object

DataFrames没有此属性,如果这样使用,事实上会引发异常

df.rename('my dataframe')
TypeError: 'str' object is not callable

在熊猫0.21之前,您可能曾用于rename_axis重命名索引和列中的值。它已被弃用,所以不要这样做

The currently selected answer does not mention the rename_axis method which can be used to rename the index and column levels.


Pandas has some quirkiness when it comes to renaming the levels of the index. There is also a new DataFrame method rename_axis available to change the index level names.

Let’s take a look at a DataFrame

df = pd.DataFrame({'age':[30, 2, 12],
                       'color':['blue', 'green', 'red'],
                       'food':['Steak', 'Lamb', 'Mango'],
                       'height':[165, 70, 120],
                       'score':[4.6, 8.3, 9.0],
                       'state':['NY', 'TX', 'FL']},
                       index = ['Jane', 'Nick', 'Aaron'])

enter image description here

This DataFrame has one level for each of the row and column indexes. Both the row and column index have no name. Let’s change the row index level name to ‘names’.

df.rename_axis('names')

enter image description here

The rename_axis method also has the ability to change the column level names by changing the axis parameter:

df.rename_axis('names').rename_axis('attributes', axis='columns')

enter image description here

If you set the index with some of the columns, then the column name will become the new index level name. Let’s append to index levels to our original DataFrame:

df1 = df.set_index(['state', 'color'], append=True)
df1

enter image description here

Notice how the original index has no name. We can still use rename_axis but need to pass it a list the same length as the number of index levels.

df1.rename_axis(['names', None, 'Colors'])

enter image description here

You can use None to effectively delete the index level names.


Series work similarly but with some differences

Let’s create a Series with three index levels

s = df.set_index(['state', 'color'], append=True)['food']
s

       state  color
Jane   NY     blue     Steak
Nick   TX     green     Lamb
Aaron  FL     red      Mango
Name: food, dtype: object

We can use rename_axis similarly to how we did with DataFrames

s.rename_axis(['Names','States','Colors'])

Names  States  Colors
Jane   NY      blue      Steak
Nick   TX      green      Lamb
Aaron  FL      red       Mango
Name: food, dtype: object

Notice that the there is an extra piece of metadata below the Series called Name. When creating a Series from a DataFrame, this attribute is set to the column name.

We can pass a string name to the rename method to change it

s.rename('FOOOOOD')

       state  color
Jane   NY     blue     Steak
Nick   TX     green     Lamb
Aaron  FL     red      Mango
Name: FOOOOOD, dtype: object

DataFrames do not have this attribute and infact will raise an exception if used like this

df.rename('my dataframe')
TypeError: 'str' object is not callable

Prior to pandas 0.21, you could have used rename_axis to rename the values in the index and columns. It has been deprecated so don’t do this


回答 2

对于较新的pandas版本

df.index = df.index.rename('new name')

要么

df.index.rename('new name', inplace=True)

如果数据框应保留其所有属性,则需要后者

For newer pandas versions

df.index = df.index.rename('new name')

or

df.index.rename('new name', inplace=True)

The latter is required if a data frame should retain all its properties.


回答 3

在Pandas 0.13及更高版本中,索引级别名称是不可变的(类型FrozenList),不能再直接设置。您必须首先使用Index.rename()将新的索引级别名称应用到Index,然后再使用DataFrame.reindex()将新的索引应用到DataFrame。例子:

对于熊猫版本<0.13

df.index.names = ['Date']

对于熊猫版本> = 0.13

df = df.reindex(df.index.rename(['Date']))

In Pandas version 0.13 and greater the index level names are immutable (type FrozenList) and can no longer be set directly. You must first use Index.rename() to apply the new index level names to the Index and then use DataFrame.reindex() to apply the new index to the DataFrame. Examples:

For Pandas version < 0.13

df.index.names = ['Date']

For Pandas version >= 0.13

df = df.reindex(df.index.rename(['Date']))

回答 4

您还可以Index.set_names如下使用:

In [25]: x = pd.DataFrame({'year':[1,1,1,1,2,2,2,2],
   ....:                   'country':['A','A','B','B','A','A','B','B'],
   ....:                   'prod':[1,2,1,2,1,2,1,2],
   ....:                   'val':[10,20,15,25,20,30,25,35]})

In [26]: x = x.set_index(['year','country','prod']).squeeze()

In [27]: x
Out[27]: 
year  country  prod
1     A        1       10
               2       20
      B        1       15
               2       25
2     A        1       20
               2       30
      B        1       25
               2       35
Name: val, dtype: int64
In [28]: x.index = x.index.set_names('foo', level=1)

In [29]: x
Out[29]: 
year  foo  prod
1     A    1       10
           2       20
      B    1       15
           2       25
2     A    1       20
           2       30
      B    1       25
           2       35
Name: val, dtype: int64

You can also use Index.set_names as follows:

In [25]: x = pd.DataFrame({'year':[1,1,1,1,2,2,2,2],
   ....:                   'country':['A','A','B','B','A','A','B','B'],
   ....:                   'prod':[1,2,1,2,1,2,1,2],
   ....:                   'val':[10,20,15,25,20,30,25,35]})

In [26]: x = x.set_index(['year','country','prod']).squeeze()

In [27]: x
Out[27]: 
year  country  prod
1     A        1       10
               2       20
      B        1       15
               2       25
2     A        1       20
               2       30
      B        1       25
               2       35
Name: val, dtype: int64
In [28]: x.index = x.index.set_names('foo', level=1)

In [29]: x
Out[29]: 
year  foo  prod
1     A    1       10
           2       20
      B    1       15
           2       25
2     A    1       20
           2       30
      B    1       25
           2       35
Name: val, dtype: int64

回答 5

如果要使用相同的映射来重命名列和索引,则可以执行以下操作:

mapping = {0:'Date', 1:'SM'}
df.index.names = list(map(lambda name: mapping.get(name, name), df.index.names))
df.rename(columns=mapping, inplace=True)

If you want to use the same mapping for renaming both columns and index you can do:

mapping = {0:'Date', 1:'SM'}
df.index.names = list(map(lambda name: mapping.get(name, name), df.index.names))
df.rename(columns=mapping, inplace=True)

回答 6

df.index.rename('new name', inplace=True)

是唯一为我完成工作的人(熊猫0.22.0)。
如果没有inplace = True,则在我的情况下不会设置索引名称。

df.index.rename('new name', inplace=True)

Is the only one that does the job for me (pandas 0.22.0).
Without the inplace=True, the name of the index is not set in my case.


回答 7

您可以使用index和的columns属性pandas.DataFrame。注意:列表元素的数量必须与行/列的数量匹配。

#       A   B   C
# ONE   11  12  13
# TWO   21  22  23
# THREE 31  32  33

df.index = [1, 2, 3]
df.columns = ['a', 'b', 'c']
print(df)

#     a   b   c
# 1  11  12  13
# 2  21  22  23
# 3  31  32  33

you can use index and columns attributes of pandas.DataFrame. NOTE: number of elements of list must match the number of rows/columns.

#       A   B   C
# ONE   11  12  13
# TWO   21  22  23
# THREE 31  32  33

df.index = [1, 2, 3]
df.columns = ['a', 'b', 'c']
print(df)

#     a   b   c
# 1  11  12  13
# 2  21  22  23
# 3  31  32  33

声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。