标签归档:columnname

熊猫索引列标题或名称

问题:熊猫索引列标题或名称

如何获取python pandas中的索引列名称?这是一个示例数据框:

             Column 1
Index Title          
Apples              1
Oranges             2
Puppies             3
Ducks               4  

我想做的是获取/设置数据框索引标题。这是我尝试过的:

import pandas as pd
data = {'Column 1'     : [1., 2., 3., 4.],
        'Index Title'  : ["Apples", "Oranges", "Puppies", "Ducks"]}
df = pd.DataFrame(data)
df.index = df["Index Title"]
del df["Index Title"]
print df

有人知道怎么做吗?

How do I get the index column name in python pandas? Here’s an example dataframe:

             Column 1
Index Title          
Apples              1
Oranges             2
Puppies             3
Ducks               4  

What I’m trying to do is get/set the dataframe index title. Here is what i tried:

import pandas as pd
data = {'Column 1'     : [1., 2., 3., 4.],
        'Index Title'  : ["Apples", "Oranges", "Puppies", "Ducks"]}
df = pd.DataFrame(data)
df.index = df["Index Title"]
del df["Index Title"]
print df

Anyone know how to do this?


回答 0

您可以通过其name属性获取/设置索引

In [7]: df.index.name
Out[7]: 'Index Title'

In [8]: df.index.name = 'foo'

In [9]: df.index.name
Out[9]: 'foo'

In [10]: df
Out[10]: 
         Column 1
foo              
Apples          1
Oranges         2
Puppies         3
Ducks           4

You can just get/set the index via its name property

In [7]: df.index.name
Out[7]: 'Index Title'

In [8]: df.index.name = 'foo'

In [9]: df.index.name
Out[9]: 'foo'

In [10]: df
Out[10]: 
         Column 1
foo              
Apples          1
Oranges         2
Puppies         3
Ducks           4

回答 1

您可以使用rename_axis删除设置为None

d = {'Index Title': ['Apples', 'Oranges', 'Puppies', 'Ducks'],'Column 1': [1.0, 2.0, 3.0, 4.0]}
df = pd.DataFrame(d).set_index('Index Title')
print (df)
             Column 1
Index Title          
Apples            1.0
Oranges           2.0
Puppies           3.0
Ducks             4.0

print (df.index.name)
Index Title

print (df.columns.name)
None

新功能在方法链中效果很好。

df = df.rename_axis('foo')
print (df)
         Column 1
foo              
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

您还可以使用参数重命名列名称axis

d = {'Index Title': ['Apples', 'Oranges', 'Puppies', 'Ducks'],'Column 1': [1.0, 2.0, 3.0, 4.0]}
df = pd.DataFrame(d).set_index('Index Title').rename_axis('Col Name', axis=1)
print (df)
Col Name     Column 1
Index Title          
Apples            1.0
Oranges           2.0
Puppies           3.0
Ducks             4.0

print (df.index.name)
Index Title

print (df.columns.name)
Col Name
print df.rename_axis('foo').rename_axis("bar", axis="columns")
bar      Column 1
foo              
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

print df.rename_axis('foo').rename_axis("bar", axis=1)
bar      Column 1
foo              
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

从版本pandas 0.24.0+可以使用参数indexcolumns

df = df.rename_axis(index='foo', columns="bar")
print (df)
bar      Column 1
foo              
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

删除索引和列名称意味着将其设置为None

df = df.rename_axis(index=None, columns=None)
print (df)
         Column 1
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

如果MultiIndex仅在索引中:

mux = pd.MultiIndex.from_arrays([['Apples', 'Oranges', 'Puppies', 'Ducks'],
                                  list('abcd')], 
                                  names=['index name 1','index name 1'])


df = pd.DataFrame(np.random.randint(10, size=(4,6)), 
                  index=mux, 
                  columns=list('ABCDEF')).rename_axis('col name', axis=1)
print (df)
col name                   A  B  C  D  E  F
index name 1 index name 1                  
Apples       a             5  4  0  5  2  2
Oranges      b             5  8  2  5  9  9
Puppies      c             7  6  0  7  8  3
Ducks        d             6  5  0  1  6  0

print (df.index.name)
None

print (df.columns.name)
col name

print (df.index.names)
['index name 1', 'index name 1']

print (df.columns.names)
['col name']

df1 = df.rename_axis(('foo','bar'))
print (df1)
col name     A  B  C  D  E  F
foo     bar                  
Apples  a    5  4  0  5  2  2
Oranges b    5  8  2  5  9  9
Puppies c    7  6  0  7  8  3
Ducks   d    6  5  0  1  6  0

df2 = df.rename_axis('baz', axis=1)
print (df2)
baz                        A  B  C  D  E  F
index name 1 index name 1                  
Apples       a             5  4  0  5  2  2
Oranges      b             5  8  2  5  9  9
Puppies      c             7  6  0  7  8  3
Ducks        d             6  5  0  1  6  0

df2 = df.rename_axis(index=('foo','bar'), columns='baz')
print (df2)
baz          A  B  C  D  E  F
foo     bar                  
Apples  a    5  4  0  5  2  2
Oranges b    5  8  2  5  9  9
Puppies c    7  6  0  7  8  3
Ducks   d    6  5  0  1  6  0

删除索引和列名称意味着将其设置为None

df2 = df.rename_axis(index=(None,None), columns=None)
print (df2)

           A  B  C  D  E  F
Apples  a  6  9  9  5  4  6
Oranges b  2  6  7  4  3  5
Puppies c  6  3  6  3  5  1
Ducks   d  4  9  1  3  0  5

对于MultiIndexin索引和列,有必要.names改为.name使用list或tuple进行设置:

mux1 = pd.MultiIndex.from_arrays([['Apples', 'Oranges', 'Puppies', 'Ducks'],
                                  list('abcd')], 
                                  names=['index name 1','index name 1'])


mux2 = pd.MultiIndex.from_product([list('ABC'),
                                  list('XY')], 
                                  names=['col name 1','col name 2'])

df = pd.DataFrame(np.random.randint(10, size=(4,6)), index=mux1, columns=mux2)
print (df)
col name 1                 A     B     C   
col name 2                 X  Y  X  Y  X  Y
index name 1 index name 1                  
Apples       a             2  9  4  7  0  3
Oranges      b             9  0  6  0  9  4
Puppies      c             2  4  6  1  4  4
Ducks        d             6  6  7  1  2  8

检查/设置值必须为复数:

print (df.index.name)
None

print (df.columns.name)
None

print (df.index.names)
['index name 1', 'index name 1']

print (df.columns.names)
['col name 1', 'col name 2']

df1 = df.rename_axis(('foo','bar'))
print (df1)
col name 1   A     B     C   
col name 2   X  Y  X  Y  X  Y
foo     bar                  
Apples  a    2  9  4  7  0  3
Oranges b    9  0  6  0  9  4
Puppies c    2  4  6  1  4  4
Ducks   d    6  6  7  1  2  8

df2 = df.rename_axis(('baz','bak'), axis=1)
print (df2)
baz                        A     B     C   
bak                        X  Y  X  Y  X  Y
index name 1 index name 1                  
Apples       a             2  9  4  7  0  3
Oranges      b             9  0  6  0  9  4
Puppies      c             2  4  6  1  4  4
Ducks        d             6  6  7  1  2  8

df2 = df.rename_axis(index=('foo','bar'), columns=('baz','bak'))
print (df2)
baz          A     B     C   
bak          X  Y  X  Y  X  Y
foo     bar                  
Apples  a    2  9  4  7  0  3
Oranges b    9  0  6  0  9  4
Puppies c    2  4  6  1  4  4
Ducks   d    6  6  7  1  2  8

删除索引和列名称意味着将其设置为None

df2 = df.rename_axis(index=(None,None), columns=(None,None))
print (df2)

           A     B     C   
           X  Y  X  Y  X  Y
Apples  a  2  0  2  5  2  0
Oranges b  1  7  5  5  4  8
Puppies c  2  4  6  3  6  5
Ducks   d  9  6  3  9  7  0

和@Jeff解决方案:

df.index.names = ['foo','bar']
df.columns.names = ['baz','bak']
print (df)

baz          A     B     C   
bak          X  Y  X  Y  X  Y
foo     bar                  
Apples  a    3  4  7  3  3  3
Oranges b    1  2  5  8  1  0
Puppies c    9  6  3  9  6  3
Ducks   d    3  2  1  0  1  0

You can use rename_axis, for removing set to None:

d = {'Index Title': ['Apples', 'Oranges', 'Puppies', 'Ducks'],'Column 1': [1.0, 2.0, 3.0, 4.0]}
df = pd.DataFrame(d).set_index('Index Title')
print (df)
             Column 1
Index Title          
Apples            1.0
Oranges           2.0
Puppies           3.0
Ducks             4.0

print (df.index.name)
Index Title

print (df.columns.name)
None

The new functionality works well in method chains.

df = df.rename_axis('foo')
print (df)
         Column 1
foo              
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

You can also rename column names with parameter axis:

d = {'Index Title': ['Apples', 'Oranges', 'Puppies', 'Ducks'],'Column 1': [1.0, 2.0, 3.0, 4.0]}
df = pd.DataFrame(d).set_index('Index Title').rename_axis('Col Name', axis=1)
print (df)
Col Name     Column 1
Index Title          
Apples            1.0
Oranges           2.0
Puppies           3.0
Ducks             4.0

print (df.index.name)
Index Title

print (df.columns.name)
Col Name
print df.rename_axis('foo').rename_axis("bar", axis="columns")
bar      Column 1
foo              
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

print df.rename_axis('foo').rename_axis("bar", axis=1)
bar      Column 1
foo              
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

From version pandas 0.24.0+ is possible use parameter index and columns:

df = df.rename_axis(index='foo', columns="bar")
print (df)
bar      Column 1
foo              
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

Removing index and columns names means set it to None:

df = df.rename_axis(index=None, columns=None)
print (df)
         Column 1
Apples        1.0
Oranges       2.0
Puppies       3.0
Ducks         4.0

If MultiIndex in index only:

mux = pd.MultiIndex.from_arrays([['Apples', 'Oranges', 'Puppies', 'Ducks'],
                                  list('abcd')], 
                                  names=['index name 1','index name 1'])


df = pd.DataFrame(np.random.randint(10, size=(4,6)), 
                  index=mux, 
                  columns=list('ABCDEF')).rename_axis('col name', axis=1)
print (df)
col name                   A  B  C  D  E  F
index name 1 index name 1                  
Apples       a             5  4  0  5  2  2
Oranges      b             5  8  2  5  9  9
Puppies      c             7  6  0  7  8  3
Ducks        d             6  5  0  1  6  0

print (df.index.name)
None

print (df.columns.name)
col name

print (df.index.names)
['index name 1', 'index name 1']

print (df.columns.names)
['col name']

df1 = df.rename_axis(('foo','bar'))
print (df1)
col name     A  B  C  D  E  F
foo     bar                  
Apples  a    5  4  0  5  2  2
Oranges b    5  8  2  5  9  9
Puppies c    7  6  0  7  8  3
Ducks   d    6  5  0  1  6  0

df2 = df.rename_axis('baz', axis=1)
print (df2)
baz                        A  B  C  D  E  F
index name 1 index name 1                  
Apples       a             5  4  0  5  2  2
Oranges      b             5  8  2  5  9  9
Puppies      c             7  6  0  7  8  3
Ducks        d             6  5  0  1  6  0

df2 = df.rename_axis(index=('foo','bar'), columns='baz')
print (df2)
baz          A  B  C  D  E  F
foo     bar                  
Apples  a    5  4  0  5  2  2
Oranges b    5  8  2  5  9  9
Puppies c    7  6  0  7  8  3
Ducks   d    6  5  0  1  6  0

Removing index and columns names means set it to None:

df2 = df.rename_axis(index=(None,None), columns=None)
print (df2)

           A  B  C  D  E  F
Apples  a  6  9  9  5  4  6
Oranges b  2  6  7  4  3  5
Puppies c  6  3  6  3  5  1
Ducks   d  4  9  1  3  0  5

For MultiIndex in index and columns is necessary working with .names instead .name and set by list or tuples:

mux1 = pd.MultiIndex.from_arrays([['Apples', 'Oranges', 'Puppies', 'Ducks'],
                                  list('abcd')], 
                                  names=['index name 1','index name 1'])


mux2 = pd.MultiIndex.from_product([list('ABC'),
                                  list('XY')], 
                                  names=['col name 1','col name 2'])

df = pd.DataFrame(np.random.randint(10, size=(4,6)), index=mux1, columns=mux2)
print (df)
col name 1                 A     B     C   
col name 2                 X  Y  X  Y  X  Y
index name 1 index name 1                  
Apples       a             2  9  4  7  0  3
Oranges      b             9  0  6  0  9  4
Puppies      c             2  4  6  1  4  4
Ducks        d             6  6  7  1  2  8

Plural is necessary for check/set values:

print (df.index.name)
None

print (df.columns.name)
None

print (df.index.names)
['index name 1', 'index name 1']

print (df.columns.names)
['col name 1', 'col name 2']

df1 = df.rename_axis(('foo','bar'))
print (df1)
col name 1   A     B     C   
col name 2   X  Y  X  Y  X  Y
foo     bar                  
Apples  a    2  9  4  7  0  3
Oranges b    9  0  6  0  9  4
Puppies c    2  4  6  1  4  4
Ducks   d    6  6  7  1  2  8

df2 = df.rename_axis(('baz','bak'), axis=1)
print (df2)
baz                        A     B     C   
bak                        X  Y  X  Y  X  Y
index name 1 index name 1                  
Apples       a             2  9  4  7  0  3
Oranges      b             9  0  6  0  9  4
Puppies      c             2  4  6  1  4  4
Ducks        d             6  6  7  1  2  8

df2 = df.rename_axis(index=('foo','bar'), columns=('baz','bak'))
print (df2)
baz          A     B     C   
bak          X  Y  X  Y  X  Y
foo     bar                  
Apples  a    2  9  4  7  0  3
Oranges b    9  0  6  0  9  4
Puppies c    2  4  6  1  4  4
Ducks   d    6  6  7  1  2  8

Removing index and columns names means set it to None:

df2 = df.rename_axis(index=(None,None), columns=(None,None))
print (df2)

           A     B     C   
           X  Y  X  Y  X  Y
Apples  a  2  0  2  5  2  0
Oranges b  1  7  5  5  4  8
Puppies c  2  4  6  3  6  5
Ducks   d  9  6  3  9  7  0

And @Jeff solution:

df.index.names = ['foo','bar']
df.columns.names = ['baz','bak']
print (df)

baz          A     B     C   
bak          X  Y  X  Y  X  Y
foo     bar                  
Apples  a    3  4  7  3  3  3
Oranges b    1  2  5  8  1  0
Puppies c    9  6  3  9  6  3
Ducks   d    3  2  1  0  1  0

回答 2

df.index.name 应该可以。

Python具有dir让您查询对象属性的功能。dir(df.index)在这里很有帮助。

df.index.name should do the trick.

Python has a dir function that let’s you query object attributes. dir(df.index) was helpful here.


回答 3

df.index.rename('foo', inplace=True)来设置索引名。

似乎该API自pandas 0.13起可用。

Use df.index.rename('foo', inplace=True) to set the index name.

Seems this api is available since pandas 0.13.


回答 4

如果您不想创建新行,而只是将其放在空单元格中,请使用:

df.columns.name = 'foo'

否则使用:

df.index.name = 'foo'

If you do not want to create a new row but simply put it in the empty cell then use:

df.columns.name = 'foo'

Otherwise use:

df.index.name = 'foo'

回答 5

df.columns.values 也给我们列名

df.columns.values also give us the column names


回答 6

多索引解决方案位于jezrael的百科全书答案中,但是花了我一段时间才找到它,所以我发布了一个新答案:

df.index.names 给出多索引的名称(作为“冻结列表”)。

The solution for multi-indexes is inside jezrael’s cyclopedic answer, but it took me a while to find it so I am posting a new answer:

df.index.names gives the names of a multi-index (as a Frozenlist).


回答 7

只获取索引列名 df.index.names熊猫的最新版本开始,将对单个Index或MultiIndex都适用。

作为在尝试找到获取索引名+列名列表的最佳方法时发现此问题的人,我会发现此答案很有用:

names = list(filter(None, df.index.names + df.columns.values.tolist()))

这不适用于没有索引,单列索引或多索引。它避免了调用reset_index(),因为这种简单的操作会对性能造成不必要的影响。我很惊讶没有一个内置的方法(我遇到过)。我猜想我经常遇到这种情况,因为我正在从数据帧索引映射到主键/唯一键的数据库中穿梭数据,但实际上这只是我的另一列。

To just get the index column names df.index.names will work for both a single Index or MultiIndex as of the most recent version of pandas.

As someone who found this while trying to find the best way to get a list of index names + column names, I would have found this answer useful:

names = list(filter(None, df.index.names + df.columns.values.tolist()))

This works for no index, single column Index, or MultiIndex. It avoids calling reset_index() which has an unnecessary performance hit for such a simple operation. I’m surprised there isn’t a built in method for this (that I’ve come across). I guess I run into needing this more often because I’m shuttling data from databases where the dataframe index maps to a primary/unique key, but is really just another column to me.


回答 8

设置索引名称也可以在创建时完成:

pd.DataFrame(data={'age': [10,20,30], 'height': [100, 170, 175]}, index=pd.Series(['a', 'b', 'c'], name='Tag'))

Setting the index name can also be accomplished at creation:

pd.DataFrame(data={'age': [10,20,30], 'height': [100, 170, 175]}, index=pd.Series(['a', 'b', 'c'], name='Tag'))