Question: How do I get the first column of a pandas DataFrame as a Series?
I tried:
x = pandas.DataFrame(...)
s = x.take([0], axis=1)
But s is a DataFrame, not a Series.
Answer 0
>>> import pandas as pd
>>> df = pd.DataFrame({'x' : [1, 2, 3, 4], 'y' : [4, 5, 6, 7]})
>>> df
   x  y
0  1  4
1  2  5
2  3  6
3  4  7
>>> s = df.ix[:,0]
>>> type(s)
<class 'pandas.core.series.Series'>
>>>
===========================================================================
UPDATE
If you're reading this after June 2017: ix was deprecated in pandas 0.20 and removed in pandas 1.0, so don't use it. Use loc or iloc instead. See the comments and other answers to this question.
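For readers on a current pandas version, the ix example above can be rewritten roughly as follows. This is a minimal sketch that reuses the same sample DataFrame; the variable names s_by_position and s_by_label are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [4, 5, 6, 7]})

# Position-based: every row, column at position 0
s_by_position = df.iloc[:, 0]

# Label-based: every row, column labelled 'x'
s_by_label = df.loc[:, 'x']

print(type(s_by_position))
print(s_by_position.equals(s_by_label))
```

iloc is the closer match to the question, since "first column" is a statement about position rather than about a label.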
Answer 1
You can get the first column as a Series with the following code:
x[x.columns[0]]
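One caveat worth checking: this selects by label, so it only yields a Series when the first column's label is unique. The sketch below uses a made-up DataFrame with a repeated label 'a' (an assumption for illustration, not part of the question) to show the difference from position-based selection.

```python
import pandas as pd

# Hypothetical frame whose two columns share the label 'a'
x = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'a'])

by_label = x[x.columns[0]]   # selects every column labelled 'a'
by_position = x.iloc[:, 0]   # selects only the column at position 0

print(type(by_label).__name__)
print(type(by_position).__name__)
```

With unique column labels the two forms return the same Series.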
Answer 2
From v0.11+, … use df.iloc.
In [7]: df.iloc[:,0]
Out[7]:
0 1
1 2
2 3
3 4
Name: x, dtype: int64
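This also seems to explain the result in the question: passing a list of positions, as take([0], axis=1) does, keeps the result two-dimensional, while a scalar position reduces it to a Series. A small sketch on the same sample data (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [4, 5, 6, 7]})

as_series = df.iloc[:, 0]         # scalar position -> Series
as_frame = df.iloc[:, [0]]        # list of positions -> one-column DataFrame
taken = df.take([0], axis=1)      # take() with a list also gives a DataFrame
squeezed = taken.squeeze(axis=1)  # collapse the single column into a Series

print(type(as_series).__name__, type(as_frame).__name__)
print(type(taken).__name__, type(squeezed).__name__)
```

If you already hold a one-column DataFrame, squeeze(axis=1) is one way to turn it into a Series without re-indexing.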
Answer 3
Isn’t this the simplest way?
By column name:
In [20]: df = pd.DataFrame({'x' : [1, 2, 3, 4], 'y' : [4, 5, 6, 7]})
In [21]: df
Out[21]:
   x  y
0  1  4
1  2  5
2  3  6
3  4  7
In [23]: df.x
Out[23]:
0 1
1 2
2 3
3 4
Name: x, dtype: int64
In [24]: type(df.x)
Out[24]:
pandas.core.series.Series
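Attribute access is convenient, but it only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method. The sketch below uses hypothetical column names ('count' and 'my col') chosen to show where df.name stops working and bracket indexing is needed.

```python
import pandas as pd

# 'count' clashes with DataFrame.count; 'my col' is not a valid identifier
df = pd.DataFrame({'x': [1, 2, 3], 'count': [4, 5, 6], 'my col': [7, 8, 9]})

print(type(df.x).__name__)          # Series
print(callable(df.count))           # True: this is the DataFrame.count method
print(type(df['count']).__name__)   # Series
print(type(df['my col']).__name__)  # Series
```

It also requires knowing the column's name in advance, so it does not select "the first column" in general.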
Answer 4
This works well when you want to load a Series from a CSV file:
x = pd.read_csv('x.csv', index_col=False, names=['x'],header=None).iloc[:,0]
print(type(x))
print(x.head(10))
<class 'pandas.core.series.Series'>
0 110.96
1 119.40
2 135.89
3 152.32
4 192.91
5 177.20
6 181.16
7 177.30
8 200.13
9 235.41
Name: x, dtype: float64
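To try this without a file on disk, the same idea can be sketched with an in-memory buffer. The csv_text string below is a made-up stand-in for x.csv (one value per line, no header row), and index_col=False is left out here for brevity.

```python
import io
import pandas as pd

# Hypothetical stand-in for 'x.csv': one value per line, no header row
csv_text = "110.96\n119.40\n135.89\n"

# read_csv returns a one-column DataFrame; iloc[:, 0] pulls out the Series
frame = pd.read_csv(io.StringIO(csv_text), header=None, names=['x'])
x = frame.iloc[:, 0]

print(type(x))
print(x.name, x.dtype)
```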
Answer 5
df[df.columns[i]]
where i is the position/number of the column (starting from 0).
So, i = 0 is for the first column.
You can also get the last column using i = -1.
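As a quick check of both cases, here is a small sketch. column_at is a hypothetical helper name introduced only for this example, and it assumes the column labels are unique.

```python
import pandas as pd

def column_at(df, i):
    """Return the column at position i (negative i counts from the end)."""
    return df[df.columns[i]]

df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [4, 5, 6, 7]})

first = column_at(df, 0)   # first column
last = column_at(df, -1)   # last column

print(first.name, last.name)
```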