Question: How to shift a column in a Pandas DataFrame
I would like to shift a column in a Pandas DataFrame, but I haven’t been able to find a method in the documentation that does this without rewriting the whole DataFrame. Does anyone know how to do it?
DataFrame:
## x1 x2
##0 206 214
##1 226 234
##2 245 253
##3 265 272
##4 283 291
Desired output:
## x1 x2
##0 206 nan
##1 226 214
##2 245 234
##3 265 253
##4 283 272
##5 nan 291
Answer 0
In [18]: a
Out[18]:
x1 x2
0 0 5
1 1 6
2 2 7
3 3 8
4 4 9
In [19]: a.x2 = a.x2.shift(1)
In [20]: a
Out[20]:
x1 x2
0 0 NaN
1 1 5
2 2 6
3 3 7
4 4 8
Answer 1
You need to use df.shift here.
df.shift(i) shifts the entire DataFrame down by i rows.
So, for i = 1:
Input:
x1 x2
0 206 214
1 226 234
2 245 253
3 265 272
4 283 291
Output:
x1 x2
0 NaN NaN
1 206 214
2 226 234
3 245 253
4 265 272
So, run this script to shift only the x2 column (the last value, 291, is dropped because shift does not extend the index):
import pandas as pd
df = pd.DataFrame({'x1': [206, 226, 245, 265, 283],
                   'x2': [214, 234, 253, 272, 291]})
print(df)
df['x2'] = df['x2'].shift(1)  # move x2 down one row; row 0 becomes NaN
print(df)
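To also keep the value that falls off the end, as in the question's desired output, one option is to extend the index before shifting. The following is only a minimal sketch: it assumes the default 0..n-1 integer index, and the names `periods` and `extended` are illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"x1": [206, 226, 245, 265, 283],
                   "x2": [214, 234, 253, 272, 291]})

periods = 1  # assumed shift distance

# Add `periods` empty rows at the end (assumes a default 0..n-1 index).
extended = df.reindex(range(len(df) + periods))

# Shift only x2; its last value lands in the new row instead of being dropped.
extended["x2"] = extended["x2"].shift(periods)

print(extended)
```

Both columns become floats in the result, because NaN cannot be stored in an integer column.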
Answer 2
Let's define the DataFrame from your example by
>>> import pandas as pd
>>> df = pd.DataFrame([[206, 214], [226, 234], [245, 253], [265, 272], [283, 291]],
columns=[1, 2])
>>> df
1 2
0 206 214
1 226 234
2 245 253
3 265 272
4 283 291
Then you could manipulate the index of a copy of the second column by
>>> shifted = df[2].copy()
>>> shifted.index = shifted.index + 1
and finally re-combine the single columns
>>> pd.concat([df[1], shifted], axis=1)
1 2
0 206.0 NaN
1 226.0 214.0
2 245.0 234.0
3 265.0 253.0
4 283.0 272.0
5 NaN 291.0
Perhaps not fast, but simple to read. Consider setting variables for the column names and the actual shift required.
Edit: Generally, shifting is possible with df[2].shift(1), as already posted; however, that would cut off the carryover.
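The suggestion to use variables for the column name and the shift could look roughly like the sketch below. The helper name `shift_column_keep_tail` and its parameters are hypothetical, and it assumes an integer index so that adding `periods` to the index is meaningful.

```python
import pandas as pd

def shift_column_keep_tail(df, column, periods=1):
    """Hypothetical helper: shift one column down by `periods` rows,
    growing the index so no value is cut off. Assumes an integer index."""
    shifted = df[column].copy()
    shifted.index = shifted.index + periods      # move the labels, not the data
    others = df.drop(columns=[column])           # every column that stays in place
    out = pd.concat([others, shifted], axis=1)   # outer join on the index
    return out[list(df.columns)]                 # restore the original column order

df = pd.DataFrame({"x1": [206, 226, 245, 265, 283],
                   "x2": [214, 234, 253, 272, 291]})
result = shift_column_keep_tail(df, "x2", periods=1)
print(result)
```

Working on a copy of the column leaves the original DataFrame untouched.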
Answer 3
If you don’t want to lose the values that are shifted past the end of your DataFrame, simply append the required number of rows first:
offset = 5
# DataFrame.append was removed in pandas 2.0, so extend the index with reindex instead.
DF = DF.reindex(range(len(DF) + offset))  # adds `offset` all-NaN rows; only works if sequential index
DF = DF.shift(periods=offset)
Answer 4
Assuming these imports:
import pandas as pd
import numpy as np
First, append a new row of NaN, NaN, ... at the end of the DataFrame (df).
nan_row = pd.DataFrame(np.nan, index=[0], columns=df.columns)  # one row, all values NaN
df2 = pd.concat([df, nan_row], ignore_index=True)  # add it to the end of df (DataFrame.append was removed in pandas 2.0)
This creates a new DataFrame, df2. Maybe there is a more elegant way, but this works.
Now you can shift it:
df2.x2 = df2.x2.shift(1)  # shift what you want
Answer 5
While trying to solve a similar problem of my own, I found this in the pandas documentation, which I think answers the question:
DataFrame.shift(periods=1, freq=None, axis=0)
Shift index by desired number of periods with an optional time freq
Notes
If freq is specified then the index values are shifted but the data is not realigned. That is, use freq if you would like to extend the index when shifting and preserve the original data.
I hope this helps with future questions on this matter.
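The difference the documentation note describes can be seen on a small example. This is only a sketch: the dates and the daily frequency are assumptions made for illustration, since `freq` needs a datetime-like index and the question's DataFrame has a plain integer one.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=3, freq="D")  # assumed daily index
df = pd.DataFrame({"x1": [206, 226, 245],
                   "x2": [214, 234, 253]}, index=idx)

# Without freq: the data moves down, the index stays put, the last value is lost.
by_position = df["x2"].shift(1)

# With freq: the index labels move forward one day, and all data is preserved.
by_freq = df["x2"].shift(1, freq="D")

# Re-combining on the index gives one extra row that holds the carried-over value.
combined = pd.concat([df["x1"], by_freq], axis=1)
print(combined)
```

The re-combined frame has four rows: the first has no x2 value and the last has no x1 value.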
Answer 6
This is how I do it:
# Empty frame whose index covers the 7 periods after the last existing one
# (in pandas 2.0, `inclusive='right'` replaces the removed `closed='right'`).
df_ext = pd.DataFrame(index=pd.date_range(df.index[-1], periods=8, inclusive='right'))
df2 = pd.concat([df, df_ext], axis=0, sort=True)
df2["forecast"] = df2["some column"].shift(7)
Basically, I generate an empty DataFrame with the desired index and then concatenate the two. I would really like to see this as a standard feature in pandas, so I have proposed an enhancement to pandas.
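For readers who want to try this approach end to end, here is a self-contained sketch. The sample data, the daily frequency, and the name `horizon` are assumptions. The future index is built by slicing off the first date rather than with `closed`/`inclusive`, so it does not depend on the pandas version.

```python
import pandas as pd

# Assumed sample data: ten daily observations.
df = pd.DataFrame({"some column": list(range(10))},
                  index=pd.date_range("2024-01-01", periods=10, freq="D"))

horizon = 7  # number of periods to shift forward

# Dates after the last observation; the first generated date is already in df, so drop it.
future_index = pd.date_range(df.index[-1], periods=horizon + 1, freq="D")[1:]

# Empty frame that only contributes the extra index labels.
df_ext = pd.DataFrame(index=future_index)
df2 = pd.concat([df, df_ext], axis=0)

# The shifted column now has room for the last `horizon` values.
df2["forecast"] = df2["some column"].shift(horizon)
print(df2.tail(horizon))
```

The first `horizon` rows of `forecast` are NaN, and the last observed value ends up in the final row.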