Question: Pandas DataFrame fillna() on only some columns
I am trying to fill None values in a Pandas DataFrame with 0s, but only for a subset of columns.
When I do:
import pandas as pd
df = pd.DataFrame(data={'a':[1,2,3,None],'b':[4,5,None,6],'c':[None,None,7,8]})
print(df)
df.fillna(value=0, inplace=True)
print(df)
The output:
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  NaN  7.0
3  NaN  6.0  8.0

     a    b    c
0  1.0  4.0  0.0
1  2.0  5.0  0.0
2  3.0  0.0  7.0
3  0.0  6.0  8.0
It replaces every None with 0. What I want to do is replace None only in columns a and b, but not in c.
What is the best way of doing this?
Answer 0
You can select your desired columns and do it by assignment:
df[['a', 'b']] = df[['a','b']].fillna(value=0)
The resulting output is as expected:
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0
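For reference, here is a minimal self-contained sketch of the select-and-assign approach, using the frame from the question. The name cols_to_fill is only an illustrative assumption; any list of column labels would do.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Hypothetical name: the columns whose missing values should become 0.
cols_to_fill = ['a', 'b']

# fillna() returns a new frame for the selected columns;
# assigning it back overwrites only those columns in df.
df[cols_to_fill] = df[cols_to_fill].fillna(0)

print(df)
```

Column c is never selected, so its NaN values are left as they were.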
Answer 1
You can pass a dict to fillna to specify a different fill value for each column:
df.fillna({'a':0,'b':0})
Out[829]:
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0
After assigning the result back:
df=df.fillna({'a':0,'b':0})
df
Out[831]:
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0
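To illustrate the "different value per column" point, here is a small sketch on the question's frame. The fill value -1 for column b and the name filled are chosen purely for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Keys are column labels, values are the fill value for that column.
# Columns missing from the dict (here 'c') are not filled.
filled = df.fillna({'a': 0, 'b': -1})

print(filled)
```

Without inplace=True, fillna returns a new DataFrame and leaves df itself untouched, which is why the answer assigns the result back.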
Answer 2
You can avoid making a copy of the object by combining Wen’s dict-based solution with inplace=True:
df.fillna({'a':0, 'b':0}, inplace=True)
print(df)
Which yields:
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0
Answer 3
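Here’s how you might try to do it all in one line:
df[['a', 'b']].fillna(value=0, inplace=True)
Breakdown: df[['a', 'b']] selects the columns you want to fill NaN values for, value=0 tells it to fill NaNs with zero, and inplace=True applies the fill without returning a new object. Be aware that df[['a', 'b']] returns a copy, so the fill lands on that temporary copy and df itself is left unchanged. To keep the result, assign it back with df[['a', 'b']] = df[['a', 'b']].fillna(value=0).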
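A quick way to check what this one-liner does to df is sketched below. Selecting with a list of labels hands back a new DataFrame, so the in-place fill is applied to that temporary object. The variable names and the warnings filter are assumptions added for the demonstration; the filter only silences any chained-assignment warning pandas may emit.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # The fill is applied to the temporary frame returned by df[['a', 'b']].
    df[['a', 'b']].fillna(value=0, inplace=True)

# Count the missing values still present in columns a and b of df.
n_missing_after_chained = int(df[['a', 'b']].isna().sum().sum())

# Assigning the filled selection back does change df.
df[['a', 'b']] = df[['a', 'b']].fillna(value=0)
n_missing_after_assign = int(df[['a', 'b']].isna().sum().sum())

print(n_missing_after_chained, n_missing_after_assign)
```

The first count stays at 2 because df was never modified; only the assignment form brings it to 0.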
Answer 4
Using the top answer can produce a warning about making changes to a copy of a DataFrame slice. Assuming that you have other columns, a better way to do this is to pass a dictionary:
df.fillna({'A': 'NA', 'B': 'NA'}, inplace=True)
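When the same fill value applies to many columns, the dictionary can be built from a list of labels instead of typed out by hand. This is a sketch on the question's frame rather than the 'A'/'B' columns above, and cols_to_fill is a hypothetical name.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Hypothetical list of the columns that should be filled.
cols_to_fill = ['a', 'b']

# dict.fromkeys maps every listed column to the same fill value: {'a': 0, 'b': 0}.
df.fillna(dict.fromkeys(cols_to_fill, 0), inplace=True)

print(df)
```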
Answer 5
Or something like:
df.loc[df['a'].isnull(),'a']=0
df.loc[df['b'].isnull(),'b']=0
And if there are more columns:
# your_list holds the names of the columns to fill
for i in your_list:
    df.loc[df[i].isnull(), i] = 0
Answer 6
Sometimes this syntax won't work:
df[['col1','col2']] = df[['col1','col2']].fillna()
Use the following instead:
df['col1','col2']