Question: Pandas dataframe fillna() only some columns

I am trying to fill None values in a Pandas dataframe with 0 for only a subset of columns.

When I do:

import pandas as pd
df = pd.DataFrame(data={'a':[1,2,3,None],'b':[4,5,None,6],'c':[None,None,7,8]})
print(df)
df.fillna(value=0, inplace=True)
print(df)

The output:

     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  NaN  7.0
3  NaN  6.0  8.0
     a    b    c
0  1.0  4.0  0.0
1  2.0  5.0  0.0
2  3.0  0.0  7.0
3  0.0  6.0  8.0

It replaces every None with 0. What I want to do is replace the Nones only in columns a and b, but not in c.

What is the best way of doing this?


Answer 0

You can select your desired columns and do it by assignment:

df[['a', 'b']] = df[['a','b']].fillna(value=0)

The resulting output is as expected:

     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0
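
For reference, here is a self-contained sketch of the assignment approach using the dataframe from the question. The name cols_to_fill is only illustrative and does not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Hypothetical name: the subset of columns whose missing values become 0.
cols_to_fill = ['a', 'b']

# fillna returns a new object, so the result is assigned back to the same columns.
df[cols_to_fill] = df[cols_to_fill].fillna(value=0)

print(df)
```

Column c is never selected, so its NaN values are left alone.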

Answer 1

You can pass a dict to fillna, which also lets you use a different fill value for each column:

df.fillna({'a':0,'b':0})
Out[829]: 
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0

After assigning it back:

df=df.fillna({'a':0,'b':0})
df
Out[831]: 
     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0
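
Since the dict maps column names to fill values, each column can get its own value. A small sketch, where the -1 for column b is an arbitrary value chosen for illustration only:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Keys are column names, values are the fill value for that column.
# Columns that are not listed (here: 'c') are not filled.
filled = df.fillna({'a': 0, 'b': -1})

print(filled)
```

Without inplace=True, df itself is unchanged and the filled result lives in filled.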

Answer 2

You can avoid making a copy of the object by combining Wen’s solution (passing a dict to fillna) with inplace=True:

df.fillna({'a':0, 'b':0}, inplace=True)
print(df)

Which yields:

     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0

Answer 3

Here’s how you can do it all in one line:

df[['a', 'b']] = df[['a', 'b']].fillna(value=0)

Breakdown: df[['a', 'b']] selects the columns you want to fill NaN values for, value=0 tells it to fill NaNs with zero, and assigning the result back to df[['a', 'b']] writes the filled columns into df. Calling fillna(value=0, inplace=True) directly on df[['a', 'b']] does not work, because the selection is a new object and df itself stays unchanged.
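
A quick way to check this: selecting with a list of columns returns a new DataFrame, so an in-place fill on that selection does not reach the original. The sketch below uses the dataframe from the question; some pandas versions may also emit a warning on the in-place call.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# The list selection creates a temporary copy; inplace=True fills that copy only.
df[['a', 'b']].fillna(value=0, inplace=True)
missing_after_inplace = int(df[['a', 'b']].isna().sum().sum())

# Assigning the filled selection back does update df.
df[['a', 'b']] = df[['a', 'b']].fillna(value=0)
missing_after_assign = int(df[['a', 'b']].isna().sum().sum())

print(missing_after_inplace, missing_after_assign)
```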


Answer 4

Using the top answer can produce a warning about making changes to a copy of a slice of a dataframe. Assuming that you have other columns, a better way to do this is to pass a dictionary:
df.fillna({'A': 'NA', 'B': 'NA'}, inplace=True)
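
When the same fill value applies to a longer list of columns, the dictionary can be built from the list rather than typed out. A minimal sketch with the question's dataframe, where cols is a hypothetical list name:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Hypothetical list of the columns to fill.
cols = ['a', 'b']

# dict.fromkeys(cols, 0) builds {'a': 0, 'b': 0}.
fill_values = dict.fromkeys(cols, 0)
df = df.fillna(fill_values)

print(df)
```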


Answer 5

Or something like:

df.loc[df['a'].isnull(),'a']=0
df.loc[df['b'].isnull(),'b']=0

And if there are more columns:

for i in your_list:
    df.loc[df[i].isnull(),i]=0
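
Put together as a runnable sketch on the question's dataframe, with your_list assumed to hold the column names to fill:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, None],
                   'b': [4, 5, None, 6],
                   'c': [None, None, 7, 8]})

# Assumed contents of your_list: the columns whose missing values become 0.
your_list = ['a', 'b']

for col in your_list:
    # The boolean mask picks the rows where this column is missing;
    # .loc then writes 0 into exactly those cells of df.
    df.loc[df[col].isnull(), col] = 0

print(df)
```

Because the write goes through df.loc on the original dataframe, no intermediate copy of the selected columns is involved.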

Answer 6

Sometimes this syntax won't work:

df[['col1','col2']] = df[['col1','col2']].fillna(0)

Use the following instead:

df.loc[:, ['col1','col2']] = df[['col1','col2']].fillna(0)
