Question: Pandas: replace NaN with blank/empty string
I have a Pandas DataFrame as shown below:
   1    2       3
0  a  NaN    read
1  b    l  unread
2  c  NaN    read
I want to replace the NaN values with an empty string so that it looks like this:
   1   2       3
0  a  ""    read
1  b   l  unread
2  c  ""    read
Answer 0
import numpy as np
# regex=True is not required here; np.nan is matched by value.
df1 = df.replace(np.nan, '', regex=True)
This might help. It returns a new DataFrame in which every NaN is replaced with an empty string.
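As a minimal sketch of the call above, here is the replacement applied to a DataFrame rebuilt from the question. The construction of df is an assumption (the question only shows the printed table), and regex=True is left out because it is not needed for a plain NaN match.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the DataFrame shown in the question.
df = pd.DataFrame({
    1: ['a', 'b', 'c'],
    2: [np.nan, 'l', np.nan],
    3: ['read', 'unread', 'read'],
})

# replace() returns a new DataFrame; df itself still contains the NaN values.
df1 = df.replace(np.nan, '')

print(df1)
```

Because the result is a copy, reassign it (df = df.replace(...)) if the original frame is no longer needed.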
Answer 1
df = df.fillna('')
or, to modify the DataFrame in place:
df.fillna('', inplace=True)
This fills missing values (e.g. NaN) with ''.
If you want to fill a single column, you can use:
df.column1 = df.column1.fillna('')
You can use df['column1'] instead of df.column1.
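A short sketch of the difference between filling one column and filling the whole frame. The column names and values are made up for illustration; bracket access is used because it also works for column names that are not valid Python identifiers.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with missing values in both columns.
df = pd.DataFrame({
    'column1': ['x', np.nan, 'z'],
    'column2': [np.nan, 'q', np.nan],
})

# Fill only column1; column2 keeps its NaN values.
df['column1'] = df['column1'].fillna('')

# Fill every column at once, returning a new DataFrame.
filled = df.fillna('')

print(df)
print(filled)
```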
Answer 2
If you are reading the DataFrame from a file (say CSV or Excel), use:
df = pd.read_csv(path, na_filter=False)
df = pd.read_excel(path, na_filter=False)
This turns off missing-value detection, so empty fields are read as empty strings ''.
If you already have the DataFrame, use either of:
df = df.replace(np.nan, '', regex=True)
df = df.fillna('')
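To see what na_filter=False changes, the sketch below reads the same CSV text twice. The CSV content and column names are invented for the example, and io.StringIO stands in for a real file path.

```python
import io
import pandas as pd

# Hypothetical CSV with empty fields in the "note" column.
csv_text = "name,note,status\na,,read\nb,l,unread\nc,,read\n"

# Default behaviour: empty fields are parsed as NaN.
default_df = pd.read_csv(io.StringIO(csv_text))

# With na_filter=False no missing-value detection is done,
# so empty fields stay as empty strings.
raw_df = pd.read_csv(io.StringIO(csv_text), na_filter=False)

print(default_df)
print(raw_df)
```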
Answer 3
Use a formatter if you only want the DataFrame to render nicely when printed. The formatters argument of df.to_string(...) lets you define custom string formatting without modifying your DataFrame or wasting memory:
import numpy as np
import pandas as pd
df = pd.DataFrame({
    'A': ['a', 'b', 'c'],
    'B': [np.nan, 1, np.nan],
    'C': ['read', 'unread', 'read']})
# Render NaN in column B as an empty string; df itself is unchanged.
print(df.to_string(
    formatters={'B': lambda x: '' if pd.isnull(x) else '{:.0f}'.format(x)}))
To get:
   A  B       C
0  a       read
1  b  1  unread
2  c       read
Answer 4
Try this, adding inplace=True so that df is modified directly:
import numpy as np
df.replace(np.nan, ' ', inplace=True)  # ' ' is a single space, not an empty string
Answer 5
Using keep_default_na=False should help you:
df = pd.read_csv(filename, keep_default_na=False)
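A small sketch of what keep_default_na=False does in practice. The CSV text is invented for illustration: it contains an empty field and the literal string "NA", both of which pandas treats as missing by default.

```python
import io
import pandas as pd

# Hypothetical CSV: one empty field and one literal "NA" in the country column.
csv_text = "code,country\n1,NA\n2,\n3,DE\n"

# Default: both "NA" and the empty field are parsed as NaN.
df_default = pd.read_csv(io.StringIO(csv_text))

# keep_default_na=False: the built-in list of NaN markers is ignored,
# so both values come through as ordinary strings.
df_keep = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

print(df_default)
print(df_keep)
```

This is also useful when a value such as "NA" is real data rather than a missing-value marker.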
Answer 6
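If you are converting the DataFrame to JSON, NaN can cause errors because it is not a valid JSON value, so the best solution in this use case is to replace NaN with None.
Here is how:
df1 = df.where(pd.notnull(df), None)
On recent pandas versions, float columns may need to be cast with df.astype(object) first so that None is not converted back to NaN.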
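The sketch below shows the JSON use case end to end with the standard-library json module. The sample data is made up, and the astype(object) cast is an addition to the one-liner above: it is assumed to be needed so that None survives in numeric columns on newer pandas versions.

```python
import json
import numpy as np
import pandas as pd

# Hypothetical frame with NaN in a numeric column.
df = pd.DataFrame({
    'A': ['a', 'b', 'c'],
    'B': [np.nan, 1.0, np.nan],
})

# Strict JSON serialisation rejects NaN.
try:
    json.dumps(df.to_dict(orient='records'), allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

# Cast to object so None is kept, then swap every NaN for None.
df1 = df.astype(object).where(pd.notnull(df), None)

# None is serialised as JSON null.
payload = json.dumps(df1.to_dict(orient='records'), allow_nan=False)
print(payload)
```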
Answer 7
I tried this on a single column of string values containing NaN.
To replace NaN with an empty string:
df.columnname.replace(np.nan, '', regex=True)
To replace NaN with some other value:
df.columnname.replace(np.nan, 'value', regex=True)
Both calls return a new Series, so assign the result back to the column to keep it.
I also tried df.iloc, but it needs the positional index of the column, so you have to look at the table again. The approach above simply saves that step.