When I try to .apply a function to the Amount column, I get the following error:
ValueError: cannot convert float NaN to integer
I have tried applying a function using .isnan from the math module.
I have tried the pandas .replace method.
I tried the .sparse data attribute from pandas 0.9.
I have also tried an if NaN == NaN statement in a function.
I have also looked at the article "How do I replace NA values with zeros in an R dataframe?", along with some other articles.
None of the methods I have tried have worked, or they do not recognise NaN.
Any hints or solutions would be appreciated.
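For reference, here is a minimal sketch of what is going on, using a made-up Amount column (the data is hypothetical, not from the question). It shows where the error comes from, why a NaN == NaN check never matches, and one common way to get zeros instead, via fillna(0):

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the question's Amount column.
df = pd.DataFrame({"Amount": [10.0, np.nan, 3.0]})

# int() on a NaN float raises the error quoted in the question.
try:
    int(float("nan"))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# NaN never compares equal to itself, so an `x == NaN` test cannot find it.
nan_equals_nan = (np.nan == np.nan)

# isna() detects missing values; fillna(0) replaces them before the int cast.
nan_mask = df["Amount"].isna()
df["Amount"] = df["Amount"].fillna(0).astype(int)
```

After this, the Amount column holds integers, with 0 where the NaN used to be.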
I just wanted to provide a bit of an update/special case, since it looks like people still come here. If you’re using a multi-index or otherwise using an index slicer, the inplace=True option may not be enough to update the slice you’ve chosen. For example, in a 2×2 level multi-index this will not change any values (as of pandas 0.15):
The “problem” is that the chaining breaks fillna’s ability to update the original dataframe. I put “problem” in quotes because there are good reasons for the design decisions that led to not interpreting through these chains in certain situations. Also, this is a complex example (though I really ran into it), but the same may apply to fewer levels of indexes, depending on how you slice.
It’s one line, reads reasonably well (sort of) and eliminates any unnecessary messing with intermediate variables or loops while allowing you to apply fillna to any multi-level slice you like!
If anybody can find places where this doesn’t work, please post in the comments. I’ve been experimenting with it and looking at the source, and it seems to solve at least my multi-index slice problems.
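A minimal sketch of the situation described in this answer, under some assumptions: the frame, the level labels, and the slice are made up, and writing the filled slice back with DataFrame.update is one approach that fits the "one line" description, not necessarily the exact line the answer had in mind.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical 2x2-level frame: two row levels and two column levels, all NaN.
rows = pd.MultiIndex.from_product([["a", "b"], ["x", "y"]])
cols = pd.MultiIndex.from_product([["one", "two"], ["p", "q"]])
df = pd.DataFrame(np.nan, index=rows, columns=cols)
idx = pd.IndexSlice

# Chained call: .loc returns a new object here, so the in-place fill
# modifies that temporary object and df itself is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # some pandas versions warn about this pattern
    df.loc[idx[:, "x"], idx["one", :]].fillna(0, inplace=True)
nan_count_after_chained = int(df.isna().sum().sum())

# Assumed alternative: fill a copy of the slice, then write it back with update().
df.update(df.loc[idx[:, "x"], idx["one", :]].fillna(0))
nan_count_after_update = int(df.isna().sum().sum())
```

Only the four cells inside the slice end up as 0; everything outside it stays NaN, because update() skips the positions that are missing in the frame it is given.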
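import pandas as pd
df = pd.read_excel('example.xlsx')
# Map each column name to its fill value; columns not in the dict are left unchanged.
fill_values = {'column1': 'Write your values here', 'column2': 'Write your values here',
               'column3': 'Write your values here', 'column4': 'Write your values here',
               'column-n': 'Write your values here'}  # ...one entry per column, up to column-n
df = df.fillna(fill_values)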
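A self-contained version of the same idea, as a sketch that does not need an Excel file. The column names and fill values here are invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one numeric and one text column.
df = pd.DataFrame({"Amount": [1.5, np.nan], "Label": ["a", np.nan]})

# A dict gives each listed column its own replacement value.
per_column = df.fillna({"Amount": 0, "Label": "unknown"})

# Columns that are not listed in the dict keep their NaN values.
only_amount = df.fillna({"Amount": 0})
```

Here per_column has no missing values left, while only_amount still has the NaN in Label.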
There are primarily two options. For imputing or filling missing values (NaN / np.nan) with numerical replacements only, across one or more columns,
df['Amount'].fillna(value=0) is sufficient (the method and axis arguments can be left at their defaults).
From the Documentation:
value : scalar, dict, Series, or DataFrame
Value to use to fill holes (e.g. 0), alternately a
dict/Series/DataFrame of values specifying which value to use for
each index (for a Series) or column (for a DataFrame). (values not
in the dict/Series/DataFrame will not be filled). This value cannot
be a list.
This means a list cannot be passed as the fill value; a scalar (including a string) or a per-column dict is still accepted.
For more specialized imputations use SimpleImputer():
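import numpy as np
from sklearn.impute import SimpleImputer
# Replace every NaN in the selected columns with one constant value.
si = SimpleImputer(strategy='constant', missing_values=np.nan, fill_value='Replacement_Value')
# Write the imputed values from columns 'C-1' and 'C-2' into 'Col-1' and 'Col-2'.
df[['Col-1', 'Col-2']] = si.fit_transform(X=df[['C-1', 'C-2']])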
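A runnable sketch of the SimpleImputer call above, assuming scikit-learn is installed. The data and the "missing" fill value are made up, and the columns are given object dtype on purpose, since a string fill_value is meant for non-numeric data:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical text columns; dtype=object keeps the values as plain Python objects.
df = pd.DataFrame({"C-1": ["a", np.nan], "C-2": [np.nan, "b"]}, dtype=object)

# strategy='constant' replaces every missing entry with fill_value.
si = SimpleImputer(strategy="constant", missing_values=np.nan, fill_value="missing")

# fit_transform returns a NumPy array, so wrap it to get the labels back.
filled = pd.DataFrame(si.fit_transform(df), columns=df.columns, index=df.index)
```

For a single constant like this, fillna does the same job with less machinery; SimpleImputer is more useful when the imputation has to be fitted on one dataset and reused on another.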