Question: Convert a column in a pandas DataFrame from int to string
I have a dataframe in pandas with mixed int and str data columns. I want to concatenate the columns within the dataframe first. To do that I have to convert an int column to str.
I've tried the following:
mtrx['X.3'] = mtrx.to_string(columns = ['X.3'])
or
mtrx['X.3'] = mtrx['X.3'].astype(str)
but in both cases it doesn't work, and I get an error saying "cannot concatenate 'str' and 'int' objects". Concatenating two str columns works perfectly fine.
Answer 0
(The session below assumes import numpy as np and from pandas import DataFrame.)
In [16]: df = DataFrame(np.arange(10).reshape(5,2),columns=list('AB'))
In [17]: df
Out[17]:
   A  B
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9
In [18]: df.dtypes
Out[18]:
A    int64
B    int64
dtype: object
Convert a series
In [19]: df['A'].apply(str)
Out[19]:
0    0
1    2
2    4
3    6
4    8
Name: A, dtype: object
In [20]: df['A'].apply(str)[0]
Out[20]: '0'
Don’t forget to assign the result back:
df['A'] = df['A'].apply(str)
Convert the whole frame
In [21]: df.applymap(str)
Out[21]:
   A  B
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9
In [22]: df.applymap(str).iloc[0,0]
Out[22]: '0'
df = df.applymap(str)
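As a side note, astype(str) also works on a whole frame, and as far as I know applymap has been deprecated since pandas 2.1 in favour of DataFrame.map. A minimal sketch of the whole-frame conversion, reusing the example frame from above; the names df_str and first_cell are only for illustration:

```python
import numpy as np
import pandas as pd

# Same 5x2 integer frame as in the session above.
df = pd.DataFrame(np.arange(10).reshape(5, 2), columns=list("AB"))

# Convert every column to strings in one call.
df_str = df.astype(str)

# Each cell is now a Python string rather than an integer.
first_cell = df_str.iloc[0, 0]
print(repr(first_cell))
```

As with apply(str), the result has to be assigned back (df = df.astype(str)) if the original frame should change.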
Answer 1
Change data type of DataFrame column:
To int:
df.column_name = df.column_name.astype(np.int64)
To str:
df.column_name = df.column_name.astype(str)
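Note that attribute access only works when the column name is a valid Python identifier; a name such as 'X.3' from the question needs bracket indexing. A small sketch of the conversion followed by the concatenation the question asks about, assuming a hypothetical frame with one str column 'X.2' and one int column 'X.3':

```python
import pandas as pd

# Hypothetical data: 'X.2' holds strings, 'X.3' holds integers.
mtrx = pd.DataFrame({"X.2": ["a", "b", "c"], "X.3": [1, 2, 3]})

# Bracket indexing is required because 'X.3' is not a valid identifier.
mtrx["X.3"] = mtrx["X.3"].astype(str)

# With both columns holding strings, + concatenates element-wise.
mtrx["joined"] = mtrx["X.2"] + mtrx["X.3"]
print(mtrx["joined"].tolist())
```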
Answer 2
Warning: neither of the solutions given (astype() and apply()) preserves NULL values, in either the nan or the None form.
import pandas as pd
import numpy as np
df = pd.DataFrame([None,'string',np.nan,42], index=[0,1,2,3], columns=['A'])
df1 = df['A'].astype(str)
df2 = df['A'].apply(str)
print(df.isnull())
print(df1.isnull())
print(df2.isnull())
I believe this is fixed by the implementation of to_string()
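If the missing values need to survive the conversion, one option is to stringify only the non-null entries. This is a minimal sketch rather than a definitive fix; the helper name to_str_keep_null is hypothetical, and the data mirrors the example above:

```python
import numpy as np
import pandas as pd

def to_str_keep_null(value):
    # Hypothetical helper: leave None/NaN untouched, stringify everything else.
    return value if pd.isnull(value) else str(value)

s = pd.Series([None, "string", np.nan, 42])

# map() applies the helper element by element.
converted = s.map(to_str_keep_null)

# The null positions are the same as in the original Series.
print(converted.isnull().tolist())
```

The non-null entries become strings (42 turns into '42'), while isnull() still reports the original missing positions.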
Answer 3
Use the following code:
df.column_name = df.column_name.astype('str')
Answer 4
Just for additional reference.
All of the above answers work when you operate on a whole DataFrame column. But if you create or modify a column with a function (or lambda) applied row by row, they won't work, because inside that function each value is a plain int rather than a pandas Series. You have to use str(target_attribute) to turn it into a string. See the example below.
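def add_zero_in_prefix(row):
    # row is a single DataFrame row, so row['Hour'] is a scalar, not a Series
    if row['Hour'] < 10:
        return '0' + str(row['Hour'])
    # hours of 10 or more need no padding
    return str(row['Hour'])

data['str_hr'] = data.apply(add_zero_in_prefix, axis=1)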
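For this particular zero-padding case, a vectorized alternative may be simpler than a row-wise function: convert the column with astype(str) and pad it with the .str.zfill() accessor. A sketch, assuming a hypothetical frame named data with an integer 'Hour' column:

```python
import pandas as pd

# Hypothetical data with an integer 'Hour' column.
data = pd.DataFrame({"Hour": [3, 9, 10, 23]})

# Convert to strings, then left-pad each value with zeros to width 2.
data["str_hr"] = data["Hour"].astype(str).str.zfill(2)
print(data["str_hr"].tolist())
```

This avoids calling a Python function once per row, which tends to matter on large frames.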