First, to convert a Categorical column to its numerical codes, there is an easier way: dataframe['c'].cat.codes.
Further, you can automatically select all columns of a certain dtype in a dataframe using select_dtypes. This way, you can apply the operation above to several automatically selected columns at once.
First, make an example dataframe:
In [75]: df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':list('abcab'), 'col3':list('ababb')})
In [76]: df['col2'] = df['col2'].astype('category')
In [77]: df['col3'] = df['col3'].astype('category')
In [78]: df.dtypes
Out[78]:
col1       int64
col2    category
col3    category
dtype: object
Then use select_dtypes to select the categorical columns and apply .cat.codes to each of them; every selected column is replaced by its integer codes.
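The select-then-encode step described above can be sketched as follows. This is a minimal example that rebuilds the dataframe from the session above; the name cat_columns and the lambda are illustrative choices, not necessarily what the original answer used.

```python
import pandas as pd

# Same example dataframe as in the session above.
df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5],
    'col2': list('abcab'),
    'col3': list('ababb'),
})
df['col2'] = df['col2'].astype('category')
df['col3'] = df['col3'].astype('category')

# Names of all columns whose dtype is category.
cat_columns = df.select_dtypes(['category']).columns

# Replace each categorical column with its integer codes.
df[cat_columns] = df[cat_columns].apply(lambda col: col.cat.codes)

print(df)
```

Categories are sorted, so in col2 'a' maps to 0, 'b' to 1 and 'c' to 2, while col1 is left untouched.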
Here, multiple columns need to be converted, so one approach I used is:
for col_name in df.columns:
    if df[col_name].dtype == 'object':
        # Convert the string column to categorical, then replace it with its codes.
        df[col_name] = df[col_name].astype('category')
        df[col_name] = df[col_name].cat.codes
This converts every string / object column to categorical, then replaces each column with the integer code of its category.
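As a rough sketch, the same loop can be wrapped in a helper that works on a copy instead of modifying the dataframe in place. The function name encode_text_columns and the sample data are hypothetical. The dtype check uses pandas.api.types helpers rather than comparing against 'object', on the assumption that string columns may not be stored as object in every pandas version.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def encode_text_columns(df):
    """Return a copy in which every text column is replaced by category codes."""
    out = df.copy()
    for col_name in out.columns:
        column = out[col_name]
        # Treat both object columns and dedicated string columns as text.
        if is_object_dtype(column) or is_string_dtype(column):
            out[col_name] = column.astype('category').cat.codes
    return out


# Hypothetical sample data with one missing value.
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['b', 'a', None]})
encoded = encode_text_columns(df)
print(encoded)
```

Note that .cat.codes encodes missing values as -1, so a downstream model will see -1 as if it were just another category unless it is handled separately.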
Answer 4
To convert the categorical data in column C of the dataset data, do the following:
from sklearn.preprocessing import LabelEncoder
labelencoder = LabelEncoder()  # initialize an object of class LabelEncoder
data['C'] = labelencoder.fit_transform(data['C'])  # fit on the column and replace it with its integer labels
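A small self-contained sketch of the LabelEncoder approach, which also shows how to recover the original labels afterwards. The dataframe contents are made up for illustration, and it assumes scikit-learn is installed.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data; only the column name 'C' comes from the answer above.
data = pd.DataFrame({'C': ['red', 'green', 'blue', 'green']})

labelencoder = LabelEncoder()
codes = labelencoder.fit_transform(data['C'])

# classes_ holds the sorted labels; the label at position i is encoded as i.
classes = list(labelencoder.classes_)

# inverse_transform maps the integer codes back to the original labels.
restored = list(labelencoder.inverse_transform(codes))

print(classes)
print(list(codes))
print(restored)
```

Keeping the fitted encoder around is what makes the mapping reversible; overwriting data['C'] directly, as in the answer, is fine as long as the labelencoder object is kept.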
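Answer 5
@Quickbeam2k1, see below:
import sys
import numpy as np
import pandas as pd
dataset = pd.read_csv('Data2.csv')
np.set_printoptions(threshold=sys.maxsize)  # print arrays in full; threshold=np.nan raises an error in recent NumPy versions
X = dataset.iloc[:, :].values
Using sklearn:
from sklearn.preprocessing import LabelEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])  # encode the first column in place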