Question: Removing the index column in pandas when reading a CSV
I have the following code which imports a CSV file. There are 3 columns and I want to set the first two of them to variables. When I set the second column to the variable “efficiency” the index column is also tacked on. How can I get rid of the index column?
df = pd.DataFrame.from_csv('Efficiency_Data.csv', header=0, parse_dates=False)
energy = df.index
efficiency = df.Efficiency
print efficiency
I tried using
del df['index']
after I set
energy = df.index
which I found in another post, but that results in "KeyError: 'index'".
Answer 0
DataFrames and Series always have an index. Although it is displayed alongside the column(s), it is not a column, which is why del df['index'] did not work.
If you want to replace the index with simple sequential numbers, use df.reset_index(drop=True). Without drop=True, the old index is kept as a new column.
To get a sense of why the index is there and how it is used, see e.g. 10 minutes to pandas.
Answer 1
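When writing your CSV file, include the argument index=False so that the index is not saved as an extra column, for example:
df.to_csv(filename, index=False)
Then read the file back with
df = pd.read_csv(filename)
Note that read_csv is a function of the pandas module, not a DataFrame method, and it does not accept an index argument; its index_col parameter controls which column, if any, becomes the index.
This should prevent the issue, so you don't need to fix it later.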
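A small round trip illustrates the difference. This is a sketch with invented data, using an in-memory buffer (io.StringIO) in place of a real file:

```python
import io

import pandas as pd

# Hypothetical data; the column names mirror the question.
df = pd.DataFrame({"Energy": [10, 20, 30], "Efficiency": [0.91, 0.87, 0.79]})

# to_csv() without a path returns the CSV text.
with_index = df.to_csv()                # default: index written as an unnamed first column
without_index = df.to_csv(index=False)  # index left out of the file

# Reading the file that contains the index yields an extra "Unnamed: 0" column.
reread_extra = pd.read_csv(io.StringIO(with_index))

# Reading the file written with index=False gives back just the data columns.
reread_clean = pd.read_csv(io.StringIO(without_index))

# If the file already contains the index, index_col=0 reads it back as the index.
reread_fixed = pd.read_csv(io.StringIO(with_index), index_col=0)

print(list(reread_extra.columns))
print(list(reread_clean.columns))
print(list(reread_fixed.columns))
```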
Answer 2
df.reset_index(drop=True, inplace=True)
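As an illustration of when this call is useful, the sketch below (invented data, hypothetical variable names) filters out a row, which leaves a gap in the index, and then renumbers the rows in place:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"Energy": [10, 20, 30, 40], "Efficiency": [0.91, 0.87, 0.79, 0.70]}
)

# Filtering keeps the original row labels, so the index now has a gap.
kept = df[df["Energy"] != 20].copy()
index_before = list(kept.index)

# drop=True discards the old labels instead of turning them into a column;
# inplace=True modifies `kept` rather than returning a new DataFrame.
kept.reset_index(drop=True, inplace=True)
index_after = list(kept.index)

print(index_before, index_after)
print(list(kept.columns))
```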
Answer 3
You can set one of the columns as the index, for example if it is an "id" column.
In that case the default index is replaced by the column you have chosen.
df.set_index('id', inplace=True)
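A minimal sketch of this approach, assuming a hypothetical "id" column with invented values:

```python
import pandas as pd

# Hypothetical data with an "id" column.
df = pd.DataFrame({"id": [101, 102, 103], "Efficiency": [0.91, 0.87, 0.79]})

# Move "id" out of the columns and use it as the row labels.
df.set_index("id", inplace=True)

# Rows can now be looked up by id.
efficiency_102 = df.loc[102, "Efficiency"]

print(list(df.columns), df.index.name, efficiency_102)
```

The DataFrame still has an index afterwards; it is just a meaningful one instead of the default 0, 1, 2, ...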
Answer 4
If your problem is the same as mine, where you just want to reset the column headers to 0, 1, ..., number of columns - 1, do
df = pd.DataFrame(df.values)
EDIT:
This is not a good idea if you have heterogeneous data types, because df.values falls back to a single common dtype for all columns. Better to just use
df.columns = range(len(df.columns))
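The difference between the two approaches shows up with mixed column types. The sketch below uses invented data with one integer column and one string column:

```python
import pandas as pd

# Hypothetical mixed-type data: one integer column, one string column.
df = pd.DataFrame({"Energy": [10, 20], "Label": ["a", "b"]})

# Going through .values produces a single object array, so the integer
# column loses its numeric dtype.
via_values = pd.DataFrame(df.values)

# Renaming the columns keeps each column's dtype intact.
renamed = df.copy()
renamed.columns = range(len(renamed.columns))

print(via_values[0].dtype)  # dtype of the former "Energy" column after .values
print(renamed[0].dtype)     # dtype of the same column after renaming only
```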
Answer 5
You can specify which column in your CSV file is the index by using the index_col parameter of the from_csv function (pd.read_csv accepts the same parameter).
If this doesn't solve your problem, please provide an example of your data.
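For reference, DataFrame.from_csv used the first column as the index by default, which explains the behavior in the question; it was later deprecated and removed in pandas 1.0 in favor of pd.read_csv. The sketch below uses pd.read_csv with an invented three-column file (the "Notes" column and all values are assumptions) to show the effect of index_col:

```python
import io

import pandas as pd

# Hypothetical file contents with three columns, as in the question.
csv_text = "Energy,Efficiency,Notes\n10,0.91,a\n20,0.87,b\n"

# Without index_col, read_csv keeps every column and adds a default 0..n-1 index.
default_index = pd.read_csv(io.StringIO(csv_text))

# With index_col, the chosen column becomes the index instead of a data column.
energy_index = pd.read_csv(io.StringIO(csv_text), index_col="Energy")

print(list(default_index.columns), list(default_index.index))
print(list(energy_index.columns), energy_index.index.name)
```

With the default index, both columns from the question are ordinary columns, so energy = df["Energy"] and efficiency = df["Efficiency"] can be taken directly.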
Answer 6
One thing that I do is df = df.reset_index()
and then df = df.drop(['index'], axis=1)
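A short sketch (invented data) suggests that, for an unnamed index, these two steps give the same result as a single reset_index(drop=True). It also shows a caveat: if the index has a name, the column created by reset_index() takes that name rather than 'index', so the drop step would need to use that name.

```python
import pandas as pd

# Hypothetical data with an unnamed, non-default index.
df = pd.DataFrame({"Efficiency": [0.91, 0.87, 0.79]}, index=[10, 20, 30])

# Two steps: move the index into a column called "index", then drop that column.
two_step = df.reset_index().drop(["index"], axis=1)

# One step: discard the old index directly.
one_step = df.reset_index(drop=True)

same_result = two_step.equals(one_step)

# Caveat: with a named index, the new column is named after the index.
named = df.rename_axis("Energy")
named_columns = list(named.reset_index().columns)

print(same_result, named_columns)
```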