问题:如何将标题行添加到Pandas DataFrame

我正在将csv文件读入pandas。此csv文件由四列和一些行组成,但没有要添加的标题行。我一直在尝试以下方法:

Cov = pd.read_csv("path/to/file.txt", sep='\t')
Frame=pd.DataFrame([Cov], columns = ["Sequence", "Start", "End", "Coverage"])
Frame.to_csv("path/to/file.txt", sep='\t')

但是,当我应用代码时,出现以下错误:

ValueError: Shape of passed values is (1, 1), indices imply (4, 1)

错误的确切含义是什么?在python中将标题行添加到csv文件/ pandas df的一种干净方法是什么?

I am reading a csv file into pandas. This csv file constists of four columns and some rows, but does not have a header row, which I want to add. I have been trying the following:

Cov = pd.read_csv("path/to/file.txt", sep='\t')
Frame=pd.DataFrame([Cov], columns = ["Sequence", "Start", "End", "Coverage"])
Frame.to_csv("path/to/file.txt", sep='\t')

But when I apply the code, I get the following Error:

ValueError: Shape of passed values is (1, 1), indices imply (4, 1)

What exactly does the error mean? And what would be a clean way in python to add a header row to my csv file/pandas df?


回答 0

您可以names直接在read_csv

names:类似数组,默认为None要使用的列名列表。如果文件不包含标题行,则应显式传递header = None

Cov = pd.read_csv("path/to/file.txt", 
                  sep='\t', 
                  names=["Sequence", "Start", "End", "Coverage"])

You can use names directly in the read_csv

names : array-like, default None List of column names to use. If file contains no header row, then you should explicitly pass header=None

Cov = pd.read_csv("path/to/file.txt", 
                  sep='\t', 
                  names=["Sequence", "Start", "End", "Coverage"])

回答 1

另外,您可以阅读csv,header=None然后添加df.columns

Cov = pd.read_csv("path/to/file.txt", sep='\t', header=None)
Cov.columns = ["Sequence", "Start", "End", "Coverage"]

Alternatively you could read you csv with header=None and then add it with df.columns:

Cov = pd.read_csv("path/to/file.txt", sep='\t', header=None)
Cov.columns = ["Sequence", "Start", "End", "Coverage"]

回答 2

col_Names=["Sequence", "Start", "End", "Coverage"]
my_CSV_File= pd.read_csv("yourCSVFile.csv",names=col_Names)

完成此操作后,只需进行检查[显然,我知道,你知道。但是…

my_CSV_File.head()

希望对您有帮助…干杯

col_Names=["Sequence", "Start", "End", "Coverage"]
my_CSV_File= pd.read_csv("yourCSVFile.csv",names=col_Names)

having done this, just check it with[well obviously I know, u know that. But still…

my_CSV_File.head()

Hope it helps … Cheers


回答 3

要修复代码,您只需将更[Cov]改为Cov.values,第一个参数pd.DataFrame将变为多维numpy数组:

Cov = pd.read_csv("path/to/file.txt", sep='\t')
Frame=pd.DataFrame(Cov.values, columns = ["Sequence", "Start", "End", "Coverage"])
Frame.to_csv("path/to/file.txt", sep='\t')

但最聪明的解决方案仍然是pd.read_excelheader=None和一起使用names=columns_list

To fix your code you can simply change [Cov] to Cov.values, the first parameter of pd.DataFrame will become a multi-dimensional numpy array:

Cov = pd.read_csv("path/to/file.txt", sep='\t')
Frame=pd.DataFrame(Cov.values, columns = ["Sequence", "Start", "End", "Coverage"])
Frame.to_csv("path/to/file.txt", sep='\t')

But the smartest solution still is use pd.read_excel with header=None and names=columns_list.


声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。