Question: CSV new-line character seen in unquoted field error
The following code worked until today, when I imported a file from a Windows machine and got this error:
new-line character seen in unquoted field – do you need to open the file in universal-newline mode?
import csv

class CSV:
    def __init__(self, file=None):
        self.file = file

    def read_file(self):
        data = []
        file_read = csv.reader(self.file)
        for row in file_read:
            data.append(row)
        return data

    def get_row_count(self):
        return len(self.read_file())

    def get_column_count(self):
        new_data = self.read_file()
        return len(new_data[0])

    def get_data(self, rows=1):
        data = self.read_file()
        return data[:rows]
How can I fix this issue?
def upload_configurator(request, id=None):
    """
    A view that allows the user to configurator the uploaded CSV.
    """
    upload = Upload.objects.get(id=id)
    csvobject = CSV(upload.filepath)
    upload.num_records = csvobject.get_row_count()
    upload.num_columns = csvobject.get_column_count()
    upload.save()
    form = ConfiguratorForm()
    row_count = csvobject.get_row_count()
    colum_count = csvobject.get_column_count()
    first_row = csvobject.get_data(rows=1)
    first_two_rows = csvobject.get_data(rows=5)
Answer 0
It would be good to see the CSV file itself, but this might work for you. Try replacing:
file_read = csv.reader(self.file)
with:
file_read = csv.reader(self.file, dialect=csv.excel_tab)
Or open the file in universal newline mode and pass it to csv.reader (the 'rU' mode is Python 2 style; the 'U' flag was removed in Python 3.11), like:
reader = csv.reader(open(self.file, 'rU'), dialect=csv.excel_tab)
Or use splitlines(), like this:
def read_file(self):
    with open(self.file, 'r') as f:
        data = [row for row in csv.reader(f.read().splitlines())]
    return data
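For anyone on Python 3, here is a minimal sketch of the same universal-newline idea, assuming the file is comma-separated and UTF-8 encoded. The helper name read_rows and the temporary demo file are made up for illustration and are not part of the question's code. Opening the file with newline='' is what the csv module documentation recommends; it leaves the CR, LF, or CRLF endings in place for csv.reader to split on.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8"):
    """Return all rows of a CSV file, whatever its line endings are."""
    # newline="" keeps the raw line endings so the csv module can
    # recognise CR, LF, and CRLF by itself.
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


# Demo: write a small file with CR-only (old Mac style) line endings.
fd, demo_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(b"name,qty\rapple,3\rpear,5\r")

rows = read_rows(demo_path)
os.remove(demo_path)
print(rows)
```

If the file is tab-separated, passing dialect=csv.excel_tab to csv.reader inside read_rows should work the same way.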
Answer 1
I realize this is an old post, but I ran into the same problem and didn't see the correct answer, so I will give it a try.
Python error:
_csv.Error: new-line character seen in unquoted field
This is caused by trying to read Macintosh (pre-OS X format) CSV files. These are text files that use CR as the end-of-line character. If you are using MS Office, make sure you select either the plain CSV format or CSV (MS-DOS). Do not use CSV (Macintosh) as the save-as type.
My preferred EOL would be LF (Unix/Linux/Apple), but I don't think MS Office provides an option to save in this format.
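To check which end-of-line style a file actually uses before re-saving it, a quick byte count is usually enough. This is only a sketch: count_line_endings is a hypothetical helper, and the sample bytes stand in for the contents of a real file read in 'rb' mode.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare CR, and bare LF sequences in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF contains one CR and one LF, so subtract those out
    # to get the bare occurrences.
    bare_cr = data.count(b"\r") - crlf
    bare_lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": bare_cr, "LF": bare_lf}


# Stand-in for open(path, "rb").read() on a "CSV (Macintosh)" export.
old_mac_sample = b"name,qty\rapple,3\rpear,5\r"
print(count_line_endings(old_mac_sample))
```

A non-zero "CR" count with zero "CRLF" and "LF" points to the old Macintosh format described above.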
Answer 2
For Mac OS X, save your CSV file in “Windows Comma Separated (.csv)” format.
Answer 3
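If this happens to you on a Mac (as it did to me):
- Save the file as CSV (MS-DOS Comma-Separated)
- Run the following script
import csv
# 'rU' is Python 2 universal-newline mode; it was removed in Python 3.11,
# where open(csv_filename, newline='') is the usual choice for csv
with open(csv_filename, 'rU') as csvfile:
    csvreader = csv.reader(csvfile)
    for row in csvreader:
        print(', '.join(row))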
Answer 4
Try running dos2unix on your Windows-imported files first.
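If dos2unix is not installed, roughly the same cleanup can be sketched in Python. This is an illustration, not a reimplementation of the tool: to_unix_newlines is a hypothetical name, and unlike plain dos2unix (which, as far as I know, targets CRLF) it also rewrites the bare CR endings of old Mac files.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace CRLF first so each one becomes a single LF,
    # then convert any CR that is left over.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Mixed endings: one CRLF, one bare CR, one LF.
mixed = b"a,b\r\nc,d\re,f\n"
converted = to_unix_newlines(mixed)
print(converted)
```

Note that this also rewrites line breaks inside quoted fields, which is usually harmless but worth knowing.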
Answer 5
I ran into this error too. I had saved the .csv file on Mac OS X.
Saving it as “Windows Comma Separated Values (.csv)” instead resolved the issue.
Answer 6
This worked for me on OS X.
import csv
# lets a string be read as if it were a file
from io import StringIO
# third-party library that transliterates accented and other non-ASCII characters to ASCII
from unidecode import unidecode
# read the raw bytes of the Windows-formatted input file
with open(filename, 'rb') as fID:
    uncleansedBytes = fID.read()
# decode the file using the correct encoding scheme
# (probably this old Windows one)
uncleansedText = uncleansedBytes.decode('Windows-1252')
# replace carriage returns with newlines (CRLF first, so it does not turn into a blank line)
cleansedText = uncleansedText.replace('\r\n', '\n').replace('\r', '\n')
# transliterate any remaining non-ASCII characters to ASCII
asciiText = unidecode(cleansedText)
# read each row of the csv file as a dict,
# using the first line as the field names for each dict
# (pass asciiText instead of cleansedText if ASCII-only values are wanted)
reader = csv.DictReader(StringIO(cleansedText))
for line_entry in reader:
    # do something with your read data
    pass
Answer 7
I know this was answered quite some time ago, but the answers did not solve my problem. I am using DictReader and StringIO for my CSV reading due to some other complications. I was able to solve the problem more simply by replacing the line delimiters explicitly:
import urllib.request
# q (defined elsewhere) is the URL or Request passed to urlopen
with urllib.request.urlopen(q) as response:
    raw_data = response.read()
    encoding = response.info().get_content_charset('utf8')
    data = raw_data.decode(encoding)
    if '\r\n' not in data:
        # no CRLF found, so any bare CR is probably a line ending; convert it to CRLF
        data = data.replace('\r', '\r\n')
This might not be reasonable for enormous CSV files, but it worked well for my use case.
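A possible variation for the DictReader and StringIO setup, assuming Python 3: passing newline="" to io.StringIO turns on universal-newline line splitting without translating anything, so the decoded text does not have to be rewritten first. The sample string below is invented and stands in for the decoded response body.

```python
import csv
import io

# Stand-in for the decoded download: CR-only line endings.
data = "id,name\r1,Ana\r2,Bo\r"

# newline="" makes StringIO split lines on CR, LF, or CRLF
# while leaving the text itself unchanged.
reader = csv.DictReader(io.StringIO(data, newline=""))
records = [dict(row) for row in reader]
print(records)
```

With the default io.StringIO(data), the whole string would reach the csv module as one line containing bare CR characters, which is the situation that triggers the error.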
Answer 8
An alternative, quick solution: I faced the same error. I reopened the “weird” CSV file in Gnumeric on my Lubuntu machine and exported it as a CSV file again. This corrected the issue.