Question: Creating a dictionary from a CSV file?
I am trying to create a dictionary from a csv file. The first column of the csv file contains unique keys and the second column contains values. Each row of the csv file represents a unique key, value pair within the dictionary. I tried to use the csv.DictReader and csv.DictWriter classes, but I could only figure out how to generate a new dictionary for each row. I want one dictionary. Here is the code I am trying to use:
import csv
with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        for rows in reader:
            k = rows[0]
            v = rows[1]
            mydict = {k:v for k, v in rows}
        print(mydict)
When I run the above code I get a ValueError: too many values to unpack (expected 2). How do I create one dictionary from a csv file? Thanks.
Answer 0
I believe the syntax you were looking for is as follows:
import csv
with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = {rows[0]:rows[1] for rows in reader}
Alternatively, for Python 2.6 and earlier, which lack dict comprehensions, you want:
mydict = dict((rows[0],rows[1]) for rows in reader)
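To see why the attempt in the question fails while the comprehension over the reader works, here is a minimal self-contained sketch. It assumes an in-memory io.StringIO in place of coors.csv, and the sample rows are made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for coors.csv.
sample = io.StringIO("alpha,1\nbeta,2\ngamma,3\n")

rows = list(csv.reader(sample))

# Each row is a list such as ['alpha', '1']. Iterating over a single row
# yields strings, and unpacking 'alpha' into (k, v) raises ValueError.
try:
    {k: v for k, v in rows[0]}
    failed = False
except ValueError:
    failed = True

# Iterating over all rows and indexing each one builds a single dictionary.
mydict = {row[0]: row[1] for row in rows}
print(failed, mydict)
```

Note that both keys and values come back as strings; convert them explicitly if numbers are needed.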
Answer 1
Open the file by calling open and then csv.DictReader.
import csv
input_file = csv.DictReader(open("coors.csv"))
You may iterate over the rows of the csv file by iterating over the DictReader object input_file.
for row in input_file:
    print(row)
Or, to access only the first row (Python 2):
dictobj = csv.DictReader(open('coors.csv')).next()
UPDATE
In Python 3, this code changes a little:
reader = csv.DictReader(open('coors.csv'))
dictobj = next(reader)
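As a rough illustration of what DictReader produces, the sketch below reads from an in-memory io.StringIO instead of coors.csv. The column names and values are hypothetical; the point is that the first line supplies the keys and each later line becomes one dictionary.

```python
import csv
import io

# Hypothetical data with a header row; DictReader uses it for the keys.
sample = io.StringIO("name,lat,lon\nA,1.0,2.0\nB,3.0,4.0\n")

reader = csv.DictReader(sample)
first = dict(next(reader))                 # first data row as a plain dict
remaining = [dict(row) for row in reader]  # iteration resumes after that row
print(first)
print(remaining)
```

This yields one dictionary per row, which is why DictReader alone does not give the single key-to-value dictionary asked for in the question.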
Answer 2
import csv
reader = csv.reader(open('filename.csv', 'r'))
d = {}
for row in reader:
    k, v = row
    d[k] = v
Answer 3
This isn’t elegant, but it is a one-line solution using pandas.
import pandas as pd
# Older pandas accepted read_csv(..., squeeze=True); that argument was removed in pandas 2.0.
pd.read_csv('coors.csv', header=None, index_col=0).squeeze("columns").to_dict()
If you want to specify dtype for your index (it can’t be specified in read_csv if you use the index_col argument because of a bug):
import pandas as pd
pd.read_csv('coors.csv', header=None, dtype={0: str}).set_index(0).squeeze().to_dict()
Answer 4
You just have to convert the csv.reader to a dict:
~ >> cat > 1.csv
key1, value1
key2, value2
key2, value22
key3, value3
~ >> cat > d.py
import csv
with open('1.csv') as f:
    d = dict(filter(None, csv.reader(f)))
print(d)
~ >> python d.py
{'key3': ' value3', 'key2': ' value22', 'key1': ' value1'}
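The values in the output above keep the space that follows each comma. One way to drop it, sketched here with an in-memory io.StringIO holding the same contents as 1.csv, is the csv module's skipinitialspace option. The example also shows that a repeated key keeps its last value.

```python
import csv
import io

# Same contents as 1.csv above, held in memory for the sake of the example.
sample = io.StringIO("key1, value1\nkey2, value2\nkey2, value22\nkey3, value3\n")

# skipinitialspace=True ignores whitespace immediately after each delimiter.
d = dict(filter(None, csv.reader(sample, skipinitialspace=True)))
print(d)
```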
Answer 5
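You can also use numpy for this.
from numpy import loadtxt
# loadtxt parses values as float by default, so this suits numeric columns; pass dtype=str for text.
key_value = loadtxt("filename.csv", delimiter=",")
mydict = {k: v for k, v in key_value}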
Answer 6
I’d suggest adding if row in case there is an empty line at the end of the file:
import csv
with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = dict(row[:2] for row in reader if row)
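A small sketch of what the filter and the slice guard against, using made-up in-memory data: csv.reader returns an empty list for a blank line, and row[:2] discards any columns beyond the first two.

```python
import csv
import io

# Hypothetical input: an extra third column and two blank lines.
sample = io.StringIO("a,1,ignored\n\nb,2\n\n")

rows = list(csv.reader(sample))
# Blank lines come back as empty lists, which are falsy and get skipped.
mydict = dict(row[:2] for row in rows if row)
print(rows)
print(mydict)
```

A row with only one column would still raise an error in dict(), so this covers blank lines and extra columns, not short rows.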
Answer 7
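One-liner solution
import pandas as pd
# read_csv treats the first row as a header by default; pass header=None if the file has none.
mydict = {row.iloc[0]: row.iloc[1] for _, row in pd.read_csv("file.csv").iterrows()}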
Answer 8
If you are OK with using the numpy package, then you can do something like the following:
import numpy as np
lines = np.genfromtxt("coors.csv", delimiter=",", dtype=None)
my_dict = dict()
for i in range(len(lines)):
    my_dict[lines[i][0]] = lines[i][1]
Answer 9
For simple csv files, such as the following:
id,col1,col2,col3
row1,r1c1,r1c2,r1c3
row2,r2c1,r2c2,r2c3
row3,r3c1,r3c2,r3c3
row4,r4c1,r4c2,r4c3
You can convert it to a Python dictionary using only built-ins:
# csv_file holds the path to the file shown above
with open(csv_file) as f:
    csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]
(_, *header), *data = csv_list
csv_dict = {}
for row in data:
    key, *values = row
    csv_dict[key] = dict(zip(header, values))
This should yield the following dictionary:
{'row1': {'col1': 'r1c1', 'col2': 'r1c2', 'col3': 'r1c3'},
 'row2': {'col1': 'r2c1', 'col2': 'r2c2', 'col3': 'r2c3'},
 'row3': {'col1': 'r3c1', 'col2': 'r3c2', 'col3': 'r3c3'},
 'row4': {'col1': 'r4c1', 'col2': 'r4c2', 'col3': 'r4c3'}}
Note: Python dictionaries have unique keys, so if your csv file has duplicate ids you should append each row to a list.
for row in data:
    key, *values = row
    if key not in csv_dict:
        csv_dict[key] = []
    csv_dict[key].append(dict(zip(header, values)))
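The duplicate-id grouping above can also be sketched with csv.reader and collections.defaultdict, which removes the membership check and handles quoted fields that a plain split(",") would break. This is an alternative, not the answer's own code, and the in-memory data below is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical data in the same layout, with 'row1' appearing twice.
sample = io.StringIO(
    "id,col1,col2\n"
    "row1,a,b\n"
    "row2,c,d\n"
    "row1,e,f\n"
)

reader = csv.reader(sample)
_, *header = next(reader)  # drop the 'id' label, keep the column names

grouped = defaultdict(list)
for key, *values in reader:
    # Every row with the same id is appended to that id's list.
    grouped[key].append(dict(zip(header, values)))

print(dict(grouped))
```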
Answer 10
You can use this, it is pretty cool:
import dataconverters.commas as commas
filename = 'test.csv'
with open(filename) as f:
    records, metadata = commas.parse(f)
    for row in records:
        print('this is row in dictionary:', row)
Answer 11
Many solutions have been posted and I’d like to contribute mine, which works for any number of columns in the CSV file.
It creates a dictionary with one key per column, and the value for each key is a list of the elements in that column.
import csv
input_file = csv.DictReader(open(path_to_csv_file))
csv_dict = {elem: [] for elem in input_file.fieldnames}
for row in input_file:
    for key in csv_dict.keys():
        csv_dict[key].append(row[key])
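The same column-oriented result can be sketched by transposing the rows with zip. This is an alternative formulation, not the answer's code, and it assumes made-up in-memory data with a header row and at least one data row.

```python
import csv
import io

# Hypothetical data: a header line followed by two data rows.
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6\n")

header, *data = list(csv.reader(sample))

# zip(*data) turns the list of rows into a sequence of columns.
# With no data rows it yields nothing, so the result would be empty
# rather than a dictionary of empty lists.
csv_dict = {name: list(column) for name, column in zip(header, zip(*data))}
print(csv_dict)
```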
Answer 12
With pandas, it is much easier. For example, assume you have the following data as CSV, saved as test.txt or test.csv (a CSV is just a kind of text file):
a,b,c,d
1,2,3,4
5,6,7,8
Now, using pandas:
import pandas as pd
df = pd.read_csv("./test.txt")
df_to_dict = df.to_dict()
For a list with one dictionary per row, use:
df.to_dict(orient='records')
And that’s it.
Answer 13
Try using a defaultdict and DictReader.
import csv
from collections import defaultdict
my_dict = defaultdict(list)
with open('filename.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for line in csv_reader:
        for key, value in line.items():
            my_dict[key].append(value)
It returns:
{'key1': [value_1, value_2, value_3], 'key2': [value_a, value_b, value_c], 'key3': [value_x, value_y, value_z]}
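For a concrete picture of that result, here is a minimal run of the same loop against hypothetical in-memory data. The column names key1 and key2 and their values are invented for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical file contents: the header row names the columns.
sample = io.StringIO("key1,key2\n1,a\n2,b\n3,c\n")

my_dict = defaultdict(list)
for line in csv.DictReader(sample):
    for key, value in line.items():
        my_dict[key].append(value)

# Convert to a plain dict for display; every value is still a string.
print(dict(my_dict))
```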