Python将csv导入列表

Question 1

I have a CSV file with about 2000 records.

Each record has a string, and a category to it:

This is the first line,Line1
This is the second line,Line2
This is the third line,Line3

I need to read this file into a list that looks like this:

data = [('This is the first line', 'Line1'),
        ('This is the second line', 'Line2'),
        ('This is the third line', 'Line3')]

How can import this CSV to the list I need using Python?

Question 2

Using the csv module:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    data = list(reader)

print(data)

Output:

[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]

If you need tuples:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    data = [tuple(row) for row in reader]

print(data)

Output:

[('This is the first line', 'Line1'), ('This is the second line', 'Line2'), ('This is the third line', 'Line3')]

Old Python 2 answer, also using the csv module:

import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    your_list = list(reader)

print your_list
# [['This is the first line', 'Line1'],
#  ['This is the second line', 'Line2'],
#  ['This is the third line', 'Line3']]

Question 3

Updated for Python 3:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    your_list = list(reader)

print(your_list)

Output:

[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]

Question 4

Pandas is pretty good at dealing with data. Here is one example how to use it:

import pandas as pd

# Read the CSV into a pandas data frame (df)
#   With a df you can do many things
#   most important: visualize data with Seaborn
df = pd.read_csv('filename.csv', delimiter=',')

# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]

# or export it as a list of dicts
dicts = df.to_dict().values()

One big advantage is that pandas deals automatically with header rows.

If you haven’t heard of Seaborn, I recommend having a look at it.

See also: How do I read and write CSV files with Python?

Pandas #2

import pandas as pd

# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()

# Convert
dicts = df.to_dict('records')

The content of df is:

     country   population population_time    EUR
0    Germany   82521653.0      2016-12-01   True
1     France   66991000.0      2017-01-01   True
2  Indonesia  255461700.0      2017-01-01  False
3    Ireland    4761865.0             NaT   True
4      Spain   46549045.0      2017-06-01   True
5    Vatican          NaN             NaT   True

The content of dicts is

[{'country': 'Germany', 'population': 82521653.0, 'population_time': Timestamp('2016-12-01 00:00:00'), 'EUR': True},
 {'country': 'France', 'population': 66991000.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': True},
 {'country': 'Indonesia', 'population': 255461700.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': False},
 {'country': 'Ireland', 'population': 4761865.0, 'population_time': NaT, 'EUR': True},
 {'country': 'Spain', 'population': 46549045.0, 'population_time': Timestamp('2017-06-01 00:00:00'), 'EUR': True},
 {'country': 'Vatican', 'population': nan, 'population_time': NaT, 'EUR': True}]

Pandas #3

import pandas as pd

# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()

# Convert
lists = [[row[col] for col in df.columns] for row in df.to_dict('records')]

The content of lists is:

[['Germany', 82521653.0, Timestamp('2016-12-01 00:00:00'), True],
 ['France', 66991000.0, Timestamp('2017-01-01 00:00:00'), True],
 ['Indonesia', 255461700.0, Timestamp('2017-01-01 00:00:00'), False],
 ['Ireland', 4761865.0, NaT, True],
 ['Spain', 46549045.0, Timestamp('2017-06-01 00:00:00'), True],
 ['Vatican', nan, NaT, True]]

Question 5

Update for Python3:

import csv
from pprint import pprint

with open('text.csv', newline='') as file:
    reader = csv.reader(file)
    res = list(map(tuple, reader))

pprint(res)

Output:

[('This is the first line', ' Line1'),
 ('This is the second line', ' Line2'),
 ('This is the third line', ' Line3')]

If csvfile is a file object, it should be opened with newline=''.
csv module

Question 6

If you are sure there are no commas in your input, other than to separate the category, you can read the file line by line and split on ,, then push the result to List

That said, it looks like you are looking at a CSV file, so you might consider using the modules for it

Question 7

result = []
for line in text.splitlines():
    result.append(tuple(line.split(",")))

Question 8

As said already in the comments you can use the csv library in python. csv means comma separated values which seems exactly your case: a label and a value separated by a comma.

Being a category and value type I would rather use a dictionary type instead of a list of tuples.

Anyway in the code below I show both ways: d is the dictionary and l is the list of tuples.

import csv

file_name = "test.txt"
try:
    csvfile = open(file_name, 'rt')
except:
    print("File not found")
csvReader = csv.reader(csvfile, delimiter=",")
d = dict()
l =  list()
for row in csvReader:
    d[row[1]] = row[0]
    l.append((row[0], row[1]))
print(d)
print(l)

Question 9

A simple loop would suffice:

lines = []
with open('test.txt', 'r') as f:
    for line in f.readlines():
        l,name = line.strip().split(',')
        lines.append((l,name))

print lines

Question 10

Unfortunately I find none of the existing answers particularly satisfying.

Here is a straightforward and complete Python 3 solution, using the csv module.

import csv

with open('../resources/temp_in.csv', newline='') as f:
    reader = csv.reader(f, skipinitialspace=True)
    rows = list(reader)

print(rows)

Notice the skipinitialspace=True argument. This is necessary since, unfortunately, OP’s CSV contains whitespace after each comma.

Output:

[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]

Question 11

Extending your requirements a bit and assuming you do not care about the order of lines and want to get them grouped under categories, the following solution may work for you:

>>> fname = "lines.txt"
>>> from collections import defaultdict
>>> dct = defaultdict(list)
>>> with open(fname) as f:
...     for line in f:
...         text, cat = line.rstrip("\n").split(",", 1)
...         dct[cat].append(text)
...
>>> dct
defaultdict(<type 'list'>, {' CatA': ['This is the first line', 'This is the another line'], ' CatC': ['This is the third line'], ' CatB': ['This is the second line', 'This is the last line']})

This way you get all relevant lines available in the dictionary under key being the category.

Question 12

Here is the easiest way in Python 3.x to import a CSV to a multidimensional array, and its only 4 lines of code without importing anything!

#pull a CSV into a multidimensional array in 4 lines!

L=[]                            #Create an empty list for the main array
for line in open('log.txt'):    #Open the file and read all the lines
    x=line.rstrip()             #Strip the \n from each line
    L.append(x.split(','))      #Split each line into a list and add it to the
                                #Multidimensional array
print(L)

Question 13

Next is a piece of code which uses csv module but extracts file.csv contents to a list of dicts using the first line which is a header of csv table

import csv
def csv2dicts(filename):
  with open(filename, 'rb') as f:
    reader = csv.reader(f)
    lines = list(reader)
    if len(lines) < 2: return None
    names = lines[0]
    if len(names) < 1: return None
    dicts = []
    for values in lines[1:]:
      if len(values) != len(names): return None
      d = {}
      for i,_ in enumerate(names):
        d[names[i]] = values[i]
      dicts.append(d)
    return dicts
  return None

if __name__ == '__main__':
  your_list = csv2dicts('file.csv')
  print your_list

Python将csv导入列表

问题：Python将csv导入列表

回答 0

回答 1

回答 2

熊猫＃2

熊猫＃3

Pandas #2

Pandas #3

回答 3

Python3更新：

Update for Python3:

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

Python 流程图 — 一键转化代码为流程图

7行代码 Python热力图可视化分析缺失数据处理

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

检查字符串是否以列表中的字符串之一结尾

cv2.imshow命令在opencv-python中无法正常工作

py-spy：Python 程序的性能监控器

无法将“ Conda”识别为内部或外部命令

更好地协调两个numpy数组的更好方法

如何使用numpy.correlate进行自相关？

Python将csv导入列表

问题：Python将csv导入列表

回答 0

回答 1

回答 2

熊猫＃2

熊猫＃3

Pandas #2

Pandas #3

回答 3

Python3更新：

Update for Python3:

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

相关文章

排行榜展示

文章展示