计算CSV Python中有多少行？

Question 1

I’m using python (Django Framework) to read a CSV file. I pull just 2 lines out of this CSV as you can see. What I have been trying to do is store in a variable the total number of rows the CSV also.

How can I get the total number of rows?

file = object.myfilePath
fileObject = csv.reader(file)
for i in range(2):
    data.append(fileObject.next())

I have tried:

len(fileObject)
fileObject.length

Question 2

You need to count the number of rows:

row_count = sum(1 for row in fileObject)  # fileObject is your csv.reader

Using sum() with a generator expression makes for an efficient counter, avoiding storing the whole file in memory.

If you already read 2 rows to start with, then you need to add those 2 rows to your total; rows that have already been read are not being counted.

Question 3

2018-10-29 EDIT

Thank you for the comments.

I tested several kinds of code to get the number of lines in a csv file in terms of speed. The best method is below.

with open(filename) as f:
    sum(1 for line in f)

Here is the code tested.

import timeit
import csv
import pandas as pd

filename = './sample_submission.csv'

def talktime(filename, funcname, func):
    print(f"# {funcname}")
    t = timeit.timeit(f'{funcname}("{filename}")', setup=f'from __main__ import {funcname}', number = 100) / 100
    print('Elapsed time : ', t)
    print('n = ', func(filename))
    print('\n')

def sum1forline(filename):
    with open(filename) as f:
        return sum(1 for line in f)
talktime(filename, 'sum1forline', sum1forline)

def lenopenreadlines(filename):
    with open(filename) as f:
        return len(f.readlines())
talktime(filename, 'lenopenreadlines', lenopenreadlines)

def lenpd(filename):
    return len(pd.read_csv(filename)) + 1
talktime(filename, 'lenpd', lenpd)

def csvreaderfor(filename):
    cnt = 0
    with open(filename) as f:
        cr = csv.reader(f)
        for row in cr:
            cnt += 1
    return cnt
talktime(filename, 'csvreaderfor', csvreaderfor)

def openenum(filename):
    cnt = 0
    with open(filename) as f:
        for i, line in enumerate(f,1):
            cnt += 1
    return cnt
talktime(filename, 'openenum', openenum)

The result was below.

# sum1forline
Elapsed time :  0.6327946722068599
n =  2528244


# lenopenreadlines
Elapsed time :  0.655304473598555
n =  2528244


# lenpd
Elapsed time :  0.7561274056295324
n =  2528244


# csvreaderfor
Elapsed time :  1.5571560935772661
n =  2528244


# openenum
Elapsed time :  0.773000013928679
n =  2528244

In conclusion, sum(1 for line in f) is fastest. But there might not be significant difference from len(f.readlines()).

sample_submission.csv is 30.2MB and has 31 million characters.

Question 4

To do it you need to have a bit of code like my example here:

file = open("Task1.csv")
numline = len(file.readlines())
print (numline)

I hope this helps everyone.

Question 5

Several of the above suggestions count the number of LINES in the csv file. But some CSV files will contain quoted strings which themselves contain newline characters. MS CSV files usually delimit records with \r\n, but use \n alone within quoted strings.

For a file like this, counting lines of text (as delimited by newline) in the file will give too large a result. So for an accurate count you need to use csv.reader to read the records.

Question 6

First you have to open the file with open

input_file = open("nameOfFile.csv","r+")

Then use the csv.reader for open the csv

reader_file = csv.reader(input_file)

At the last, you can take the number of row with the instruction ‘len’

value = len(list(reader_file))

The total code is this:

input_file = open("nameOfFile.csv","r+")
reader_file = csv.reader(input_file)
value = len(list(reader_file))

Remember that if you want to reuse the csv file, you have to make a input_file.fseek(0), because when you use a list for the reader_file, it reads all file, and the pointer in the file change its position

Question 7

row_count = sum(1 for line in open(filename)) worked for me.

Note : sum(1 for line in csv.reader(filename)) seems to calculate the length of first line

Question 8

numline = len(file_read.readlines())

Question 9

when you instantiate a csv.reader object and you iter the whole file then you can access an instance variable called line_num providing the row count:

import csv
with open('csv_path_file') as f:
    csv_reader = csv.reader(f)
    for row in csv_reader:
        pass
    print(csv_reader.line_num)

Question 10

import csv
count = 0
with open('filename.csv', 'rb') as count_file:
    csv_reader = csv.reader(count_file)
    for row in csv_reader:
        count += 1

print count

Question 11

Use “list” to fit a more workably object.

You can then count, skip, mutate till your heart’s desire:

list(fileObject) #list values

len(list(fileObject)) # get length of file lines

list(fileObject)[10:] # skip first 10 lines

Question 12

You can also use a classic for loop:

import pandas as pd
df = pd.read_csv('your_file.csv')

count = 0
for i in df['a_column']:
    count = count + 1

print(count)

Question 13

might want to try something as simple as below in the command line:

sed -n '$=' filename

or

wc -l filename

Question 14

This works for csv and all files containing strings in Unix-based OSes:

import os

numOfLines = int(os.popen('wc -l < file.csv').read()[:-1])

In case the csv file contains a fields row you can deduct one from numOfLines above:

numOfLines = numOfLines - 1

Question 15

I think we can improve the best answer a little bit, I’m using:

len = sum(1 for _ in reader)

Moreover, we shouldnt forget pythonic code not always have the best performance in the project. In example: If we can do more operations at the same time in the same data set Its better to do all in the same bucle instead make two or more pythonic bucles.

Question 16

try

data = pd.read_csv("data.csv")
data.shape

and in the output you can see something like (aa,bb) where aa is the # of rows

Question 17

import pandas as pd
data = pd.read_csv('data.csv') 
totalInstances=len(data)

计算CSV Python中有多少行？

问题：计算CSV Python中有多少行？

回答 0

回答 1

2018-10-29编辑

2018-10-29 EDIT

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

回答 14

回答 15

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

Python 流程图 — 一键转化代码为流程图

7行代码 Python热力图可视化分析缺失数据处理

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

在virtualenv中使用Python 3

如何在Python中创建具有不同线型的主要和次要网格线

Python的“ in”集合运算符

Seaborn地块未显示

如何在Mac上安装pip3？

如何将列表合并为元组列表？

计算CSV Python中有多少行？

问题：计算CSV Python中有多少行？

回答 0

回答 1

2018-10-29编辑

2018-10-29 EDIT

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

回答 14

回答 15

相关文章

排行榜展示

文章展示