Question: How do I read a file line-by-line into a list?

How do I read every line of a file in Python and store each line as an element in a list?

I want to read the file line by line and append each line to the end of the list.


Answer 0

with open(filename) as f:
    content = f.readlines()
# you may also want to remove whitespace characters like `\n` at the end of each line
content = [x.strip() for x in content] 

Answer 1

See Input and Output:

with open('filename') as f:
    lines = f.readlines()

or with stripping the newline character:

with open('filename') as f:
    lines = [line.rstrip() for line in f]

Answer 2

This is more explicit than necessary, but does what you want.

with open("file.txt") as file_in:
    lines = []
    for line in file_in:
        lines.append(line)

Answer 3

This will yield an “array” of lines from the file.

lines = tuple(open(filename, 'r'))

open returns a file which can be iterated over. When you iterate over a file, you get the lines from that file. tuple can take an iterator and instantiate a tuple instance for you from the iterator that you give it. lines is a tuple created from the lines of the file.


Answer 4

If you want the \n included:

with open(fname) as f:
    content = f.readlines()

If you do not want \n included:

with open(fname) as f:
    content = f.read().splitlines()
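
A small sketch of the difference between the two, assuming a throwaway file created with tempfile (the file name and contents are made up for illustration):

```python
import os
import tempfile

# Create a small sample file; its name and contents are illustrative only.
tmp_dir = tempfile.mkdtemp()
fname = os.path.join(tmp_dir, "sample.txt")
with open(fname, "w") as f:
    f.write("first\nsecond\nthird\n")

with open(fname) as f:
    with_newlines = f.readlines()             # keeps the trailing "\n"

with open(fname) as f:
    without_newlines = f.read().splitlines()  # drops the line endings

print(with_newlines)
print(without_newlines)
```

Which one you want depends on whether later code expects the line endings to still be there.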

Answer 5

According to Python’s Methods of File Objects, the simplest way to convert a text file into a list is:

with open('file.txt') as f:
    my_list = list(f)

If you just need to iterate over the text file lines, you can use:

with open('file.txt') as f:
    for line in f:
       ...

Old answer:

Using with and readlines():

with open('file.txt') as f:
    lines = f.readlines()

If you don’t care about closing the file, this one-liner works:

lines = open('file.txt').readlines()

The traditional way:

f = open('file.txt') # Open the file in read mode
lines = f.read().split("\n") # Create a list of all lines (the last item is '' if the file ends with a newline)
f.close() # Close the file
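
One detail worth checking with the split("\n") approach is how it treats a final newline. A minimal sketch, using a plain string as a stand-in for what f.read() would return:

```python
# Stand-in for f.read() on a two-line file that ends with a newline.
text = "a\nb\n"

by_split = text.split("\n")        # leaves a trailing empty string
by_splitlines = text.splitlines()  # no trailing empty string

print(by_split)
print(by_splitlines)
```

If the extra empty element is unwanted, splitlines() is probably the simpler choice.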

Answer 6

You could simply do the following, as has been suggested:

with open('/your/path/file') as f:
    my_lines = f.readlines()

Note that this approach has 2 downsides:

1) You store all the lines in memory. In the general case, this is a very bad idea. The file could be very large, and you could run out of memory. Even if it’s not large, it is simply a waste of memory.

2) This does not allow processing of each line as you read them. So if you process your lines after this, it is not efficient (requires two passes rather than one).

A better approach for the general case would be the following:

with open('/your/path/file') as f:
    for line in f:
        process(line)

Where you define your process function any way you want. For example:

def process(line):
    if 'save the world' in line.lower():
         superman.save_the_world()

(The implementation of the Superman class is left as an exercise for you).

This will work nicely for any file size and you go through your file in just 1 pass. This is typically how generic parsers will work.
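
As a rough illustration of single-pass processing, here is a sketch that filters lines lazily with a generator. The helper name matching_lines is hypothetical, and io.StringIO stands in for an open file so the example is self-contained:

```python
import io


def matching_lines(lines, needle):
    """Yield lines containing needle (case-insensitive), one at a time."""
    for line in lines:
        if needle in line.lower():
            yield line.rstrip("\n")


# io.StringIO stands in for an open file here; any iterable of lines works.
fake_file = io.StringIO("Save the world\nnothing here\nplease save the world\n")
hits = list(matching_lines(fake_file, "save the world"))
print(hits)
```

Only the matching lines are kept in memory; the rest are discarded as soon as they have been examined.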


Answer 7

Data into list

Assume that we have a text file with our data, as in the following lines:

Text file content:

line 1
line 2
line 3
  • Open the cmd in the same directory (right-click the mouse and choose cmd or PowerShell)
  • Run python and in the interpreter write:

The Python script:

>>> with open("myfile.txt", encoding="utf-8") as file:
...     x = [l.strip() for l in file]
>>> x
['line 1', 'line 2', 'line 3']

Using append:

x = []
with open("myfile.txt") as file:
    for l in file:
        x.append(l.strip())

Or:

>>> x = open("myfile.txt").read().splitlines()
>>> x
['line 1', 'line 2', 'line 3']

Or:

>>> x = open("myfile.txt").readlines()
>>> x
['line 1\n', 'line 2\n', 'line 3\n']

Or:

>>> y = [x.rstrip() for x in open("myfile.txt")]
>>> y
['line 1', 'line 2', 'line 3']


with open('testodiprova.txt', 'r', encoding='utf-8') as file:
    lines = file.read().splitlines()
print(lines)

with open('testodiprova.txt', 'r', encoding='utf-8') as file:
    lines = file.readlines()
print(lines)

Answer 8

To read a file into a list you need to do three things:

  • Open the file
  • Read the file
  • Store the contents as list

Fortunately Python makes it very easy to do these things so the shortest way to read a file into a list is:

lst = list(open(filename))

However I’ll add some more explanation.

Opening the file

I assume that you want to open a specific file and you don’t deal directly with a file-handle (or a file-like-handle). The most commonly used function to open a file in Python is open. In Python 2.7 it takes one mandatory argument and two optional ones:

  • Filename
  • Mode
  • Buffering (I’ll ignore this argument in this answer)

The filename should be a string that represents the path to the file. For example:

open('afile')   # opens the file named afile in the current working directory
open('adir/afile')            # relative path (relative to the current working directory)
open('C:/users/aname/afile')  # absolute path (windows)
open('/usr/local/afile')      # absolute path (linux)

Note that the file extension needs to be specified. This is especially important for Windows users because file extensions like .txt or .doc, etc. are hidden by default when viewed in the explorer.

The second argument is the mode; it’s r by default, which means “read-only”. That’s exactly what you need in your case.

But in case you actually want to create a file and/or write to a file you’ll need a different argument here. There is an excellent answer if you want an overview.

For reading a file you can omit the mode or pass it in explicitly:

open(filename)
open(filename, 'r')

Both will open the file in read-only mode. In case you want to read in a binary file on Windows you need to use the mode rb:

open(filename, 'rb')

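In Python 2 on other platforms, the 'b' (binary mode) is simply ignored. In Python 3 it matters on every platform, because binary mode returns bytes instead of str.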


Now that I’ve shown how to open the file, let’s talk about the fact that you always need to close it again. Otherwise it will keep an open file-handle to the file until the process exits (or Python garbage-collects the file-handle).

While you could use:

f = open(filename)
# ... do stuff with f
f.close()

That will fail to close the file when something between open and close throws an exception. You could avoid that by using a try and finally:

f = open(filename)
# nothing in between!
try:
    pass  # do stuff with f
finally:
    f.close()

However Python provides context managers that have a prettier syntax (but for open it’s almost identical to the try and finally above):

with open(filename) as f:
    pass  # do stuff with f
# The file is always closed after the with-scope ends.

The last approach is the recommended approach to open a file in Python!

Reading the file

Okay, you’ve opened the file, now how to read it?

The open function returns a file object, which supports Python’s iteration protocol. Each iteration will give you a line:

with open(filename) as f:
    for line in f:
        print(line)

This will print each line of the file. Note however that each line will contain a newline character \n at the end (you might want to check if your Python is built with universal newlines support – otherwise you could also have \r\n on Windows or \r on Mac as newlines). If you don’t want that you could simply remove the last character (or the last two characters on Windows):

with open(filename) as f:
    for line in f:
        print(line[:-1])

But the last line doesn’t necessarily have a trailing newline, so one shouldn’t use that. One could check if it ends with a trailing newline and if so remove it:

with open(filename) as f:
    for line in f:
        if line.endswith('\n'):
            line = line[:-1]
        print(line)

But you could simply remove all whitespace (including the \n character) from the end of the string. This also removes any other trailing whitespace, so you have to be careful if that whitespace is important:

with open(filename) as f:
    for line in f:
        print(line.rstrip())

However, if the lines end with \r\n (Windows “newlines”), .rstrip() will also take care of the \r!
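
To make the trade-off concrete, here is a small sketch comparing a bare rstrip() with one that only targets line-ending characters. The sample string is made up, and it assumes the line was read without newline translation so that the \r is still present:

```python
# Made-up line with meaningful trailing spaces and a Windows line ending.
raw = "value with trailing spaces   \r\n"

stripped_all = raw.rstrip()        # removes the spaces, "\r" and "\n"
stripped_eol = raw.rstrip("\r\n")  # removes only "\r" and "\n" characters

print(repr(stripped_all))
print(repr(stripped_eol))
```

Passing the characters explicitly is one way to keep trailing whitespace that matters.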

Store the contents as list

Now that you know how to open the file and read it, it’s time to store the contents in a list. The simplest option would be to use the list function:

with open(filename) as f:
    lst = list(f)

In case you want to strip the trailing newlines you could use a list comprehension instead:

with open(filename) as f:
    lst = [line.rstrip() for line in f]

Or even simpler: the .readlines() method of the file object returns a list of the lines:

with open(filename) as f:
    lst = f.readlines()

This will also include the trailing newline characters. If you don’t want them, I would recommend the [line.rstrip() for line in f] approach because it avoids keeping two lists containing all the lines in memory.

There’s an additional option to get the desired output, however it’s rather “suboptimal”: read the complete file in a string and then split on newlines:

with open(filename) as f:
    lst = f.read().split('\n')

or:

with open(filename) as f:
    lst = f.read().splitlines()

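These drop the newline characters automatically because the split character isn’t included in the results (although split('\n') leaves an empty string at the end of the list when the file ends with a newline). However, they are not ideal because you keep the file both as a string and as a list of lines in memory!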

Summary

  • Use with open(...) as f when opening files because you don’t need to take care of closing the file yourself and it closes the file even if some exception happens.
  • file objects support the iteration protocol so reading a file line-by-line is as simple as for line in the_file_object:.
  • Always browse the documentation for the available functions/classes. Most of the time there’s a perfect match for the task or at least one or two good ones. The obvious choice in this case would be readlines() but if you want to process the lines before storing them in the list I would recommend a simple list-comprehension.

Answer 9

Clean and Pythonic Way of Reading the Lines of a File Into a List


First and foremost, you should focus on opening your file and reading its contents in an efficient and pythonic way. Here is an example of the way I personally DO NOT prefer:

infile = open('my_file.txt', 'r')  # Open the file for reading.

data = infile.read()  # Read the contents of the file.

infile.close()  # Close the file since we're done using it.

Instead, I prefer the below method of opening files for both reading and writing as it is very clean, and does not require an extra step of closing the file once you are done using it. In the statement below, we’re opening the file for reading, and assigning it to the variable ‘infile.’ Once the code within this statement has finished running, the file will be automatically closed.

# Open the file for reading.
with open('my_file.txt', 'r') as infile:

    data = infile.read()  # Read the contents of the file into memory.

Now we need to focus on bringing this data into a Python List because they are iterable, efficient, and flexible. In your case, the desired goal is to bring each line of the text file into a separate element. To accomplish this, we will use the splitlines() method as follows:

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()

The Final Product:

# Open the file for reading.
with open('my_file.txt', 'r') as infile:

    data = infile.read()  # Read the contents of the file into memory.

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()

Testing Our Code:

  • Contents of the text file:
     A fost odatã ca-n povesti,
     A fost ca niciodatã,
     Din rude mãri împãrãtesti,
     O prea frumoasã fatã.
  • Print statements for testing purposes:
    print(my_list)  # Print the list.

    # Print each line in the list.
    for line in my_list:
        print(line)

    # Print the fourth element in this list.
    print(my_list[3])
  • Output (different-looking because of unicode characters):
     ['A fost odat\xc3\xa3 ca-n povesti,', 'A fost ca niciodat\xc3\xa3,',
     'Din rude m\xc3\xa3ri \xc3\xaemp\xc3\xa3r\xc3\xa3testi,', 'O prea
     frumoas\xc3\xa3 fat\xc3\xa3.']

     A fost odatã ca-n povesti,
     A fost ca niciodatã,
     Din rude mãri împãrãtesti,
     O prea frumoasã fatã.

     O prea frumoasã fatã.

Answer 10

Introduced in Python 3.4, pathlib has a really convenient method for reading in text from files, as follows:

from pathlib import Path
p = Path('my_text_file')
lines = p.read_text().splitlines()

(The splitlines call is what turns it from a string containing the whole contents of the file to a list of lines in the file).

pathlib has a lot of handy conveniences in it. read_text is nice and concise, and you don’t have to worry about opening and closing the file. If all you need to do with the file is read it all in one go, it’s a good choice.
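
A self-contained sketch of the same idea, assuming a sample file written to a temporary directory (the path and contents are hypothetical). Passing encoding explicitly is optional, but it avoids depending on the platform default:

```python
import tempfile
from pathlib import Path

# Hypothetical sample file in a temporary directory.
p = Path(tempfile.mkdtemp()) / "my_text_file.txt"
p.write_text("alpha\nbeta\n", encoding="utf-8")

# read_text opens, reads, and closes the file in one call.
lines = p.read_text(encoding="utf-8").splitlines()
print(lines)
```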


Answer 11

Here’s one more option, using a list comprehension on the file:

lines = [line.rstrip() for line in open('file.txt')]

This should be more efficient, since most of the work is done inside the Python interpreter.


Answer 12

f = open("your_file.txt",'r')
out = f.readlines() # returns a list of all the lines

Now variable out is a list (array) of what you want. You could either do:

for line in out:
    print (line)

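Or, instead of calling readlines() first (which leaves the file object exhausted), iterate over the file directly: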

for line in f:
    print (line)

You’ll get the same results.


Answer 13

Read and write text files with Python 2 and Python 3; it works with Unicode

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Define data
lines = ['     A first string  ',
         'A Unicode sample: €',
         'German: äöüß']

# Write text file
with open('file.txt', 'w') as fp:
    fp.write('\n'.join(lines))

# Read text file
with open('file.txt', 'r') as fp:
    read_lines = fp.readlines()
    read_lines = [line.rstrip('\n') for line in read_lines]

print(lines == read_lines)

Things to notice:

  • with is a so-called context manager. It makes sure that the opened file is closed again.
  • All solutions here that simply call .strip() or .rstrip() will fail to reproduce the original lines, because they also strip other whitespace.
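
The second point can be checked with a short sketch. The sample strings are made up, and a joined string stands in for the file contents:

```python
original = ["     A first string  ", "plain"]
text = "\n".join(original) + "\n"  # stand-in for the file contents

raw_lines = text.splitlines(keepends=True)  # like readlines(): newlines kept

with_strip = [line.strip() for line in raw_lines]
with_rstrip_newline = [line.rstrip("\n") for line in raw_lines]

print(with_strip == original)           # leading/trailing spaces were lost
print(with_rstrip_newline == original)  # only the newline was removed
```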

Common file endings

.txt

More advanced file writing/reading

For your application, the following might be important:

  • Support by other programming languages
  • Reading/writing performance
  • Compactness (file size)

See also: Comparison of data serialization formats

In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python.


Answer 14

Another option is numpy.genfromtxt, for example:

import numpy as np
data = np.genfromtxt("yourfile.dat",delimiter="\n")

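This will make data a NumPy array with as many rows as are in your file. Note that genfromtxt parses values as floats by default, so this suits numeric data; pass dtype=str if the lines are text.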


Answer 15

If you’d like to read a file from the command line or from stdin, you can also use the fileinput module:

# reader.py
import fileinput

content = []
for line in fileinput.input():
    content.append(line.strip())

fileinput.close()

Pass files to it like so:

$ python reader.py textfile.txt 

Read more here: http://docs.python.org/2/library/fileinput.html


Answer 16

The simplest way to do it

A simple way is to:

  1. Read the whole file as a string
  2. Split the string line by line

In one line, that would give:

lines = open('C:/path/file.txt').read().splitlines()

However, this is quite an inefficient way, as it will store 2 versions of the content in memory (probably not a big issue for small files, but still). [Thanks Mark Amery].

There are 2 easier ways:

  1. Using the file as an iterator:
lines = list(open('C:/path/file.txt'))
# ... or if you want to have a list without EOL characters
lines = [l.rstrip() for l in open('C:/path/file.txt')]
  2. If you are using Python 3.4 or above, better use pathlib to create a path for your file that you could use for other operations in your program:
from pathlib import Path
file_path = Path("C:/path/file.txt")
lines = file_path.read_text().splitlines()
# ... or ...
lines = [l.rstrip() for l in file_path.open()]

Answer 17

Just use the splitlines() function. Here is an example.

inp = "file.txt"
data = open(inp)
dat = data.read()
lst = dat.splitlines()
print(lst)
# print lst  # for Python 2

In the output you will have the list of lines.


Answer 18

If you are faced with a very large file and want to read it faster (imagine you are in a Topcoder/Hackerrank coding competition), you might read a considerably bigger chunk of lines into a memory buffer at one time, rather than just iterating line by line at file level.

buffersize = 2**16
with open(path) as f: 
    while True:
        lines_buffer = f.readlines(buffersize)
        if not lines_buffer:
            break
        for line in lines_buffer:
            process(line)
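
The same loop can be wrapped in a generator, which may be easier to reuse. This is only a sketch: the helper name read_in_chunks is hypothetical, and io.StringIO stands in for a large file so that the example runs on its own:

```python
import io


def read_in_chunks(f, hint=2**16):
    """Yield lists of lines, each list holding roughly `hint` characters."""
    while True:
        chunk = f.readlines(hint)
        if not chunk:
            break
        yield chunk


# io.StringIO stands in for a large open file.
source = io.StringIO("".join(f"line {i}\n" for i in range(1000)))
all_lines = [line for chunk in read_in_chunks(source, hint=256) for line in chunk]
print(len(all_lines))
```

Whether this is actually faster than plain iteration depends on the file and the Python version, so it is worth measuring before relying on it.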

Answer 19

The easiest ways to do that with some additional benefits are:

lines = list(open('filename'))

or

lines = tuple(open('filename'))

or

lines = set(open('filename'))

In the case of set, remember that the line order is not preserved and duplicate lines are dropped.

Below I added an important supplement from @MarkAmery:

Since you’re not calling .close on the file object nor using a with statement, in some Python implementations the file may not get closed after reading and your process will leak an open file handle.

In CPython (the normal Python implementation that most people use), this isn’t a problem since the file object will get immediately garbage-collected and this will close the file, but it’s nonetheless generally considered best practice to do something like:

with open('filename') as f: lines = list(f) 

to ensure that the file gets closed regardless of what Python implementation you’re using.
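
If the goal of using set is only to drop duplicate lines, one possible alternative that keeps the original order is dict.fromkeys. This is a sketch rather than part of the answer above, and the list below is a stand-in for list(f):

```python
# Stand-in for the lines read from a file.
lines = ["b\n", "a\n", "b\n", "c\n", "a\n"]

# dict keys are unique and keep insertion order (guaranteed since Python 3.7).
unique_in_order = list(dict.fromkeys(lines))
print(unique_in_order)
```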


Answer 20

Use this:

import pandas as pd
data = pd.read_csv(filename) # You can also add parameters such as header, sep, etc.
array = data.values

data is a DataFrame, and its values attribute returns an ndarray. You can also get a list by using array.tolist().


Answer 21

Outline and Summary

With a filename, handling the file from a Path(filename) object, or directly with open(filename) as f, do one of the following:

  • list(fileinput.input(filename))
  • using with path.open() as f, call f.readlines()
  • list(f)
  • path.read_text().splitlines()
  • path.read_text().splitlines(keepends=True)
  • iterate over fileinput.input or f and list.append each line one at a time
  • pass f to a bound list.extend method
  • use f in a list comprehension

I explain the use-case for each below.

In Python, how do I read a file line-by-line?

This is an excellent question. First, let’s create some sample data:

from pathlib import Path
Path('filename').write_text('foo\nbar\nbaz')

File objects are lazy iterators, so just iterate over it.

filename = 'filename'
with open(filename) as f:
    for line in f:
        line # do something with the line

Alternatively, if you have multiple files, use fileinput.input, another lazy iterator. With just one file:

import fileinput

for line in fileinput.input(filename): 
    line # process the line

or for multiple files, pass it a list of filenames:

for line in fileinput.input([filename]*2): 
    line # process the line
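
For a runnable version of the multi-file case, here is a sketch that writes two small files to a temporary directory (names and contents are made up) and reads them back as one stream. Using fileinput.input as a context manager closes the stream afterwards:

```python
import fileinput
import os
import tempfile

# Create two small sample files; names and contents are illustrative only.
tmp_dir = tempfile.mkdtemp()
paths = []
for name, body in [("a.txt", "1\n2\n"), ("b.txt", "3\n")]:
    path = os.path.join(tmp_dir, name)
    with open(path, "w") as out:
        out.write(body)
    paths.append(path)

# Lines from both files arrive as a single lazy sequence.
with fileinput.input(files=paths) as stream:
    combined = [line.rstrip("\n") for line in stream]

print(combined)
```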

Again, f and fileinput.input above both are (or return) lazy iterators. You can only use an iterator one time, so to provide functional code while avoiding verbosity, I’ll use the slightly more terse fileinput.input(filename) where appropriate from here on.

In Python, how do I read a file line-by-line into a list?

Ah but you want it in a list for some reason? I’d avoid that if possible. But if you insist… just pass the result of fileinput.input(filename) to list:

list(fileinput.input(filename))

Another direct answer is to call f.readlines, which returns the lines of the file as a list (up to an optional hint, an approximate number of characters, so you could break the file up into multiple lists that way).

You can get to this file object two ways. One way is to pass the filename to the open builtin:

filename = 'filename'

with open(filename) as f:
    f.readlines()

or using the new Path object from the pathlib module (which I have become quite fond of, and will use from here on):

from pathlib import Path

path = Path(filename)

with path.open() as f:
    f.readlines()
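The optional hint argument to readlines can be sketched as follows. This is only an illustration: read_in_chunks is a hypothetical helper, io.StringIO stands in for an open text file, and the number of lines per chunk is not guaranteed because the hint is treated as an approximate size.

```python
import io

def read_in_chunks(f, hint):
    """Yield successive lists of lines, each holding roughly `hint` characters."""
    while True:
        chunk = f.readlines(hint)
        if not chunk:  # an empty list means end of file
            break
        yield chunk

# io.StringIO stands in for an open text file here
f = io.StringIO('foo\nbar\nbaz\nqux\n')
chunks = list(read_in_chunks(f, 8))

# Flattening the chunks gives back every line, in order.
all_lines = [line for chunk in chunks for line in chunk]
```

Each chunk is an ordinary list, so you can process one chunk and discard it before reading the next.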

list will also consume the file iterator and return a list – a quite direct method as well:

with path.open() as f:
    list(f)

If you don’t mind reading the entire text into memory as a single string before splitting it, you can do this as a one-liner with the Path object and the splitlines() string method. By default, splitlines removes the newlines:

path.read_text().splitlines()

If you want to keep the newlines, pass keepends=True:

path.read_text().splitlines(keepends=True)
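A quick sketch of how these variants differ, using an in-memory string (an assumption for brevity) instead of a file. Note that splitlines does not produce a trailing empty string when the text ends with a newline, whereas split('\n') does.

```python
text = 'foo\nbar\nbaz\n'  # note the trailing newline

without_newlines = text.splitlines()
with_newlines = text.splitlines(keepends=True)
plain_split = text.split('\n')

print(without_newlines)  # ['foo', 'bar', 'baz']
print(with_newlines)     # ['foo\n', 'bar\n', 'baz\n']
print(plain_split)       # ['foo', 'bar', 'baz', '']
```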

I want to read the file line by line and append each line to the end of the list.

Now this is a bit silly to ask for, given that we’ve demonstrated the end result easily with several methods. But you might need to filter or operate on the lines as you make your list, so let’s humor this request.

Using list.append would allow you to filter or operate on each line before you append it:

line_list = []
for line in fileinput.input(filename):
    line_list.append(line)

line_list
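As a sketch of what filtering or operating on each line might look like, the loop below skips blank lines and comment lines and upper-cases the rest. The helper name collect_lines and the filtering rules are assumptions chosen for illustration, and io.StringIO stands in for an open file.

```python
import io

def collect_lines(f):
    """Append stripped, non-blank, non-comment lines from an open file to a list."""
    line_list = []
    for line in f:
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue  # skip blank lines and comments
        line_list.append(stripped.upper())  # operate on the line before appending
    return line_list

result = collect_lines(io.StringIO('foo\n\n# a comment\nbar\n'))
print(result)  # ['FOO', 'BAR']
```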

Using list.extend would be a bit more direct, and perhaps useful if you have a preexisting list:

line_list = []
line_list.extend(fileinput.input(filename))
line_list

Or more idiomatically, we could instead use a list comprehension, and map and filter inside it if desirable:

[line for line in fileinput.input(filename)]

Or even more directly, to close the circle, just pass it to list to create a new list directly without operating on the lines:

list(fileinput.input(filename))

Conclusion

You’ve seen many ways to get lines from a file into a list, but I’d recommend you avoid materializing large quantities of data into a list and instead use Python’s lazy iteration to process the data if possible.

That is, prefer fileinput.input or with path.open() as f.
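A minimal sketch of that recommendation: the function below computes a result from a file without ever building a list of its lines. The helper name longest_line_length and the sample file are assumptions for illustration.

```python
import tempfile
from pathlib import Path

def longest_line_length(path):
    """Return the length of the longest line without building a list of lines."""
    with path.open() as f:
        # The generator expression feeds max() one line at a time.
        return max((len(line.rstrip('\n')) for line in f), default=0)

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / 'sample.txt'  # hypothetical sample file
    sample.write_text('foo\nbarbaz\nqux')
    longest = longest_line_length(sample)

print(longest)  # 6
```

Only one line is held in memory at a time, so the same code works for files far larger than available RAM.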


Answer 22

If the file also contains empty lines, I like to read in the content and pass it through filter to drop the empty-string elements:

with open(myFile, "r") as f:
    excludeFileContent = list(filter(None, f.read().splitlines()))
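One detail worth seeing in a sketch: filter(None, ...) removes only falsy items, so truly empty strings are dropped while whitespace-only lines are kept. The sample text below is an assumption for illustration; the comprehension shows one way to drop whitespace-only lines as well.

```python
lines = 'first\n\n   \nsecond\n'.splitlines()

# filter(None, ...) drops only the empty string ''
non_empty = list(filter(None, lines))

# testing line.strip() also drops whitespace-only lines
non_blank = [line for line in lines if line.strip()]

print(non_empty)  # ['first', '   ', 'second']
print(non_blank)  # ['first', 'second']
```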

Answer 23

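You could also use the loadtxt function in NumPy. It checks for fewer conditions than genfromtxt, so it may be faster. Note that it returns a NumPy array rather than a list and, by default, parses every value as a float, so it suits files that hold one number per line. Recent NumPy versions may also reject a newline as the delimiter; if so, omit the delimiter argument.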

import numpy
data = numpy.loadtxt(filename, delimiter="\n")

Answer 24

I like to use the following, which reads all the lines at once:

contents = []
with open(filepath, 'r') as f:  # the with block closes the file afterwards
    for line in f.readlines():
        contents.append(line.strip())

Or using a list comprehension:

with open(filepath, 'r') as f:
    contents = [line.strip() for line in f.readlines()]

Answer 25

I would try one of the methods below. The example file that I use is named dummy.txt. I assume that the file is in the same directory as the code (you can change fpath to include the proper file name and folder path).

In both examples below, the list that you want is lst.

1. First method:

fpath = 'dummy.txt'
with open(fpath, "r") as f:
    # strip trailing newlines, spaces and tabs from each line
    lst = [line.rstrip('\n \t') for line in f]

print(lst)
# ['THIS IS LINE1.', 'THIS IS LINE2.', 'THIS IS LINE3.', 'THIS IS LINE4.']

2. In the second method, one can use csv.reader from the csv module in the Python standard library:

import csv
fpath = 'dummy.txt'
with open(fpath, newline='') as csv_file:
    # csv requires a single-character delimiter; a tab is assumed here
    csv_reader = csv.reader(csv_file, delimiter='\t')
    lst = [row[0] for row in csv_reader]

print(lst)
# ['THIS IS LINE1.', 'THIS IS LINE2.', 'THIS IS LINE3.', 'THIS IS LINE4.']

You can use either of the two methods. They take almost the same time to create lst.
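One caveat of the csv approach can be sketched as follows: csv.reader returns an empty list for a blank line, so indexing row[0] would raise IndexError unless empty rows are skipped. The tab-separated sample data is an assumption for illustration, and io.StringIO stands in for an open file.

```python
import csv
import io

# io.StringIO stands in for a file opened with newline=''
data = io.StringIO('THIS IS LINE1.\tx\n\nTHIS IS LINE2.\ty\n')
reader = csv.reader(data, delimiter='\t')

# Blank lines come back as empty lists, so guard before indexing row[0].
first_fields = [row[0] for row in reader if row]
print(first_fields)  # ['THIS IS LINE1.', 'THIS IS LINE2.']
```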


Answer 26

Here is a Python 3 helper class that I use to simplify file I/O:

import os

# handle files using a callback method, prevents repetition
# (the name matches the name-mangled form of the __file_handler calls inside FileIO)
def _FileIO__file_handler(file_path, mode, callback = lambda f: None):
  f = open(file_path, mode)
  try:
    return callback(f)
  except Exception as e:
    raise IOError("Failed to %s file" % ["write to", "read from"][mode.lower() in "r rb r+".split(" ")])
  finally:
    f.close()


class FileIO:
  # return the contents of a file
  def read(file_path, mode = "r"):
    return __file_handler(file_path, mode, lambda rf: rf.read())

  # get the lines of a file
  def lines(file_path, mode = "r", filter_fn = lambda line: len(line) > 0):
    return [line for line in FileIO.read(file_path, mode).strip().split("\n") if filter_fn(line)]

  # create or update a file (NOTE: can also be used to replace a file's original content)
  def write(file_path, new_content, mode = "w"):
    return __file_handler(file_path, mode, lambda wf: wf.write(new_content))

  # delete a file (if it exists)
  def delete(file_path):
    return os.remove(file_path) if os.path.isfile(file_path) else None

You would then use the FileIO.lines function, like this:

file_ext_lines = FileIO.lines("./path/to/file.ext")
for i, line in enumerate(file_ext_lines):
  print("Line {}: {}".format(i + 1, line))

Remember that the mode ("r" by default) and filter_fn (which drops empty lines by default) parameters are optional.

You could even remove the read, write and delete methods and keep only FileIO.lines, or turn it into a standalone function called read_lines.
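A standalone read_lines might look like the sketch below. It is not the answer's own code: it swaps in a with block and str.splitlines in place of the callback handler and strip().split("\n"), and the sample file is an assumption for the demonstration.

```python
import tempfile
from pathlib import Path

def read_lines(file_path, mode="r", filter_fn=lambda line: len(line) > 0):
    """Return the lines of a file, without newlines, that pass filter_fn."""
    with open(file_path, mode) as f:
        return [line for line in f.read().splitlines() if filter_fn(line)]

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "file.ext"  # hypothetical sample file
    sample.write_text("alpha\n\nbeta\n")
    lines = read_lines(sample)
    short_lines = read_lines(sample, filter_fn=lambda line: len(line) == 4)

print(lines)        # ['alpha', 'beta']
print(short_lines)  # ['beta']
```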


Answer 27

Command line version

#!/bin/python3
import os
import sys

abspath = os.path.abspath(__file__)
dname = os.path.dirname(abspath)
# join with a path separator; plain concatenation would drop it
filename = os.path.join(dname, sys.argv[1])
with open(filename) as f:
    arr = f.read().split("\n")
print(arr)

Run with:

python3 somefile.py input_file_name.txt
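A slightly more defensive variant could check the argument count before reading. This is a sketch under stated assumptions: lines_from_args is a hypothetical helper, and the base directory is passed in explicitly so the function can be exercised without a real script file.

```python
import tempfile
from pathlib import Path

def lines_from_args(argv, base_dir):
    """Read the file named by argv[1], relative to base_dir, into a list of lines."""
    if len(argv) != 2:
        raise SystemExit(f"usage: {argv[0]} input_file_name.txt")
    return (Path(base_dir) / argv[1]).read_text().splitlines()

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "input_file_name.txt").write_text("one\ntwo\n")
    arr = lines_from_args(["somefile.py", "input_file_name.txt"], tmp)

print(arr)  # ['one', 'two']
```

In a real script you would call lines_from_args(sys.argv, Path(__file__).resolve().parent) to resolve the name relative to the script's directory, as the answer does.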
