Question: Delete a specific line from a file using Python

Let’s say I have a text file full of nicknames. How can I delete a specific nickname from this file, using Python?


Answer 0

First, open the file and get all your lines from the file. Then reopen the file in write mode and write your lines back, except for the line you want to delete:

with open("yourfile.txt", "r") as f:
    lines = f.readlines()
with open("yourfile.txt", "w") as f:
    for line in lines:
        if line.strip("\n") != "nickname_to_delete":
            f.write(line)

The strip("\n") call in the comparison is needed because, if your file doesn't end with a newline character, the very last line won't either.
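The same read-then-rewrite idea can be wrapped in a small helper. This is only a sketch: the function name `remove_line`, the UTF-8 encoding, and the returned count are my own assumptions, not part of the answer above.

```python
def remove_line(path, target):
    """Rewrite `path` without the lines whose text equals `target`.

    Returns how many lines were dropped (hypothetical convenience).
    """
    # First pass: load every line into memory.
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()

    # Compare without the trailing newline so the last line also matches
    # when the file does not end with "\n".
    kept = [line for line in lines if line.rstrip("\n") != target]

    # Second pass: "w" mode truncates the file before writing.
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(kept)

    return len(lines) - len(kept)
```

Calling `remove_line("yourfile.txt", "nickname_to_delete")` would then do the same thing as the snippet above and tell you whether anything was actually removed.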


Answer 1

Solution to this problem with only a single open:

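with open("target.txt", "r+") as f:
    lines = f.readlines()
    f.seek(0)  # rewind to the start before rewriting
    for line in lines:
        # strip the newline so the comparison matches the text of the line
        if line.strip("\n") != "line you want to remove...":
            f.write(line)
    f.truncate()  # cut off what is left of the old, longer content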

This solution opens the file in read/write mode ("r+"), uses seek to move the file pointer back to the start, and then calls truncate to remove everything after the last write.
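To see why the truncate call matters, here is a small sketch of the same "r+" technique with truncation made optional. The function name `rewrite_without` and its `truncate` flag are hypothetical and exist only for this demonstration.

```python
def rewrite_without(path, target, truncate=True):
    """Remove lines equal to `target` from `path` using a single open."""
    with open(path, "r+", encoding="utf-8") as f:
        lines = f.readlines()
        f.seek(0)  # go back to the beginning and overwrite from there
        f.writelines(l for l in lines if l.rstrip("\n") != target)
        if truncate:
            # Without this, the tail of the old content stays in the file,
            # because the new content is shorter than the old one.
            f.truncate()
```

For a file containing the three lines aaa, bbb and ccc, removing "aaa" with `truncate=False` leaves bbb, ccc, ccc in the file: the last line of the old content is still there. With the default `truncate=True` the file ends after ccc as expected.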


Answer 2

The best and fastest option, rather than storing everything in a list and re-opening the file to write it, is in my opinion to re-write the file elsewhere.

with open("yourfile.txt", "r") as infile, open("newfile.txt", "w") as outfile:
    for line in infile:  # "infile" avoids shadowing the built-in input()
        if line.strip("\n") != "nickname_to_delete":
            outfile.write(line)

That's it! A single loop does the same thing without loading the whole file into a list, which should be faster for large files.
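If you want the result to end up under the original name, one option is to write to a temporary file next to the original and then swap it in with os.replace. This is a sketch under my own assumptions: the function name `remove_line_via_copy`, the ".tmp" suffix, and the UTF-8 encoding are not from the answer above.

```python
import os


def remove_line_via_copy(path, target, tmp_suffix=".tmp"):
    """Stream `path` into a sibling temp file, then replace the original."""
    tmp_path = path + tmp_suffix  # hypothetical naming scheme
    with open(path, "r", encoding="utf-8") as src, \
            open(tmp_path, "w", encoding="utf-8") as dst:
        for line in src:  # one line in memory at a time
            if line.rstrip("\n") != target:
                dst.write(line)
    # os.replace overwrites the destination if it already exists.
    os.replace(tmp_path, path)
```

Keeping the temporary file in the same directory means the final swap is a rename on the same filesystem rather than a copy.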


Answer 3

This is a "fork" of @Lother's answer (which I believe should be considered the right answer).


For a file like this:

$ cat file.txt 
1: october rust
2: november rain
3: december snow

This fork from Lother’s solution works fine:

#!/usr/bin/python3.4

with open("file.txt","r+") as f:
    new_f = f.readlines()
    f.seek(0)
    for line in new_f:
        if "snow" not in line:
            f.write(line)
    f.truncate()

Improvements:

  • with open, which removes the need to call f.close()
  • a clearer if condition for checking that the string is not present in the current line

Answer 4

The issue with reading all the lines in a first pass and making changes (deleting specific lines) in a second pass is that if your files are huge, you will run out of RAM. A better approach is to read the lines one by one and write them into a separate file, skipping the ones you don't need. I have run this approach on files of 12-50 GB, and RAM usage remains almost constant. Only CPU cycles show processing in progress.
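A rough way to check the memory claim on a small scale is to compare the peak allocation of both approaches with tracemalloc. This is only a sketch: the helper names `peak_bytes`, `filter_in_memory` and `filter_streaming` are hypothetical, and the numbers you get will depend on your Python version and buffer sizes.

```python
import tracemalloc


def peak_bytes(func, *args):
    """Run func(*args) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    func(*args)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def filter_in_memory(src, dst, target):
    # Two-pass approach: the whole file is held in a list.
    with open(src, encoding="utf-8") as f:
        lines = f.readlines()
    with open(dst, "w", encoding="utf-8") as out:
        out.writelines(l for l in lines if l.rstrip("\n") != target)


def filter_streaming(src, dst, target):
    # One-pass approach: only the current line (plus I/O buffers) is held.
    with open(src, encoding="utf-8") as f, \
            open(dst, "w", encoding="utf-8") as out:
        for line in f:
            if line.rstrip("\n") != target:
                out.write(line)
```

Comparing `peak_bytes(filter_in_memory, ...)` with `peak_bytes(filter_streaming, ...)` on the same input should show the first growing with the file size while the second stays roughly flat.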


Answer 5

I liked the fileinput approach as explained in this answer: Deleting a line from a text file (python)

For example, say I have a file that contains empty lines and I want to remove them. Here is how I solved it:

import fileinput
import sys

# inplace=True redirects standard output into the file being read
for line in fileinput.input('file1.txt', inplace=True):
    if len(line) > 1:  # an empty line holds only "\n", so its length is 1
        sys.stdout.write(line)

Note: The empty lines in my case had length 1
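A slightly more defensive variant, as a sketch: using fileinput as a context manager and testing `line.strip()` also drops lines that contain only spaces or tabs, which the length check above would keep. The function name `drop_blank_lines` is hypothetical.

```python
import fileinput


def drop_blank_lines(path):
    """Remove empty and whitespace-only lines from `path` in place."""
    # While the block runs, print() writes into the file, not the console.
    with fileinput.input(files=(path,), inplace=True) as stream:
        for line in stream:
            if line.strip():          # something other than whitespace
                print(line, end="")   # the line already ends with "\n"
```

Because standard output is redirected for the duration of the loop, any debugging print placed inside it would end up in the file as well.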


Answer 6

If you use Linux, you can try the following approach.
Suppose you have a text file named animal.txt:

$ cat animal.txt  
dog
pig
cat 
monkey         
elephant  

Delete every line containing "dog" (here, the first line):

>>> import subprocess
>>> subprocess.call(['sed','-i','/.*dog.*/d','animal.txt']) 

then

$ cat animal.txt
pig
cat
monkey
elephant
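If sed is not available, or you would rather not shell out, roughly the same effect can be had in pure Python with the re module. This is a sketch, not a drop-in replacement for sed: the function name `delete_matching_lines` is hypothetical, and Python's regex syntax differs from sed's in places.

```python
import re


def delete_matching_lines(path, pattern):
    """Delete every line of `path` in which `pattern` matches somewhere."""
    regex = re.compile(pattern)
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    with open(path, "w", encoding="utf-8") as f:
        # re.search finds the pattern anywhere in the line, like /dog/d
        f.writelines(line for line in lines if not regex.search(line))
```

With the file above, `delete_matching_lines("animal.txt", "dog")` would remove the dog line and leave the rest untouched.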

Answer 7

I think that if you read the file into a list, you can then iterate over the list to look for the nickname you want to get rid of. You can do this efficiently without creating additional files, but you'll have to write the result back to the source file.

Here’s how I might do this:

import csv # and other imports you need
nicknames_to_delete = ['Nick', 'Stephen', 'Mark']

I’m assuming nicknames.csv contains data like:

Nick
Maria
James
Chris
Mario
Stephen
Isabella
Ahmed
Julia
Mark
...

Then load the file into the list:

 nicknames = None
 with open("nicknames.csv") as sourceFile:
     nicknames = sourceFile.read().splitlines()

Next, iterate over the list of names to delete and remove each match:

for nick in nicknames_to_delete:
    if nick in nicknames:
        # the membership test above means remove() cannot raise ValueError
        nicknames.remove(nick)
    else:
        print(nick + " is not found in the file")

Lastly, write the result back to file:

with open("nicknames.csv", "w", newline="") as nicknamesFile:
    # "w" mode empties the file on open, so no seek/truncate is needed,
    # and the with block closes the file for us
    nicknamesWriter = csv.writer(nicknamesFile)
    for name in nicknames:
        nicknamesWriter.writerow([str(name)])
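The three steps can also be folded into one function. This is a sketch with assumptions of my own: the function name `remove_nicknames` is hypothetical, the file is treated as plain text with one nickname per line, and, unlike the loop above, every occurrence of an unwanted nickname is removed rather than only the first.

```python
def remove_nicknames(path, to_delete):
    """Remove all nicknames in `to_delete` from `path`.

    Returns the sorted list of requested nicknames that were not found.
    """
    unwanted = set(to_delete)  # set membership tests are fast
    with open(path, encoding="utf-8") as f:
        names = f.read().splitlines()

    kept = [name for name in names if name not in unwanted]
    missing = sorted(unwanted - set(names))

    with open(path, "w", encoding="utf-8") as f:
        f.writelines(name + "\n" for name in kept)
    return missing
```

The return value replaces the "is not found in the file" message, so the caller can decide how to report missing names.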

Answer 8

In general, you can’t; you have to write the whole file again (at least from the point of change to the end).

In some specific cases you can do better than this –

if all your data elements are the same length and in no specific order, and you know the offset of the one you want to get rid of, you could copy the last item over the one to be deleted and truncate the file before the last item;

or you could just overwrite the data chunk with a ‘this is bad data, skip it’ value or keep a ‘this item has been deleted’ flag in your saved data elements such that you can mark it deleted without otherwise modifying the file.

This is probably overkill for short documents (anything under 100 KB?).
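For the fixed-length case, the "copy the last item over the deleted one, then truncate" idea could look like the sketch below. Everything here is an assumption for illustration: the function name `delete_record`, the 16-byte default record size, and a binary file made of nothing but equal-sized records in no particular order.

```python
import os

RECORD_SIZE = 16  # hypothetical size of one record, in bytes


def delete_record(path, index, record_size=RECORD_SIZE):
    """Delete record number `index` (0-based) without rewriting the file."""
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        count = f.tell() // record_size
        if not 0 <= index < count:
            raise IndexError("record index out of range")

        last_offset = (count - 1) * record_size
        if index != count - 1:
            # Copy the last record over the one being deleted.
            f.seek(last_offset)
            last = f.read(record_size)
            f.seek(index * record_size)
            f.write(last)
        # Drop the last record, which is now either moved or unwanted.
        f.truncate(last_offset)
```

Only one record is read and written regardless of the file size, at the cost of changing the order of the records.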


Answer 9

You have probably already got a correct answer, but here is mine. Instead of using a list to collect the unfiltered data (which is what the readlines() method does), I use two files. One holds the main data, and the second is used for filtering the data when you delete a specific string. Here is the code:

main_file = open('data_base.txt').read()    # your main database file
filter_file = open('filter_base.txt', 'w')
filter_file.write(main_file)                # copy the data into the filter file
filter_file.close()
main_file = open('data_base.txt', 'w')
for line in open('filter_base.txt'):        # same file name as written above
    if 'your data to delete' not in line:    # remove a specific string
        main_file.write(line)                # put all strings back to your db except deleted
main_file.close()

Hope you will find this useful! :)


Answer 10

Save the file's lines in a list, remove the line you want to delete from the list, and write the remaining lines to a new file:

with open("file_name.txt", "r") as f:
    lines = f.readlines() 
    lines.remove("Line you want to delete\n")
    with open("new_file.txt", "w") as new_f:
        for line in lines:        
            new_f.write(line)

Answer 11

Here's another method to remove one or more lines from a file:

src_file = "zzzz.txt"
idx = 0  # placeholder: index of the line to remove, set as needed
f = open(src_file, "r")
contents = f.readlines()
f.close()

contents.pop(idx) # remove the line item from list, by line number, starts from 0

f = open(src_file, "w")
contents = "".join(contents)
f.write(contents)
f.close()

Answer 12

I like this approach, which uses fileinput with the inplace option:

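import fileinput

fname = 'file.txt'  # placeholder: path of the file to edit
for line in fileinput.input(fname, inplace=True):
    if 'UnwantedWord' not in line:
        # print without stripping, so the kept lines are left unchanged
        print(line, end='')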

It’s a little less wordy than the other answers and is fast enough for


Answer 13

You can use the re library.

This assumes that you are able to load your full text file into memory. You then define a list of unwanted nicknames and substitute them with an empty string ("").

# Delete unwanted characters
import re

# Read, then decode for py2 compat.
path_to_file = 'data/nicknames.txt'
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')

# Define unwanted nicknames and substitute them
unwanted_nickname_list = ['SourDough']
text = re.sub("|".join(unwanted_nickname_list), "", text)
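Note that the substitution above removes the nickname text wherever it occurs and leaves an empty line behind. If you want to remove whole lines only, a sketch along these lines may help; the function name `strip_nickname_lines` is hypothetical, and it works on text that has already been read, so writing the result back to the file is left to you.

```python
import re


def strip_nickname_lines(text, unwanted):
    """Return `text` without the lines that exactly equal an unwanted name."""
    if not unwanted:
        return text  # an empty alternation would match blank lines
    # re.escape keeps characters such as "." or "+" in a name literal.
    alternatives = "|".join(re.escape(name) for name in unwanted)
    # ^ and $ anchor to a whole line with re.MULTILINE; the optional \n
    # removes the line break together with the name.
    pattern = re.compile(rf"^(?:{alternatives})$\n?", flags=re.MULTILINE)
    return pattern.sub("", text)
```

Anchoring to the whole line means a name such as SourDough does not also eat the start of a longer nickname that merely begins with it.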

Answer 14

To delete a specific line of a file by its line number:

Replace variables filename and line_to_delete with the name of your file and the line number you want to delete.

filename = 'foo.txt'
line_to_delete = 3
initial_line = 1
file_lines = {}

with open(filename) as f:
    content = f.readlines() 

for line in content:
    file_lines[initial_line] = line.strip()
    initial_line += 1

f = open(filename, "w")
for line_number, line_content in file_lines.items():
    if line_number != line_to_delete:
        f.write('{}\n'.format(line_content))

f.close()
print('Deleted line: {}'.format(line_to_delete))

Example output:

Deleted line: 3

Answer 15

Take the contents of the file and split it by newline into a tuple. Then drop the entry at the line number you want to remove, join the remaining entries, and overwrite the file with the result.
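A minimal sketch of that description, assuming 1-based line numbers and UTF-8 text; the function name `delete_line_number` is hypothetical.

```python
def delete_line_number(path, line_number):
    """Delete the line with the given 1-based number from `path`."""
    with open(path, encoding="utf-8") as f:
        # keepends=True keeps each line's own line break, so joining
        # the pieces back together reproduces the original text.
        lines = tuple(f.read().splitlines(keepends=True))

    if not 1 <= line_number <= len(lines):
        raise IndexError("line number out of range")

    # Tuples are immutable, so build a new one without the unwanted entry.
    remaining = lines[:line_number - 1] + lines[line_number:]

    with open(path, "w", encoding="utf-8") as f:
        f.write("".join(remaining))
```

Like the other read-everything approaches, this holds the whole file in memory, so it suits small files best.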

