Q&A

How do I delete the contents of a folder?

July 25, 2021, Python实用宝典

Question: How do I delete the contents of a folder?

How can I delete the contents of a local folder in Python?

The current project is for Windows, but I would like to see *nix also.

Answer 0

import os, shutil
folder = '/path/to/folder'
for filename in os.listdir(folder):
    file_path = os.path.join(folder, filename)
    try:
        if os.path.isfile(file_path) or os.path.islink(file_path):
            os.unlink(file_path)
        elif os.path.isdir(file_path):
            shutil.rmtree(file_path)
    except Exception as e:
        print('Failed to delete %s. Reason: %s' % (file_path, e))

Answer 1

You can simply do this:

import os
import glob

files = glob.glob('/YOUR/PATH/*')
for f in files:
    os.remove(f)

You can of course use another filter in your path, for example /YOUR/PATH/*.txt, to remove all text files in a directory.
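As a minimal sketch of the filtering idea above: the helper below (`remove_matching` is a hypothetical name, not part of any library) deletes only files and symlinks that match a pattern and skips subdirectories, since os.remove() raises an error when it is given a directory.

```python
import glob
import os


def remove_matching(folder, pattern="*"):
    """Delete files in `folder` whose names match `pattern`.

    Subdirectories are skipped because os.remove() cannot delete them.
    Returns the sorted list of paths that were removed.
    """
    removed = []
    for path in glob.glob(os.path.join(folder, pattern)):
        if os.path.isfile(path) or os.path.islink(path):
            os.remove(path)
            removed.append(path)
    return sorted(removed)
```

Calling `remove_matching('/YOUR/PATH', '*.txt')` would then remove only the text files. Keep in mind that glob patterns such as `*` do not match names that start with a dot.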

Answer 2

You can delete the folder itself, as well as all its contents, using shutil.rmtree:

import shutil
shutil.rmtree('/path/to/folder')

shutil.rmtree(path, ignore_errors=False, onerror=None)

Delete an entire directory tree; path must point to a directory (but not a symbolic link to a directory). If ignore_errors is true, errors resulting from failed removals will be ignored; if false or omitted, such errors are handled by calling a handler specified by onerror or, if that is omitted, they raise an exception.
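To make the ignore_errors behaviour described above concrete, here is a small sketch that works on a scratch directory created with tempfile (the directory names are made up for the example). It only shows the default and ignore_errors=True cases, not a custom error handler.

```python
import os
import shutil
import tempfile

# Build a throwaway tree so the example does not touch real data.
base = tempfile.mkdtemp()
target = os.path.join(base, "folder")
os.makedirs(os.path.join(target, "sub"))

shutil.rmtree(target)  # removes the folder itself and everything in it
print("still exists:", os.path.exists(target))

# A second call on the now-missing path raises an exception by default...
try:
    shutil.rmtree(target)
except FileNotFoundError:
    print("second call raised FileNotFoundError")

# ...but the error is silently ignored with ignore_errors=True.
shutil.rmtree(target, ignore_errors=True)

shutil.rmtree(base)  # clean up the scratch directory
```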

Answer 3

Expanding on mhawke’s answer, this is what I’ve implemented. It removes all the content of a folder but not the folder itself. Tested on Linux with files, folders and symbolic links; it should work on Windows as well.

import os
import shutil

for root, dirs, files in os.walk('/path/to/folder'):
    for f in files:
        os.unlink(os.path.join(root, f))
    for d in dirs:
        shutil.rmtree(os.path.join(root, d))

Answer 4

Using rmtree and recreating the folder could work, but I have run into errors when deleting and immediately recreating folders on network drives.

The proposed solution using walk does not work as it uses rmtree to remove folders and then may attempt to use os.unlink on the files that were previously in those folders. This causes an error.

The posted glob solution will also attempt to delete non-empty folders, causing errors.

I suggest you use:

import os
import shutil

folder_path = '/path/to/folder'
for file_object in os.listdir(folder_path):
    file_object_path = os.path.join(folder_path, file_object)
    if os.path.isfile(file_object_path) or os.path.islink(file_object_path):
        os.unlink(file_object_path)
    else:
        shutil.rmtree(file_object_path)

Answer 5

This:

removes all symbolic links
- dead links
- links to directories
- links to files
removes subdirectories
does not remove the parent directory

Code:

import os
import shutil

for filename in os.listdir(dirpath):
    filepath = os.path.join(dirpath, filename)
    try:
        shutil.rmtree(filepath)
    except OSError:
        # Not a real directory (a file or a symlink): remove it directly
        os.remove(filepath)

Like many other answers, this does not try to adjust permissions to enable removal of files/directories.

Answer 6

As a one-liner:

import os

# Python 2.7
map( os.unlink, (os.path.join( mydir,f) for f in os.listdir(mydir)) )

# Python 3+
list( map( os.unlink, (os.path.join( mydir,f) for f in os.listdir(mydir)) ) )

A more robust solution accounting for files and directories as well would be (written for 2.7; on Python 3, wrap the map() call in list() as above):

def rm(f):
    # os.rmdir() only removes empty directories
    if os.path.isdir(f): return os.rmdir(f)
    if os.path.isfile(f): return os.unlink(f)
    raise TypeError('must be either file or directory')

map( rm, (os.path.join( mydir,f) for f in os.listdir(mydir)) )

Answer 7

Notes: in case someone down voted my answer, I have something to explain here.

Everyone likes short ‘n’ simple answers. However, sometimes the reality is not so simple.
Back to my answer. I know shutil.rmtree() could be used to delete a directory tree. I’ve used it many times in my own projects. But you must realize that the directory itself will also be deleted by shutil.rmtree(). While this might be acceptable for some, it’s not a valid answer for deleting the contents of a folder (without side effects).
I’ll show you an example of the side effects. Suppose that you have a directory with customized owner and mode bits, where there are a lot of contents. Then you delete it with shutil.rmtree() and rebuild it with os.mkdir(). And you’ll get an empty directory with default (inherited) owner and mode bits instead. While you might have the privilege to delete the contents and even the directory, you might not be able to set back the original owner and mode bits on the directory (e.g. you’re not a superuser).
Finally, be patient and read the code. It’s long and ugly (in sight), but proven to be reliable and efficient (in use).

Here’s a long and ugly, but reliable and efficient solution.

It resolves a few problems which are not addressed by the other answerers:

It correctly handles symbolic links, including not calling shutil.rmtree() on a symbolic link (which will pass the os.path.isdir() test if it links to a directory; even the result of os.walk() contains symbolic linked directories as well).
It handles read-only files nicely.

Here’s the code (the only useful function is clear_dir()):

import os
import stat
import shutil


# http://stackoverflow.com/questions/1889597/deleting-directory-in-python
def _remove_readonly(fn, path_, excinfo):
    # Handle read-only files and directories
    if fn is os.rmdir:
        os.chmod(path_, stat.S_IWRITE)
        os.rmdir(path_)
    elif fn is os.remove:
        os.lchmod(path_, stat.S_IWRITE)
        os.remove(path_)


def force_remove_file_or_symlink(path_):
    try:
        os.remove(path_)
    except OSError:
        os.lchmod(path_, stat.S_IWRITE)
        os.remove(path_)


# Code from shutil.rmtree()
def is_regular_dir(path_):
    try:
        mode = os.lstat(path_).st_mode
    except os.error:
        mode = 0
    return stat.S_ISDIR(mode)


def clear_dir(path_):
    if is_regular_dir(path_):
        # Given path is a directory, clear its content
        for name in os.listdir(path_):
            fullpath = os.path.join(path_, name)
            if is_regular_dir(fullpath):
                shutil.rmtree(fullpath, onerror=_remove_readonly)
            else:
                force_remove_file_or_symlink(fullpath)
    else:
        # Given path is a file or a symlink.
        # Raise an exception here to avoid accidentally clearing the content
        # of a symbolic linked directory.
        raise OSError("Cannot call clear_dir() on a symbolic link")
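The side effect described in this answer can be checked with a short sketch. `empty_dir` and `mode_and_inode` are hypothetical helpers written for this illustration (they are simpler than clear_dir() above and do not handle read-only files). The sketch assumes a POSIX system, where the permission bits and inode number are meaningful.

```python
import os
import shutil
import stat
import tempfile


def empty_dir(path):
    """Remove everything inside `path` but keep `path` itself."""
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isdir(full) and not os.path.islink(full):
            shutil.rmtree(full)
        else:
            os.remove(full)


def mode_and_inode(path):
    """Return the permission bits and inode number of `path`."""
    st = os.stat(path)
    return stat.S_IMODE(st.st_mode), st.st_ino


workdir = tempfile.mkdtemp()
os.chmod(workdir, 0o750)  # a customized mode on the directory
os.mkdir(os.path.join(workdir, "sub"))
open(os.path.join(workdir, "file.txt"), "w").close()

before = mode_and_inode(workdir)
empty_dir(workdir)
after = mode_and_inode(workdir)
print("directory metadata preserved:", before == after)

os.rmdir(workdir)  # clean up the scratch directory
```

Because the directory is never removed, its owner, mode bits and inode stay as they were, which is the point the answer makes against the rmtree-then-mkdir approach.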

Answer 8

I’m surprised nobody has mentioned the awesome pathlib to do this job.

If you only want to remove files in a directory it can be a oneliner

from pathlib import Path

[f.unlink() for f in Path("/path/to/folder").glob("*") if f.is_file()]

To also recursively remove directories you can write something like this:

from pathlib import Path
from shutil import rmtree

for path in Path("/path/to/folder").glob("**/*"):
    if path.is_file():
        path.unlink()
    elif path.is_dir():
        rmtree(path)
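A variation on the pathlib approach, as a sketch only: instead of a recursive glob, it walks the direct children of the folder and lets rmtree handle each subdirectory. `clear_folder` is a hypothetical name, and the symlink check is an addition that is not in the answer.

```python
from pathlib import Path
from shutil import rmtree


def clear_folder(folder):
    """Delete the direct children of `folder`, leaving `folder` in place."""
    # list() takes a snapshot so entries are not deleted mid-iteration.
    for child in list(Path(folder).iterdir()):
        if child.is_dir() and not child.is_symlink():
            rmtree(child)   # real subdirectory: remove it with its contents
        else:
            child.unlink()  # file or symlink (including links to directories)
```

Unlike glob("*"), iterdir() also returns names that start with a dot, so hidden files are removed as well.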

Answer 9

import os
import shutil

# Gather directory contents
contents = [os.path.join(target_dir, i) for i in os.listdir(target_dir)]

# Iterate and remove each item in the appropriate manner
[os.remove(i) if os.path.isfile(i) or os.path.islink(i) else shutil.rmtree(i) for i in contents]

An earlier comment also mentions using os.scandir in Python 3.5+. For example:

import os
import shutil

with os.scandir(target_dir) as entries:
    for entry in entries:
        if entry.is_file() or entry.is_symlink():
            os.remove(entry.path)
        elif entry.is_dir():
            shutil.rmtree(entry.path)

Answer 10

You might be better off using os.walk() for this.

os.listdir() doesn’t distinguish files from directories and you will quickly get into trouble trying to unlink these. There is a good example of using os.walk() to recursively remove a directory here, and hints on how to adapt it to your circumstances.

Answer 11

I used to solve the problem this way:

import shutil
import os

shutil.rmtree(dirpath)
os.mkdir(dirpath)

Answer 12

Yet Another Solution:

import sh
sh.rm(sh.glob('/path/to/folder/*'))

Answer 13

I know it’s an old thread, but I found something interesting in the official Python documentation. I’m sharing it as another way of removing all the contents of a directory, because I had permission problems when using shutil.rmtree() and I didn’t want to remove the directory and recreate it. The original is at http://docs.python.org/2/library/os.html#os.walk. Hope that helps someone.

import os

def emptydir(top):
    # Refuse to run on a file system root
    if top == '/' or top == "\\":
        return
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            os.remove(os.path.join(root, name))
        for name in dirs:
            os.rmdir(os.path.join(root, name))
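Why topdown=False matters here can be seen by recording the order in which os.walk() visits directories. The sketch below builds a small scratch tree (the names a, b and leaf.txt are made up) and runs the same bottom-up loop on it.

```python
import os
import tempfile

top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "a", "b"))
open(os.path.join(top, "a", "b", "leaf.txt"), "w").close()

visited = []
for root, dirs, files in os.walk(top, topdown=False):
    visited.append(os.path.relpath(root, top))
    for name in files:
        os.remove(os.path.join(root, name))
    for name in dirs:
        # Each subdirectory was emptied on an earlier iteration,
        # so os.rmdir() (which only removes empty directories) succeeds.
        os.rmdir(os.path.join(root, name))

print(visited)               # deepest directory first, top directory last
remaining = os.listdir(top)  # the top directory itself is kept, now empty
os.rmdir(top)                # clean up the scratch directory
```

With the default topdown=True the parent would be visited first, and os.rmdir() would fail on any subdirectory that still had contents.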

Answer 14

To delete all the files inside the directory as well as its sub-directories, without removing the folders themselves, simply do this:

import os
mypath = "my_folder" #Enter your path here
for root, dirs, files in os.walk(mypath):
    for file in files:
        os.remove(os.path.join(root, file))

Answer 15

If you are using a *nix system, why not leverage the system command?

import os
path = 'folder/to/clean'
os.system('rm -rf %s/*' % path)

Answer 16

Pretty intuitive way of doing it:

import shutil, os


def remove_folder_contents(path):
    shutil.rmtree(path)
    os.makedirs(path)


remove_folder_contents('/path/to/folder')

Answer 17

Well, I think this code works. It will not delete the folder, and you can use this code to delete files with a particular extension.

import os
import glob

files = glob.glob(r'path/*')
for items in files:
    os.remove(items)

Answer 18

I had to remove files from 3 separate folders inside a single parent directory:

directory
   folderA
      file1
   folderB
      file2
   folderC
      file3

This simple code did the trick for me: (I’m on Unix)

import os
import glob

folders = glob.glob('./path/to/parentdir/*')
for fo in folders:
  file = glob.glob(f'{fo}/*')
  for f in file:
    os.remove(f)

Hope this helps.

Answer 19

I resolved the issue with rmtree followed by makedirs by adding time.sleep() between them:

import os
import shutil
import time

if os.path.isdir(folder_location):
    shutil.rmtree(folder_location)

time.sleep(.5)

os.makedirs(folder_location, 0o777)
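A fixed half-second pause is a guess about how long the file system needs. One possible alternative, sketched here as an assumption rather than something this answer tested, is to retry the makedirs call a few times with a short delay. `recreate_dir`, `attempts` and `delay` are hypothetical names.

```python
import os
import shutil
import time


def recreate_dir(path, attempts=5, delay=0.1):
    """Remove `path` if it exists, then recreate it, retrying on OSError."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    for attempt in range(attempts):
        try:
            os.makedirs(path)
            return
        except OSError:
            if attempt == attempts - 1:
                raise          # give up after the last attempt
            time.sleep(delay)  # give the file system time to settle
```

On a local disk the first attempt normally succeeds, so the function only waits when the file system actually needs it.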

Answer 20

Answer for a limited, specific situation: assuming you want to delete the files while maintaining the subfolder tree, you could use a recursive algorithm:

import os

def recursively_remove_files(f):
    if os.path.isfile(f):
        os.unlink(f)
    elif os.path.isdir(f):
        for fi in os.listdir(f):
            recursively_remove_files(os.path.join(f, fi))

recursively_remove_files(my_directory)

Maybe slightly off-topic, but I think many would find it useful

Answer 21

Assuming temp_dir is the directory to be emptied, a single-line command using os would be:

_ = [os.remove(os.path.join(temp_dir, i)) for i in os.listdir(temp_dir)]

Note: This is only a one-liner for deleting files. It doesn’t delete directories.

Hope this helps. Thanks.

Answer 22

Use the method below to remove the contents of a directory, not the directory itself:

import os
import shutil

def remove_contents(path):
    for c in os.listdir(path):
        full_path = os.path.join(path, c)
        if os.path.isfile(full_path):
            os.remove(full_path)
        else:
            shutil.rmtree(full_path)

Answer 23

The easiest way to delete all files in a folder:

import os
files = os.listdir(yourFilePath)
for f in files:
    # os.path.join() adds the path separator if yourFilePath lacks one
    os.remove(os.path.join(yourFilePath, f))

Answer 24

This should do the trick, just using the os module to list and then remove!

import os
# os.listdir() returns the names in 'Folder'; os.path.join() builds each full path
for name in os.listdir('Folder'):
    os.remove(os.path.join('Folder', name))

Worked for me; if there are any problems, let me know!

Q&A

How do I get file creation and modification date/times in Python?

July 25, 2021, Python实用宝典

Question: How do I get file creation and modification date/times in Python?

I have a script that needs to do some stuff based on file creation & modification dates but has to run on Linux & Windows.

What’s the best cross-platform way to get file creation & modification date/times in Python?

Answer 0

Getting some sort of modification date in a cross-platform way is easy – just call os.path.getmtime(path) and you’ll get the Unix timestamp of when the file at path was last modified.

Getting file creation dates, on the other hand, is fiddly and platform-dependent, differing even between the three big OSes:

On Windows, a file’s ctime (documented at https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx) stores its creation date. You can access this in Python through os.path.getctime() or the .st_ctime attribute of the result of a call to os.stat(). This won’t work on Unix, where the ctime is the last time that the file’s attributes or content were changed.
On Mac, as well as some other Unix-based OSes, you can use the .st_birthtime attribute of the result of a call to os.stat().
On Linux, this is currently impossible, at least without writing a C extension for Python. Although some file systems commonly used with Linux do store creation dates (for example, ext4 stores them in st_crtime) , the Linux kernel offers no way of accessing them; in particular, the structs it returns from stat() calls in C, as of the latest kernel version, don’t contain any creation date fields. You can also see that the identifier st_crtime doesn’t currently feature anywhere in the Python source. At least if you’re on ext4, the data is attached to the inodes in the file system, but there’s no convenient way of accessing it.

The next-best thing on Linux is to access the file’s mtime, through either os.path.getmtime() or the .st_mtime attribute of an os.stat() result. This will give you the last time the file’s content was modified, which may be adequate for some use cases.

Putting this all together, cross-platform code should look something like this…

import os
import platform

def creation_date(path_to_file):
    """
    Try to get the date that a file was created, falling back to when it was
    last modified if that isn't possible.
    See http://stackoverflow.com/a/39501288/1709587 for explanation.
    """
    if platform.system() == 'Windows':
        return os.path.getctime(path_to_file)
    else:
        stat = os.stat(path_to_file)
        try:
            return stat.st_birthtime
        except AttributeError:
            # We're probably on Linux. No easy way to get creation dates here,
            # so we'll settle for when its content was last modified.
            return stat.st_mtime
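Since the fallback silently changes what the returned number means, a caller may want to know which field was used. The sketch below is a variation on the function above, not the answer's own code: `best_effort_birthtime` is a hypothetical name, and it checks for the st_birthtime attribute first instead of testing the platform name.

```python
import datetime
import os


def best_effort_birthtime(path):
    """Return (timestamp, source), where source names the stat field used."""
    st = os.stat(path)
    if hasattr(st, "st_birthtime"):
        # Available where the platform and Python version expose it.
        return st.st_birthtime, "st_birthtime"
    if os.name == "nt":
        # On Windows, st_ctime holds the creation time.
        return st.st_ctime, "st_ctime"
    # Otherwise settle for the last content modification time.
    return st.st_mtime, "st_mtime"


timestamp, source = best_effort_birthtime(os.getcwd())
when = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
print(source, when.isoformat())
```

If the source comes back as st_mtime, the value is a modification time and should not be presented to users as a creation date.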

Answer 1

You have a couple of choices. For one, you can use the os.path.getmtime and os.path.getctime functions:

import os.path, time
print("last modified: %s" % time.ctime(os.path.getmtime(file)))
print("created: %s" % time.ctime(os.path.getctime(file)))

Your other option is to use os.stat:

import os, time
(mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime) = os.stat(file)
print("last modified: %s" % time.ctime(mtime))

Note: ctime() does not refer to creation time on *nix systems, but rather the last time the inode data changed. (thanks to kojiro for making that fact more clear in the comments by providing a link to an interesting blog post)
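The difference between the two timestamps can be observed directly. This sketch changes only the permission bits of a temporary file and compares the stat results before and after. Whether the ctime comparison prints True depends on the platform and on the clock resolution of the file system, so no particular output is claimed for it.

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

before = os.stat(path)
# Changing only the permission bits is a metadata change, not a content change.
os.chmod(path, stat.S_IREAD)
after = os.stat(path)

print("mtime changed:", before.st_mtime_ns != after.st_mtime_ns)
print("ctime changed:", before.st_ctime_ns != after.st_ctime_ns)

# Restore write access so the file can be removed on every platform.
os.chmod(path, stat.S_IREAD | stat.S_IWRITE)
os.remove(path)
```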

Answer 2

The best function to use for this is os.path.getmtime(). Internally, this just uses os.stat(filename).st_mtime.

The datetime module is the best for manipulating timestamps, so you can get the modification date as a datetime object like this:

import os
import datetime
def modification_date(filename):
    t = os.path.getmtime(filename)
    return datetime.datetime.fromtimestamp(t)

Usage example:

>>> d = modification_date('/var/log/syslog')
>>> print(d)
2009-10-06 10:50:01
>>> print(repr(d))
datetime.datetime(2009, 10, 6, 10, 50, 1)

Answer 3

os.stat https://docs.python.org/2/library/stat.html#module-stat

Edit: In newer code you should probably use os.path.getmtime() (thanks, Christian Oudard),
but note that it returns a floating-point time_t value with fractional seconds (if your OS supports it).
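For code that should not depend on floating-point rounding, os.stat() also exposes the same instant as an integer number of nanoseconds in st_mtime_ns (Python 3.3 and later). A small sketch on a temporary file, with an arbitrary whole-second timestamp chosen for the example:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# Set access and modification times to a known whole-second value (in nanoseconds).
known_ns = 1_500_000_000 * 10**9
os.utime(path, ns=(known_ns, known_ns))

as_float = os.path.getmtime(path)      # seconds since the epoch, as a float
as_int_ns = os.stat(path).st_mtime_ns  # the same instant, as integer nanoseconds

print(as_float, as_int_ns)
os.remove(path)
```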

Answer 4

There are two methods to get the mod time, os.path.getmtime() or os.stat(), but the ctime is not reliable cross-platform (see below).

os.path.getmtime()

getmtime(path)
Return the time of last modification of path. The return value is a number giving the number of seconds since the epoch (see the time module). Raise os.error if the file does not exist or is inaccessible. New in version 1.5.2. Changed in version 2.3: If os.stat_float_times() returns True, the result is a floating point number.

os.stat()

stat(path)
Perform a stat() system call on the given path. The return value is an object whose attributes correspond to the members of the stat structure, namely: st_mode (protection bits), st_ino (inode number), st_dev (device), st_nlink (number of hard links), st_uid (user ID of owner), st_gid (group ID of owner), st_size (size of file, in bytes), st_atime (time of most recent access), st_mtime (time of most recent content modification), st_ctime (platform dependent; time of most recent metadata change on Unix, or the time of creation on Windows):

>>> import os
>>> statinfo = os.stat('somefile.txt')
>>> statinfo
(33188, 422511L, 769L, 1, 1032, 100, 926L, 1105022698,1105022732, 1105022732)
>>> statinfo.st_size
926L
>>>

In the above example you would use statinfo.st_mtime or statinfo.st_ctime to get the mtime and ctime, respectively.

Answer 5

In Python 3.4 and above, you can use the object oriented pathlib module interface which includes wrappers for much of the os module. Here is an example of getting the file stats.

>>> import pathlib
>>> fname = pathlib.Path('test.py')
>>> assert fname.exists(), f'No such file: {fname}'  # check that the file exists
>>> print(fname.stat())
os.stat_result(st_mode=33206, st_ino=5066549581564298, st_dev=573948050, st_nlink=1, st_uid=0, st_gid=0, st_size=413, st_atime=1523480272, st_mtime=1539787740, st_ctime=1523480272)

For more information about what os.stat_result contains, refer to the documentation. For the modification time you want fname.stat().st_mtime:

>>> import datetime
>>> mtime = datetime.datetime.fromtimestamp(fname.stat().st_mtime)
>>> print(mtime)
2018-10-17 10:49:00.249980

If you want the creation time on Windows, or the most recent metadata change on Unix, you would use fname.stat().st_ctime:

>>> ctime = datetime.datetime.fromtimestamp(fname.stat().st_ctime)
>>> print(ctime)
2018-04-11 16:57:52.151953

This article has more helpful info and examples for the pathlib module.
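The examples above produce naive datetimes in the local time zone. If the value is going to be stored or compared across machines, a timezone-aware result may be safer. A minimal sketch, where `modified_utc` is a hypothetical helper name:

```python
import datetime
import pathlib


def modified_utc(path):
    """Return the last-modification time of `path` as a timezone-aware UTC datetime."""
    timestamp = pathlib.Path(path).stat().st_mtime
    return datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
```

For instance, `modified_utc('test.py').isoformat()` would give an ISO 8601 string with an explicit +00:00 offset.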

Answer 6

os.stat returns a named tuple with st_mtime and st_ctime attributes. The modification time is st_mtime on both platforms; unfortunately, on Windows, ctime means “creation time”, whereas on POSIX it means “change time”. I’m not aware of any way to get the creation time on POSIX platforms.

Answer 7

import os, time, datetime

file = "somefile.txt"
print(file)

print("Modified")
print(os.stat(file)[-2])
print(os.stat(file).st_mtime)
print(os.path.getmtime(file))

print()

print("Created")
print(os.stat(file)[-1])
print(os.stat(file).st_ctime)
print(os.path.getctime(file))

print()

modified = os.path.getmtime(file)
print("Date modified: "+time.ctime(modified))
print("Date modified:",datetime.datetime.fromtimestamp(modified))
year,month,day,hour,minute,second=time.localtime(modified)[:-3]
print("Date modified: %02d/%02d/%d %02d:%02d:%02d"%(day,month,year,hour,minute,second))

print()

created = os.path.getctime(file)
print("Date created: "+time.ctime(created))
print("Date created:",datetime.datetime.fromtimestamp(created))
year,month,day,hour,minute,second=time.localtime(created)[:-3]
print("Date created: %02d/%02d/%d %02d:%02d:%02d"%(day,month,year,hour,minute,second))

prints

somefile.txt
Modified
1429613446
1429613446.0
1429613446.0

Created
1517491049
1517491049.28306
1517491049.28306

Date modified: Tue Apr 21 11:50:46 2015
Date modified: 2015-04-21 11:50:46
Date modified: 21/04/2015 11:50:46

Date created: Thu Feb  1 13:17:29 2018
Date created: 2018-02-01 13:17:29.283060
Date created: 01/02/2018 13:17:29

Answer 8

>>> import os
>>> os.stat('feedparser.py').st_mtime
1136961142.0
>>> os.stat('feedparser.py').st_ctime
1222664012.233
>>>

Answer 9

If you don't need to follow symbolic links, you can also use os.lstat, which reports on the link itself rather than on its target.

>>> os.lstat("2048.py")
posix.stat_result(st_mode=33188, st_ino=4172202, st_dev=16777218L, st_nlink=1, st_uid=501, st_gid=20, st_size=2078, st_atime=1423378041, st_mtime=1423377552, st_ctime=1423377553)
>>> os.lstat("2048.py").st_atime
1423378041.0

Answer 10

It may be worth taking a look at the crtime library, which implements cross-platform access to the file creation time.

from crtime import get_crtimes_in_dir

for fname, date in get_crtimes_in_dir(".", raise_on_error=True, as_epoch=False):
    print(fname, date)
    # file_a.py Mon Mar 18 20:51:18 CET 2019

Answer 11

os.stat does include the creation time. There’s just no definition of st_anything for the element of os.stat() that contains the time.

So try this:

os.stat('feedparser.py')[8]

Compare that with your create date on the file in ls -lah

They should be the same.

Answer 12

I was able to get creation time on posix by running the system’s stat command and parsing the output.

commands.getoutput('stat FILENAME').split('\"')[7]

Running stat outside of python from Terminal (OS X) returned:

805306374 3382786932 -rwx------ 1 km staff 0 1098083 "Aug 29 12:02:05 2013" "Aug 29 12:02:05 2013" "Aug 29 12:02:20 2013" "Aug 27 12:35:28 2013" 61440 2150 0 testfile.txt

… where the fourth datetime is the file creation (rather than ctime change time as other comments noted).

Q&A

How do I move a file?

July 25, 2021 Python实用宝典

Question: How do I move a file?

I looked into the Python os interface, but was unable to locate a method to move a file. How would I do the equivalent of $ mv ... in Python?

>>> source_files = '/PATH/TO/FOLDER/*'
>>> destination_folder = 'PATH/TO/FOLDER'
>>> # equivalent of $ mv source_files destination_folder

Answer 0

os.rename(), shutil.move(), or os.replace()

All employ the same syntax:

import os
import shutil

os.rename("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
shutil.move("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
os.replace("path/to/current/file.foo", "path/to/new/destination/for/file.foo")

Note that you must include the file name (file.foo) in both the source and destination arguments. If it is changed, the file will be renamed as well as moved.

Note also that in the first two cases the directory in which the new file is being created must already exist. On Windows, a file with that name must not exist or an exception will be raised, but os.replace() will silently replace a file even in that occurrence.

As has been noted in comments on other answers, shutil.move simply calls os.rename in most cases. However, if the destination is on a different disk than the source, it will instead copy and then delete the source file.
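To make the behaviour described in this answer concrete, here is a small self-contained sketch that works in a temporary directory. The directory and file names are placeholders chosen for illustration, not anything from the question.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
src_dir = os.path.join(base, "src")
dst_dir = os.path.join(base, "dst")
os.makedirs(src_dir)
os.makedirs(dst_dir)  # the destination directory has to exist beforehand

src = os.path.join(src_dir, "file.foo")
with open(src, "w") as fh:
    fh.write("new")

dst = os.path.join(dst_dir, "file.foo")
with open(dst, "w") as fh:
    fh.write("old")

# os.replace overwrites an existing destination file on every platform
os.replace(src, dst)
with open(dst) as fh:
    content = fh.read()

# shutil.move takes the same (source, destination) arguments
moved_back = shutil.move(dst, src)

still_in_dst = os.path.exists(dst)
shutil.rmtree(base)  # clean up the temporary tree
```

Both calls are given the full destination path including the file name, as the answer recommends.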

Answer 1

Although os.rename() and shutil.move() will both rename files, the command that is closest to the Unix mv command is shutil.move(). The difference is that os.rename() doesn’t work if the source and destination are on different disks, while shutil.move() doesn’t care what disk the files are on.

Answer 2

For either the os.rename or shutil.move you will need to import the module. No * character is necessary to get all the files moved.

We have a folder at /opt/awesome called source with one file named awesome.txt.

in /opt/awesome
○ → ls
source
○ → ls source
awesome.txt

python 
>>> source = '/opt/awesome/source'
>>> destination = '/opt/awesome/destination'
>>> import os
>>> os.rename(source, destination)
>>> os.listdir('/opt/awesome')
['destination']

We used os.listdir to see that the folder name in fact changed. Here is shutil.move moving the destination back to the source.

>>> import shutil
>>> shutil.move(destination, source)
>>> os.listdir('/opt/awesome/source')
['awesome.txt']

This time I checked inside the source folder to be sure the awesome.txt file I created exists. It is there :)

Now we have moved a folder and its files from a source to a destination and back again.

Answer 3

Since Python 3.4, you can also use pathlib's Path class to move a file.

from pathlib import Path

Path("path/to/current/file.foo").rename("path/to/new/destination/for/file.foo")

https://docs.python.org/3.4/library/pathlib.html#pathlib.Path.rename
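A minimal runnable sketch of the same idea, assuming a throwaway temporary directory; the file and folder names are illustrative only. It also shows that the target directory has to be created first, since rename does not do that for you.

```python
import shutil
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())
source = base / "file.foo"
source.write_text("data")

target_dir = base / "new" / "destination"
target_dir.mkdir(parents=True)  # rename does not create missing directories
target = target_dir / "file.foo"

source.rename(target)

moved = target.exists() and not source.exists()
shutil.rmtree(base)  # clean up the temporary tree
```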

Answer 4

This is what I’m using at the moment:

import os, shutil
path = "/volume1/Users/Transfer/"
moveto = "/volume1/Users/Drive_Transfer/"
files = os.listdir(path)
files.sort()
for f in files:
    src = path+f
    dst = moveto+f
    shutil.move(src,dst)

Now fully functional. Hope this helps you.

Edit:

I've turned this into a function that accepts a source and a destination directory, creates the destination folder if it doesn't exist, and moves the files. It also allows filtering of the source files: for example, if you only want to move images, use the pattern '*.jpg'. By default, it moves everything in the directory.

import os, shutil, pathlib, fnmatch

def move_dir(src: str, dst: str, pattern: str = '*'):
    if not os.path.isdir(dst):
        pathlib.Path(dst).mkdir(parents=True, exist_ok=True)
    for f in fnmatch.filter(os.listdir(src), pattern):
        shutil.move(os.path.join(src, f), os.path.join(dst, f))

Answer 5

The accepted answer is not the right one, because the question is not about renaming a file into a file, but about moving many files into a directory. shutil.move will do the work, but for this purpose os.rename is useless (as stated in the comments) because the destination must have an explicit file name.
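One possible way to move many files into a directory, sketched with glob and shutil.move. The helper name move_all and the sample file names are made up for this illustration; the sketch relies on shutil.move placing the source inside the destination when the destination is an existing directory.

```python
import glob
import os
import shutil
import tempfile


def move_all(pattern, destination_folder):
    """Move every path matching the glob pattern into destination_folder."""
    os.makedirs(destination_folder, exist_ok=True)
    moved = []
    for path in glob.glob(pattern):
        # destination is an existing directory, so the source lands inside it
        moved.append(shutil.move(path, destination_folder))
    return sorted(moved)


# Demonstration in a temporary directory
base = tempfile.mkdtemp()
src = os.path.join(base, "src")
dst = os.path.join(base, "dst")
os.makedirs(src)
for name in ("a.txt", "b.txt"):
    open(os.path.join(src, name), "w").close()

result = move_all(os.path.join(src, "*"), dst)
names = [os.path.basename(p) for p in result]
remaining = os.listdir(src)
shutil.rmtree(base)
```

This mirrors mv source_files destination_folder from the question without needing a file name in the destination argument.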

Answer 6

Based on the answer described here, using subprocess is another option.

Something like this:

subprocess.call("mv %s %s" % (source_files, destination_folder), shell=True)

I am curious to know the pros and cons of this method compared to shutil. Since in my case I am already using subprocess for other reasons and it seems to work, I am inclined to stick with it.

Is it system dependent maybe?

Answer 7

This is a solution that uses mv without enabling the shell.

import subprocess

source      = 'pathToCurrent/file.foo'
destination = 'pathToNew/file.foo'

p = subprocess.Popen(['mv', source, destination], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
res = p.communicate()[0].decode('utf-8').strip()

if p.returncode:
    print('ERROR: ' + res)

Answer 8

import os, shutil

current_path = ""  # source path
new_path = ""      # destination path

os.chdir(current_path)

for f in os.listdir():
    # same disk: a plain rename is enough
    os.rename(f, os.path.join(new_path, f))
    # different disk (e.g. C: -> D:): use shutil.move instead
    # shutil.move(f, os.path.join(new_path, f))

Q&A

How do I remove/delete a folder that is not empty?

July 25, 2021 Python实用宝典

Question: How do I remove/delete a folder that is not empty?

I am getting an ‘access is denied’ error when I attempt to delete a folder that is not empty. I used the following command in my attempt: os.remove("/folder_name").

What is the most effective way of removing/deleting a folder/directory that is not empty?

Answer 0

import shutil

shutil.rmtree('/folder_name')

Standard Library Reference: shutil.rmtree.

By design, rmtree fails on folder trees containing read-only files. If you want the folder to be deleted regardless of whether it contains read-only files, then use

shutil.rmtree('/folder_name', ignore_errors=True)

Answer 1

From the Python docs on os.walk():

# Delete everything reachable from the directory named in 'top',
# assuming there are no symbolic links.
# CAUTION:  This is dangerous!  For example, if top == '/', it
# could delete all your disk files.
import os
for root, dirs, files in os.walk(top, topdown=False):
    for name in files:
        os.remove(os.path.join(root, name))
    for name in dirs:
        os.rmdir(os.path.join(root, name))

Answer 2

import shutil
shutil.rmtree(dest, ignore_errors=True)

Answer 3

From Python 3.4 you may use:

import pathlib

def delete_folder(pth):
    for sub in pth.iterdir():
        if sub.is_dir():
            delete_folder(sub)
        else:
            sub.unlink()
    pth.rmdir()  # if you just want to delete the directory's contents, remove this line

where pth is a pathlib.Path instance. Nice, but may not be the fastest.

Answer 4

From docs.python.org:

This example shows how to remove a directory tree on Windows where some of the files have their read-only bit set. It uses the onerror callback to clear the readonly bit and reattempt the remove. Any subsequent failure will propagate.

import os, stat
import shutil

def remove_readonly(func, path, _):
    "Clear the readonly bit and reattempt the removal"
    os.chmod(path, stat.S_IWRITE)
    func(path)

shutil.rmtree(directory, onerror=remove_readonly)
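The snippet above leaves directory undefined, so here is a self-contained variation that can be run as-is. The wrapper force_rmtree and the file name locked.txt are hypothetical. The version check is an addition of this sketch: Python 3.12 introduced the onexc parameter and deprecated onerror, and the handler works with either because it ignores its third argument.

```python
import os
import shutil
import stat
import sys
import tempfile


def remove_readonly(func, path, _):
    """Clear the read-only bit and retry the operation that failed."""
    os.chmod(path, stat.S_IWRITE)
    func(path)


def force_rmtree(directory):
    # Python 3.12 added onexc and deprecated onerror; both pass three arguments
    if sys.version_info >= (3, 12):
        shutil.rmtree(directory, onexc=remove_readonly)
    else:
        shutil.rmtree(directory, onerror=remove_readonly)


# Demonstration: a temporary directory holding one read-only file
directory = tempfile.mkdtemp()
locked = os.path.join(directory, "locked.txt")
with open(locked, "w") as fh:
    fh.write("x")
os.chmod(locked, stat.S_IREAD)  # mark the file read-only

force_rmtree(directory)
gone = not os.path.exists(directory)
```

On POSIX systems a read-only file inside a writable directory can usually be deleted anyway, so the handler may never be called there; the case this answer targets is Windows.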

Answer 5

import errno
import os
import stat
import shutil

def errorRemoveReadonly(func, path, exc):
    excvalue = exc[1]
    # Python 3's rmtree reports os.unlink rather than os.remove for files
    if func in (os.rmdir, os.remove, os.unlink) and excvalue.errno == errno.EACCES:
        # change the file to be readable, writable, executable: 0777
        os.chmod(path, stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO)
        # retry
        func(path)
    else:
        # anything else is re-raised
        raise excvalue

shutil.rmtree(path, ignore_errors=False, onerror=errorRemoveReadonly)

If ignore_errors is set, errors are ignored; otherwise, if onerror is set, it is called to handle the error with arguments (func, path, exc_info) where func is os.listdir, os.remove, or os.rmdir; path is the argument to that function that caused it to fail; and exc_info is a tuple returned by sys.exc_info(). If ignore_errors is false and onerror is None, an exception is raised.

Answer 6

Based on kkubasik's answer, check whether the folder exists before removing it, which is more robust:

import os
import shutil

def remove_folder(path):
    # check if folder exists
    if os.path.exists(path):
        # remove if exists
        shutil.rmtree(path)
    else:
        # throw your own exception here to handle this special scenario
        raise FileNotFoundError(path)

remove_folder("/folder_name")

Answer 7

If you are sure that you want to delete the entire directory tree and are no longer interested in its contents, then crawling the whole tree is pointless; just call the native OS command from Python to do that. It will be faster, more efficient, and less memory-consuming.

RMDIR c:\blah /s /q

or *nix

rm -rf /home/whatever

In Python, the code will look like this:

import os
import subprocess
import sys

mswindows = (sys.platform == "win32")

def getstatusoutput(cmd):
    """Return (status, output) of executing cmd in a shell."""
    if not mswindows:
        # the Python 2 commands module became subprocess.getstatusoutput in Python 3
        return subprocess.getstatusoutput(cmd)
    pipe = os.popen(cmd + ' 2>&1', 'r')
    text = pipe.read()
    sts = pipe.close()
    if sts is None: sts = 0
    if text[-1:] == '\n': text = text[:-1]
    return sts, text


def deleteDir(path):
    """deletes the path entirely"""
    if mswindows: 
        cmd = "RMDIR "+ path +" /s /q"
    else:
        cmd = "rm -rf "+path
    result = getstatusoutput(cmd)
    if(result[0]!=0):
        raise RuntimeError(result[1])

Answer 8

Just some Python 3.5 options to complete the answers above. (I would have loved to find them here.)

import os
import shutil
from send2trash import send2trash # (shutil delete permanently)

Delete folder if empty

root = r"C:\Users\Me\Desktop\test"   
for dir, subdirs, files in os.walk(root):   
    if subdirs == [] and files == []:
           send2trash(dir)
           print(dir, ": folder removed")

Also delete the folder if it contains only this file

    elif subdirs == [] and len(files) == 1: # if contains no sub folder and only 1 file
        if files[0] == "desktop.ini":
            send2trash(dir)
            print(dir, ": folder removed")
        else:
            print(dir)

Delete the folder if it contains only .srt or .txt file(s)

    elif subdirs == []: # if dir doesn't contain a subdirectory
        ext = (".srt", ".txt")
        contains_other_ext = False
        for file in files:
            if not file.endswith(ext):
                contains_other_ext = True
        if not contains_other_ext:
            send2trash(dir)
            print(dir, ": dir deleted")

Delete folder if its size is less than 400kb :

def get_tree_size(path):
    """Return total size of files in given path and subdirs."""
    total = 0
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            total += get_tree_size(entry.path)
        else:
            total += entry.stat(follow_symlinks=False).st_size
    return total


for dir, subdirs, files in os.walk(root):
    if get_tree_size(dir) < 400000:  # ≈ 400kb
        send2trash(dir)
        print(dir, "dir deleted")

Answer 9

I’d like to add a “pure pathlib” approach:

from pathlib import Path
from typing import Union

def del_dir(target: Union[Path, str], only_if_empty: bool = False):
    target = Path(target).expanduser()
    assert target.is_dir()
    for p in sorted(target.glob('**/*'), reverse=True):
        if not p.exists():
            continue
        p.chmod(0o666)
        if p.is_dir():
            p.rmdir()
        else:
            if only_if_empty:
                raise RuntimeError(f'{p.parent} is not empty!')
            p.unlink()
    target.rmdir()

This relies on the fact that Path is orderable, and longer paths will always sort after shorter paths, just like str. Therefore, directories will come before files. If we reverse the sort, files will then come before their respective containers, so we can simply unlink/rmdir them one by one with one pass.

Benefits:

It’s NOT relying on external binaries: everything uses Python’s batteries-included modules (Python >= 3.6)
It’s fast and memory-efficient: No recursion stack, no need to start a subprocess
It’s cross-platform (at least, that’s what pathlib promises in Python 3.6; no operation above stated to not run on Windows)
If needed, one can do a very granular logging, e.g., log each deletion as it happens.
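The ordering argument behind this approach can be checked without touching the disk. The sketch below uses PurePosixPath and made-up path names purely for illustration; it only demonstrates that a reverse sort lists every entry before the directory that contains it.

```python
from pathlib import PurePosixPath

# Hypothetical tree: top/a/b/c.txt and top/d.txt
paths = [PurePosixPath(p) for p in ("top/a", "top/a/b", "top/a/b/c.txt", "top/d.txt")]

# Reverse sort: deeper entries come out before their parent directories
ordered = sorted(paths, reverse=True)
order = [str(p) for p in ordered]

# Every path appears earlier than its parent whenever the parent is in the list
children_first = all(
    ordered.index(p) < ordered.index(p.parent)
    for p in ordered
    if p.parent in ordered
)
```

That property is what allows the single pass of unlink and rmdir calls in del_dir.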

Answer 10

import os

def deleteDir(dirPath):
    deleteFiles = []
    deleteDirs = []
    # walk bottom-up so that subdirectories are listed before their parents
    for root, dirs, files in os.walk(dirPath, topdown=False):
        for f in files:
            deleteFiles.append(os.path.join(root, f))
        for d in dirs:
            deleteDirs.append(os.path.join(root, d))
    for f in deleteFiles:
        os.remove(f)
    for d in deleteDirs:
        os.rmdir(d)
    os.rmdir(dirPath)

Answer 11

If you don’t want to use the shutil module you can just use the os module.

from os import listdir, rmdir, remove
from os.path import join
for i in listdir(directoryToRemove):
    remove(join(directoryToRemove, i))
rmdir(directoryToRemove) # Now the directory is empty of files

Answer 12

Ten years later and using Python 3.7 and Linux there are still different ways to do this:

import subprocess
from pathlib import Path

#using pathlib.Path
path = Path('/path/to/your/dir')
subprocess.run(["rm", "-rf", str(path)])

#using strings
path = "/path/to/your/dir"
subprocess.run(["rm", "-rf", path])

Essentially, it uses Python's subprocess module to run the shell command rm -rf '/path/to/your/dir' as if you were using the terminal to accomplish the same task. It's not fully Python, but it gets it done.

The reason I included the pathlib.Path example is that, in my experience, it's very useful when dealing with many paths that change. The extra steps of importing pathlib.Path and converting the end results to strings often cost me less development time. It would be convenient if Path.rmdir() came with an argument to explicitly handle non-empty dirs.

Answer 13

To delete a folder even if it might not exist (avoiding the race condition in Charles Chow’s answer) but still have errors when other things go wrong (e.g. permission problems, disk read error, the file isn’t a directory)

For Python 3.x:

import shutil

def ignore_absent_file(func, path, exc_inf):
    except_instance = exc_inf[1]
    if isinstance(except_instance, FileNotFoundError):
        return
    raise except_instance

shutil.rmtree(dir_to_delete, onerror=ignore_absent_file)

The Python 2.7 code is almost the same:

import shutil
import errno

def ignore_absent_file(func, path, exc_inf):
    except_instance = exc_inf[1]
    if isinstance(except_instance, OSError) and \
        except_instance.errno == errno.ENOENT:
        return
    raise except_instance

shutil.rmtree(dir_to_delete, onerror=ignore_absent_file)
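For the simplest case, where only the top-level folder might be missing, a plain try/except around rmtree is a possible alternative to the callback. Note that this is a different technique from the handler above, and the helper name rmtree_if_present is made up for this sketch.

```python
import os
import shutil
import tempfile


def rmtree_if_present(path):
    """Remove a tree; a missing path is not an error, anything else still raises."""
    try:
        shutil.rmtree(path)
    except FileNotFoundError:
        pass


# Demonstration: once on a real directory, once on a path that no longer exists
existing = tempfile.mkdtemp()
open(os.path.join(existing, "file.txt"), "w").close()

rmtree_if_present(existing)
removed = not os.path.exists(existing)

rmtree_if_present(existing)  # second call: the directory is already gone
second_call_ok = True
```

Unlike checking os.path.exists first, there is no window between the check and the deletion. The callback version above additionally tolerates entries that vanish while the tree is being walked.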

Answer 14

With os.walk I would propose the solution which consists of 3 one-liner Python calls:

python -c "import sys; import os; [os.chmod(os.path.join(rs,d), 0o777) for rs,ds,fs in os.walk(_path_) for d in ds]"
python -c "import sys; import os; [os.chmod(os.path.join(rs,f), 0o777) for rs,ds,fs in os.walk(_path_) for f in fs]"
python -c "import os; import shutil; shutil.rmtree(_path_, ignore_errors=False)"

The first script chmod’s all sub-directories, the second script chmod’s all files. Then the third script removes everything with no impediments.

I have tested this from the "Shell Script" step in a Jenkins job (I did not want to store a new Python script in SCM, which is why I searched for a one-line solution), and it worked on Linux and Windows.

Answer 15

You can use os.system command for simplicity:

import os
os.system("rm -rf dirname")

Obviously, this invokes the system shell to accomplish the task.

Answer 16

I have found a very easy way to delete any folder (even a non-empty one) or file on Windows.

os.system(r'powershell.exe rmdir -r D:\workspace\Branches\*%s* -Force' % CANDIDATE_BRANCH)

Answer 17

For Windows, if directory is not empty, and you have read-only files or you get errors like

Access is denied
The process cannot access the file because it is being used by another process

Try this, os.system('rmdir /S /Q "{}"'.format(directory))

It's the equivalent of rm -rf on Linux/Mac.

Q&A

How do I check file size in Python?

July 25, 2021 Python实用宝典

Question: How do I check file size in Python?

I am writing a Python script in Windows. I want to do something based on the file size. For example, if the size is greater than 0, I will send an email to somebody, otherwise continue to other things.

How do I check the file size?

Answer 0

You need the st_size property of the object returned by os.stat. You can get it by either using pathlib (Python 3.4+):

>>> from pathlib import Path
>>> Path('somefile.txt').stat()
os.stat_result(st_mode=33188, st_ino=6419862, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=1564, st_atime=1584299303, st_mtime=1584299400, st_ctime=1584299400)
>>> Path('somefile.txt').stat().st_size
1564

or using os.stat:

>>> import os
>>> os.stat('somefile.txt')
os.stat_result(st_mode=33188, st_ino=6419862, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=1564, st_atime=1584299303, st_mtime=1584299400, st_ctime=1584299400)
>>> os.stat('somefile.txt').st_size
1564

Output is in bytes.

Answer 1

Using os.path.getsize:

>>> import os
>>> b = os.path.getsize("/path/isa_005.mp3")
>>> b
2071611

The output is in bytes.

Answer 2

The other answers work for real files, but if you need something that works for “file-like objects”, try this:

# f is a file-like object. 
f.seek(0, os.SEEK_END)
size = f.tell()

It works for real files and StringIO’s, in my limited testing. (Python 2.7.3.) The “file-like object” API isn’t really a rigorous interface, of course, but the API documentation suggests that file-like objects should support seek() and tell().

Edit

Another difference between this and os.stat() is that you can stat() a file even if you don’t have permission to read it. Obviously the seek/tell approach won’t work unless you have read permission.

Edit 2

At Jonathon’s suggestion, here’s a paranoid version. (The version above leaves the file pointer at the end of the file, so if you were to try to read from the file, you’d get zero bytes back!)

# f is a file-like object. 
old_file_position = f.tell()
f.seek(0, os.SEEK_END)
size = f.tell()
f.seek(old_file_position, os.SEEK_SET)
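The paranoid version can be wrapped in a small function. This is only a sketch, assuming a seekable binary stream (for text-mode files tell() returns an opaque value); the name stream_size is made up for the example.

```python
import io
import os


def stream_size(f):
    """Return the size of a seekable file-like object without moving its position."""
    old_position = f.tell()
    try:
        f.seek(0, os.SEEK_END)
        return f.tell()
    finally:
        # Restore the position even if seek() or tell() raised.
        f.seek(old_position, os.SEEK_SET)


buf = io.BytesIO(b"hello world")
buf.read(5)                 # move the position away from the start
print(stream_size(buf))     # 11
print(buf.read())           # b' world': reading continues where it left off
```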

Answer 3

import os


def convert_bytes(num):
    """
    Convert a size in bytes to a human-readable string (KB, MB, GB, ...).
    """
    for unit in ['bytes', 'KB', 'MB', 'GB', 'TB']:
        if num < 1024.0:
            return "%3.1f %s" % (num, unit)
        num /= 1024.0
    # Anything of 1024 TB or more is reported in petabytes.
    return "%3.1f %s" % (num, 'PB')


def file_size(file_path):
    """
    Return the human-readable size of a file, or None if the path is not a regular file.
    """
    if os.path.isfile(file_path):
        file_info = os.stat(file_path)
        return convert_bytes(file_info.st_size)


# Let's check the file size of the MS Paint executable,
# or you can use any file path
file_path = r"C:\Windows\System32\mspaint.exe"
print(file_size(file_path))

Result:

6.1 MB

Answer 4

Using pathlib (added in Python 3.4 or a backport available on PyPI):

from pathlib import Path
file = Path() / 'doc.txt'  # or Path('./doc.txt')
size = file.stat().st_size

This is really only an interface around os.stat, but using pathlib provides an easy way to access other file related operations.

Answer 5

There is a bit-shift trick I use when I want to convert from bytes to any other unit. A right shift by 10 divides the value by 1024 (2**10) and drops the remainder, so each shift by 10 moves it up one unit.

Example: 5 GB is 5368709120 bytes

print (5368709120 >> 10)  # 5242880 kilobytes (kB)
print (5368709120 >> 20 ) # 5120 megabytes (MB)
print (5368709120 >> 30 ) # 5 gigabytes (GB)
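One thing to keep in mind with the shift is that it truncates. A tiny sketch with an arbitrary example size that is not a whole number of kilobytes:

```python
size = 1536  # bytes, i.e. one and a half units of 1024 bytes

print(size >> 10)     # 1: the shift drops the remainder
print(size / 1024)    # 1.5: true division keeps the fraction
print(size // 1024)   # 1: floor division matches the shift for non-negative ints
```

So the shift is fine for rough "at least N units" checks, but use division when the fraction matters.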

Answer 6

Sticking strictly to the question, the Python code (plus pseudo-code) would be:

import os
file_path = r"<path to your file>"
if os.stat(file_path).st_size > 0:
    pass  # <send an email to somebody>
else:
    pass  # <continue to other things>

Answer 7

# Get the file size, print it, process it...
# os.stat provides the file size in the st_size attribute.
# The file size is given in bytes.

import os

fsize = os.stat('filepath')
print('size: ' + str(fsize.st_size))

# Check whether the file size is less than 10 MB

if fsize.st_size < 10000000:
    pass  # process it ...
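Note that os.stat raises an exception when the file does not exist. If that can happen in your script, something along these lines may help. It is only a sketch; size_or_none is a hypothetical helper name and the print calls stand in for the real processing.

```python
import os


def size_or_none(path):
    """Return the size of path in bytes, or None if the file does not exist."""
    try:
        return os.stat(path).st_size
    except FileNotFoundError:
        return None


size = size_or_none("filepath")  # "filepath" is the placeholder path from the answer
if size is None:
    print("file not found")
elif size < 10 * 1000 * 1000:     # less than 10 MB (decimal megabytes)
    print("small enough to process: %d bytes" % size)
else:
    print("too large: %d bytes" % size)
```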

Answer 8

We have two options. Both involve importing the os module.

1) os.stat() returns an object with many fields, including timestamps such as the last modification time. Among them, the st_size attribute (an attribute, not a method) gives the size of the file in bytes.

os.stat("filename").st_size

2) os.path.getsize() takes the path of the file, which may be absolute or relative to the current working directory.

os.path.getsize("path of file")

Knowledge Q&A

Writing a list to a file with Python

July 25, 2021 Python实用宝典

Question: Writing a list to a file with Python

Is this the cleanest way to write a list to a file, since writelines() doesn’t insert newline characters?

file.writelines(["%s\n" % item  for item in list])

It seems like there would be a standard way…

Answer 0

You can use a loop:

with open('your_file.txt', 'w') as f:
    for item in my_list:
        f.write("%s\n" % item)

In Python 2, you can also use

with open('your_file.txt', 'w') as f:
    for item in my_list:
        print >> f, item

If you’re keen on a single function call, at least remove the square brackets [], so that the strings to be printed get made one at a time (a genexp rather than a listcomp) — no reason to take up all the memory required to materialize the whole list of strings.
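For the single-call variant, this is roughly what it looks like with a generator expression. The list contents and the temporary path are made up for the example.

```python
import os
import tempfile

my_list = [1, "two", 3.0]  # example data; items need not be strings

path = os.path.join(tempfile.mkdtemp(), "your_file.txt")
with open(path, "w") as f:
    # No square brackets: the lines are produced one at a time.
    f.writelines("%s\n" % item for item in my_list)

with open(path) as f:
    print(f.read().splitlines())  # ['1', 'two', '3.0']
```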

Answer 1

What are you going to do with the file? Does this file exist for humans, or other programs with clear interoperability requirements?

If you are just trying to serialize a list to disk for later use by the same Python app, you should be pickling the list.

import pickle

with open('outfile', 'wb') as fp:
    pickle.dump(itemlist, fp)

To read it back:

with open ('outfile', 'rb') as fp:
    itemlist = pickle.load(fp)

Answer 2

The simpler way is:

with open("outfile", "w") as outfile:
    outfile.write("\n".join(itemlist))

You can make sure that every item in the list is a string by using a generator expression:

with open("outfile", "w") as outfile:
    outfile.write("\n".join(str(item) for item in itemlist))

Remember that the whole joined string has to be built in memory before it is written, so watch the memory consumption for large lists.
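One detail worth noting: str.join only puts the separator between items, so a file written this way does not end with a newline. If you want the trailing newline, a sketch like this works (example data and a temporary path, both made up here):

```python
import os
import tempfile

itemlist = ["alpha", "beta", "gamma"]  # example data

path = os.path.join(tempfile.mkdtemp(), "outfile")
with open(path, "w") as outfile:
    if itemlist:
        # Add the final newline that join() leaves out.
        outfile.write("\n".join(str(item) for item in itemlist) + "\n")

with open(path) as infile:
    restored = infile.read().splitlines()

print(restored == itemlist)  # True
```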

Answer 3

Using Python 3 and Python 2.6+ syntax:

with open(filepath, 'w') as file_handler:
    for item in the_list:
        file_handler.write("{}\n".format(item))

This is platform-independent. It also terminates the final line with a newline character, which is a UNIX best practice.

Starting with Python 3.6, "{}\n".format(item) can be replaced with an f-string: f"{item}\n".

Answer 4

Yet another way. Serialize to json using simplejson (included as json in python 2.6):

>>> import simplejson
>>> f = open('output.txt', 'w')
>>> simplejson.dump([1,2,3,4], f)
>>> f.close()

If you examine output.txt:

[1, 2, 3, 4]

This is useful because the syntax is pythonic, it’s human readable, and it can be read by other programs in other languages.

Answer 5

I thought it would be interesting to explore the benefits of using a genexp, so here’s my take.

The example in the question uses square brackets to create a temporary list, and so is equivalent to:

file.writelines( list( "%s\n" % item for item in list ) )

This needlessly constructs a temporary list of all the lines that will be written out, which may consume a significant amount of memory depending on the size of your list and how verbose the output of str(item) is.

Dropping the square brackets (equivalent to removing the wrapping list() call above) instead passes a temporary generator to file.writelines():

file.writelines( "%s\n" % item for item in list )

This generator will create newline-terminated representation of your item objects on-demand (i.e. as they are written out). This is nice for a couple of reasons:

Memory overheads are small, even for very large lists
If str(item) is slow there’s visible progress in the file as each item is processed

This avoids memory issues, such as:

In [1]: import os

In [2]: f = file(os.devnull, "w")

In [3]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
1 loops, best of 3: 385 ms per loop

In [4]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
ERROR: Internal Python error in the inspect module.
Below is the traceback from this internal error.

Traceback (most recent call last):
...
MemoryError

(I triggered this error by limiting Python’s max. virtual memory to ~100MB with ulimit -v 102400).

Putting memory usage to one side, this method isn’t actually any faster than the original:

In [4]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
1 loops, best of 3: 370 ms per loop

In [5]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
1 loops, best of 3: 360 ms per loop

(Python 2.6.2 on Linux)

Answer 6

Because I’m lazy…

import json
a = [1,2,3]
with open('test.txt', 'w') as f:
    f.write(json.dumps(a))

#Now read the file back into a Python list object
with open('test.txt', 'r') as f:
    a = json.loads(f.read())

Answer 7

Serialize the list into a text file with comma-separated values:

mylist = dir()
with open('filename.txt','w') as f:
    f.write( ','.join( mylist ) )
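A plain ','.join breaks down as soon as an item itself contains a comma. If that can happen, the csv module from the standard library handles the quoting for you. A minimal sketch with made-up data and a temporary path:

```python
import csv
import os
import tempfile

mylist = ["plain", "has,comma", 'has "quotes"']  # example data

path = os.path.join(tempfile.mkdtemp(), "filename.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerow(mylist)   # fields are quoted only where needed

with open(path, newline="") as f:
    restored = next(csv.reader(f))

print(restored == mylist)  # True
```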

Answer 8

In General

Following is the syntax for writelines() method

fileObject.writelines( sequence )

Example

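#!/usr/bin/python

# Open a file for appending ("rw+" is not a valid mode in Python 3)
fo = open("foo.txt", "a")
seq = ["This is 6th line\n", "This is 7th line"]

# Write the sequence of lines at the end of the file.
# writelines() returns None, so there is nothing useful to assign.
fo.writelines(seq)

# Close the opened file
fo.close()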

Reference

http://www.tutorialspoint.com/python/file_writelines.htm

Answer 9

file.write('\n'.join(list))

Answer 10

with open("test.txt", "w") as fp:
    for line in list12:
        fp.write(line + "\n")

Answer 11

You can also use the print function if you’re on Python 3, as follows. The file must be opened in text mode ("w"), because print writes str, not bytes.

with open("myfile.txt", "w") as f:
    print(mylist, file=f)

Answer 12

Why don’t you try

file.write(str(list))

Answer 13

This logic first converts each item in the list to a string with str(). Sometimes the list contains tuples, such as

alist = [(i12,tiger),
(113,lion)]

This logic writes each tuple to the file on its own line. Later, when reading the file, we can use eval to load each tuple:

outfile = open('outfile.txt', 'w') # open a file in write mode
for item in list_to_persistence:    # iterate over the list items
   outfile.write(str(item) + '\n') # write to the file
outfile.close()   # close the file
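For the reading side, ast.literal_eval is a safer choice than eval, since it only accepts Python literals and will not execute arbitrary code found in the file. A sketch of the round trip, with made-up data and a temporary path (repr is used instead of str; for tuples of numbers and strings the two give the same text):

```python
import ast
import os
import tempfile

list_to_persist = [(112, "tiger"), (113, "lion")]  # example data

path = os.path.join(tempfile.mkdtemp(), "outfile.txt")
with open(path, "w") as outfile:
    for item in list_to_persist:
        outfile.write(repr(item) + "\n")

with open(path) as infile:
    # literal_eval parses Python literals only; it does not run arbitrary code.
    restored = [ast.literal_eval(line.strip()) for line in infile]

print(restored == list_to_persist)  # True
```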

Answer 14

Another way of iterating and adding newline:

for item in items:
    filewriter.write(f"{item}" + "\n")

Answer 15

In Python 3 you can use print with * for argument unpacking:

with open("fout.txt", "w") as fout:
    print(*my_list, sep="\n", file=fout)

Answer 16

In Python 3 you can use this loop:

with open('your_file.txt', 'w') as f:
    for item in my_list:
        print(item, file=f)

Answer 17

Let avg be the list, then (with numpy imported as n):

In [29]: a = n.array((avg))
In [31]: a.tofile('avgpoints.dat',sep='\n',format='%f')

You can use %e or %s depending on your requirement.

Answer 18

poem = '''\
Programming is fun
When the work is done
if you wanna make your work also fun:
use Python!
'''
f = open('poem.txt', 'w') # open for 'w'riting
f.write(poem) # write text to file
f.close() # close the file

How It Works: First, open a file by using the built-in open function and specifying the name of the file and the mode in which we want to open the file. The mode can be a read mode ('r'), write mode ('w') or append mode ('a'). We can also specify whether we are reading, writing, or appending in text mode ('t') or binary mode ('b'). There are actually many more modes available and help(open) will give you more details about them. By default, open() considers the file to be a 't'ext file and opens it in 'r'ead mode. In our example, we first open the file in write text mode and use the write method of the file object to write to the file and then we finally close the file.

The above example is from the book “A Byte of Python” by Swaroop C H. swaroopch.com

Knowledge Q&A

How do I copy a file in Python?

July 24, 2021 Python实用宝典

Question: How do I copy a file in Python?

How do I copy a file in Python?

I couldn’t find anything under os.

Answer 0

shutil has many methods you can use. One of which is:

from shutil import copyfile
copyfile(src, dst)

Copy the contents of the file named src to a file named dst.
The destination location must be writable; otherwise, an IOError exception will be raised.
If dst already exists, it will be replaced.
Special files such as character or block devices and pipes cannot be copied with this function.
With copy, src and dst are path names given as strings.

If you use os.path operations, use copy rather than copyfile. copyfile will only accept strings.
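For what it's worth, on recent Python 3 versions copyfile also accepts path-like objects such as pathlib.Path, not just strings. A small sketch using temporary files (the file names are made up), which also shows the SameFileError raised when source and destination are the same file:

```python
import shutil
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
src = workdir / "source.txt"
dst = workdir / "destination.txt"
src.write_text("hello\n")

# Assumes a recent Python 3, where copyfile accepts path-like objects.
shutil.copyfile(src, dst)
print(dst.read_text() == src.read_text())  # True

try:
    shutil.copyfile(src, src)      # copying a file onto itself
except shutil.SameFileError as exc:
    print("refused:", exc)
```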

Answer 1

┌──────────────────┬────────┬───────────┬───────┬────────────────┐
│     Function     │ Copies │   Copies  │Can use│   Destination  │
│                  │metadata│permissions│buffer │may be directory│
├──────────────────┼────────┼───────────┼───────┼────────────────┤
│shutil.copy       │   No   │    Yes    │   No  │      Yes       │
│shutil.copyfile   │   No   │     No    │   No  │       No       │
│shutil.copy2      │  Yes   │    Yes    │   No  │      Yes       │
│shutil.copyfileobj│   No   │     No    │  Yes  │       No       │
└──────────────────┴────────┴───────────┴───────┴────────────────┘

Answer 2

copy2(src,dst) is often more useful than copyfile(src,dst) because:

it allows dst to be a directory (instead of the complete target filename), in which case the basename of src is used for creating the new file;
it preserves the original modification and access info (mtime and atime) in the file metadata (however, this comes with a slight overhead).

Here is a short example:

import shutil
shutil.copy2('/src/dir/file.ext', '/dst/dir/newname.ext') # complete target filename given
shutil.copy2('/src/file.ext', '/dst/dir') # target filename is /dst/dir/file.ext
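The difference in timestamps is easy to see for yourself. The sketch below backdates a temporary source file and then compares the modification time of a copy made with copy against one made with copy2; paths and the timestamp are arbitrary example values.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "file.ext")
with open(src, "w") as f:
    f.write("data")

# Backdate the source so a preserved timestamp is easy to spot.
old_time = 1000000000  # an arbitrary timestamp in 2001
os.utime(src, (old_time, old_time))

plain = shutil.copy(src, os.path.join(workdir, "copy.ext"))
full = shutil.copy2(src, os.path.join(workdir, "copy2.ext"))

print(int(os.stat(plain).st_mtime) == old_time)  # False: copy gets a fresh mtime
print(int(os.stat(full).st_mtime) == old_time)   # True: copy2 keeps the original
```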

Answer 3

You can use one of the copy functions from the shutil package:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Function              preserves     supports          accepts     copies other
                      permissions   directory dest.   file obj    metadata  
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
shutil.copy              ✔             ✔                 ☐           ☐
shutil.copy2             ✔             ✔                 ☐           ✔
shutil.copyfile          ☐             ☐                 ☐           ☐
shutil.copyfileobj       ☐             ☐                 ✔           ☐
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Example:

import shutil
shutil.copy('/etc/hostname', '/var/tmp/testhostname')

Answer 4

In Python, you can copy the files using

shutil module
os module
subprocess module

import os
import shutil
import subprocess

1) Copying files using `shutil` module

shutil.copyfile signature

shutil.copyfile(src_file, dest_file, *, follow_symlinks=True)

# example    
shutil.copyfile('source.txt', 'destination.txt')

shutil.copy signature

shutil.copy(src_file, dest_file, *, follow_symlinks=True)

# example
shutil.copy('source.txt', 'destination.txt')

shutil.copy2 signature

shutil.copy2(src_file, dest_file, *, follow_symlinks=True)

# example
shutil.copy2('source.txt', 'destination.txt')

shutil.copyfileobj signature

shutil.copyfileobj(src_file_object, dest_file_object[, length])

# example
file_src = 'source.txt'
file_dest = 'destination.txt'

# The with statement closes both files once the copy is done
with open(file_src, 'rb') as f_src, open(file_dest, 'wb') as f_dest:
    shutil.copyfileobj(f_src, f_dest)

2) Copying files using `os` module

os.popen signature

os.popen(cmd[, mode[, bufsize]])

# example
# In Unix/Linux
os.popen('cp source.txt destination.txt') 

# In Windows
os.popen('copy source.txt destination.txt')

os.system signature

os.system(command)


# In Linux/Unix
os.system('cp source.txt destination.txt')  

# In Windows
os.system('copy source.txt destination.txt')

3) Copying files using `subprocess` module

subprocess.call signature

subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False)

# example (WARNING: setting `shell=True` might be a security-risk)
# In Linux/Unix
status = subprocess.call('cp source.txt destination.txt', shell=True) 

# In Windows
status = subprocess.call('copy source.txt destination.txt', shell=True)

subprocess.check_output signature

subprocess.check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)

# example (WARNING: setting `shell=True` might be a security-risk)
# In Linux/Unix
status = subprocess.check_output('cp source.txt destination.txt', shell=True)

# In Windows
status = subprocess.check_output('copy source.txt destination.txt', shell=True)

Answer 5

Copying a file is a relatively straightforward operation as shown by the examples below, but you should instead use the shutil stdlib module for that.

def copyfileobj_example(source, dest, buffer_size=1024*1024):
    """      
    Copy a file from source to dest. source and dest
    must be file-like objects, i.e. any object with a read or
    write method, like for example StringIO.
    """
    while True:
        copy_buffer = source.read(buffer_size)
        if not copy_buffer:
            break
        dest.write(copy_buffer)

If you want to copy by filename you could do something like this:

def copyfile_example(source, dest):
    # Beware, this example does not handle any edge cases!
    with open(source, 'rb') as src, open(dest, 'wb') as dst:
        copyfileobj_example(src, dst)
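The stdlib counterpart of the loop above is shutil.copyfileobj, which works on any file-like objects. A quick sketch with in-memory streams (the data and chunk size are arbitrary):

```python
import io
import shutil

source = io.BytesIO(b"x" * 100000)
dest = io.BytesIO()

# length is the chunk size in bytes used for each read/write round.
shutil.copyfileobj(source, dest, length=16 * 1024)

print(dest.getvalue() == source.getvalue())  # True
```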

Answer 6

Use the shutil module.

copyfile(src, dst)

Copy the contents of the file named src to a file named dst. The destination location must be writable; otherwise, an IOError exception will be raised. If dst already exists, it will be replaced. Special files such as character or block devices and pipes cannot be copied with this function. src and dst are path names given as strings.

Take a look at filesys for all the file and directory handling functions available in standard Python modules.

Answer 7

Directory and File copy example – From Tim Golden’s Python Stuff:

http://timgolden.me.uk/python/win32_how_do_i/copy-a-file.html

import os
import shutil
import tempfile

filename1 = tempfile.mktemp(".txt")
open(filename1, "w").close()
filename2 = filename1 + ".copy"
print(filename1, "=>", filename2)

shutil.copy(filename1, filename2)

if os.path.isfile(filename2): print("Success")

dirname1 = tempfile.mktemp(".dir")
os.mkdir(dirname1)
dirname2 = dirname1 + ".copy"
print(dirname1, "=>", dirname2)

shutil.copytree(dirname1, dirname2)

if os.path.isdir(dirname2): print("Success")

Answer 8

Firstly, I made an exhaustive cheatsheet of shutil methods for your reference.

shutil_methods = {
    'copy': ['shutil.copyfileobj',
             'shutil.copyfile',
             'shutil.copymode',
             'shutil.copystat',
             'shutil.copy',
             'shutil.copy2',
             'shutil.copytree'],
    'move': ['shutil.rmtree',
             'shutil.move'],
    'exception': ['exception shutil.SameFileError',
                  'exception shutil.Error'],
    'others': ['shutil.disk_usage',
               'shutil.chown',
               'shutil.which',
               'shutil.ignore_patterns'],
}

Secondly, the copy methods explained with examples:

shutil.copyfileobj(fsrc, fdst[, length]) operates on open file objects

In [3]: src = '~/Documents/Head+First+SQL.pdf'
In [4]: dst = '~/desktop'
In [5]: shutil.copyfileobj(src, dst)
AttributeError: 'str' object has no attribute 'read'
#copy the file object
In [7]: with open(src, 'rb') as f1,open(os.path.join(dst,'test.pdf'), 'wb') as f2:
    ...:      shutil.copyfileobj(f1, f2)
In [8]: os.stat(os.path.join(dst,'test.pdf'))
Out[8]: os.stat_result(st_mode=33188, st_ino=8598319475, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067347, st_mtime=1516067335, st_ctime=1516067345)

shutil.copyfile(src, dst, *, follow_symlinks=True) Copy and rename

In [9]: shutil.copyfile(src, dst)
IsADirectoryError: [Errno 21] Is a directory: '~/desktop'
#so dst should be a filename instead of a directory name

shutil.copy() Copy without preserving the timestamps (only the permission bits are kept)

In [10]: shutil.copy(src, dst)
Out[10]: '~/desktop/Head+First+SQL.pdf'
#check their metadata
In [25]: os.stat(src)
Out[25]: os.stat_result(st_mode=33188, st_ino=597749, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516066425, st_mtime=1493698739, st_ctime=1514871215)
In [26]: os.stat(os.path.join(dst, 'Head+First+SQL.pdf'))
Out[26]: os.stat_result(st_mode=33188, st_ino=8598313736, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516066427, st_mtime=1516066425, st_ctime=1516066425)
# st_atime,st_mtime,st_ctime changed

shutil.copy2() Copy while preserving the metadata

In [30]: shutil.copy2(src, dst)
Out[30]: '~/desktop/Head+First+SQL.pdf'
In [31]: os.stat(src)
Out[31]: os.stat_result(st_mode=33188, st_ino=597749, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067055, st_mtime=1493698739, st_ctime=1514871215)
In [32]: os.stat(os.path.join(dst, 'Head+First+SQL.pdf'))
Out[32]: os.stat_result(st_mode=33188, st_ino=8598313736, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067063, st_mtime=1493698739, st_ctime=1516067055)
# Preserved st_mtime

shutil.copytree()

Recursively copy an entire directory tree rooted at src, returning the destination directory
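
As a minimal sketch of shutil.copytree, assuming a throwaway directory layout built with tempfile (the names src, dst, sub and notes.txt are made up for illustration):

```python
import os
import shutil
import tempfile

# Build a small source tree in a temporary location (hypothetical layout).
base = tempfile.mkdtemp()
src_dir = os.path.join(base, "src")
os.makedirs(os.path.join(src_dir, "sub"))
with open(os.path.join(src_dir, "sub", "notes.txt"), "w") as f:
    f.write("hello")

# By default the destination directory must not exist yet.
dst_dir = os.path.join(base, "dst")
returned = shutil.copytree(src_dir, dst_dir)  # returns the destination path

# The nested file was copied along with its parent directories.
with open(os.path.join(dst_dir, "sub", "notes.txt")) as f:
    copied_text = f.read()

shutil.rmtree(base)  # clean up the temporary tree
```

On Python 3.8 and later, passing dirs_exist_ok=True lets copytree write into a destination that already exists.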

Answer 9

For small files and using only python built-ins, you can use the following one-liner:

with open(source, 'rb') as src, open(dest, 'wb') as dst: dst.write(src.read())

As @maxschlepzig mentioned in the comments below, this is not an optimal approach for applications where the file is too large or memory is critical, so Swati’s answer should be preferred.
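
One memory-friendly alternative (a sketch, not necessarily what the answer referenced above does) is to copy in fixed-size chunks with shutil.copyfileobj. The helper name copy_in_chunks, the chunk size and the file names are assumptions for illustration:

```python
import os
import shutil
import tempfile


def copy_in_chunks(source, dest, chunk_size=1024 * 1024):
    """Copy source to dest while holding at most chunk_size bytes in memory."""
    with open(source, "rb") as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


# Demo on a temporary file a bit larger than three chunks.
tmp_dir = tempfile.mkdtemp()
source = os.path.join(tmp_dir, "big.bin")
dest = os.path.join(tmp_dir, "big_copy.bin")
payload = b"x" * (3 * 1024 * 1024 + 17)
with open(source, "wb") as f:
    f.write(payload)

copy_in_chunks(source, dest)

with open(dest, "rb") as f:
    copied = f.read()
shutil.rmtree(tmp_dir)
```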

Answer 10

You could use os.system('cp nameoffilegeneratedbyprogram /otherdirectory/')

or as I did it,

os.system('cp '+ rawfile + ' rawdata.dat')

where rawfile is the name that I had generated inside the program.

This works only on Linux and other Unix-like systems where cp is available.
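
For comparison, a portable sketch that avoids the shell entirely by using shutil.copy. The temporary directory and the file name with a space in it are assumptions, chosen to show a case where the concatenated 'cp ' + rawfile string would break:

```python
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()

# A generated name containing a space would be split into two arguments by the shell.
rawfile = os.path.join(tmp_dir, "generated name.out")
with open(rawfile, "w") as f:
    f.write("raw data")

target = os.path.join(tmp_dir, "rawdata.dat")
shutil.copy(rawfile, target)  # no shell involved, works on Windows too

with open(target) as f:
    result = f.read()
shutil.rmtree(tmp_dir)
```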

Answer 11

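For large files, what I did was read the file line by line and collect the lines in a list. Then, once the list reached a certain size, I appended it to a new file.

lines = []
# "copy.txt" is a placeholder name for the new file
with open("file.txt", "r") as source, open("copy.txt", "a") as output:
    for line in source:
        lines.append(line)
        if len(lines) == 1000000:
            output.writelines(lines)
            del lines[:]
    output.writelines(lines)  # write the lines left over after the last full batch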

Answer 12

from subprocess import call
call("cp -p <file> <file>", shell=True)

Answer 13

As of Python 3.5 you can do the following for small files (e.g., text files, small JPEGs):

from pathlib import Path

source = Path('../path/to/my/file.txt')
destination = Path('../path/where/i/want/to/store/it.txt')
destination.write_bytes(source.read_bytes())

write_bytes will overwrite whatever was at the destination’s location.

Answer 14

open(destination, 'wb').write(open(source, 'rb').read())

Open the source file in read mode, and write to destination file in write mode.

Answer 15

Python provides built-in functions for copying files in the shutil (shell utilities) module.

The following call copies a file:

shutil.copy(src,dst)

The following call copies only the metadata (permission bits, timestamps, flags) from src to dst, not the file contents. To copy a file together with its metadata, use shutil.copy2(src,dst) instead:

shutil.copystat(src,dst)

Knowledge Q&A

How do I read a file line by line into a list?

July 24, 2021 Python实用宝典

Question: How do I read a file line by line into a list?

How do I read every line of a file in Python and store each line as an element in a list?

I want to read the file line by line and append each line to the end of the list.

Answer 0

with open(filename) as f:
    content = f.readlines()
# you may also want to remove whitespace characters like `\n` at the end of each line
content = [x.strip() for x in content]

Answer 1

See Input and Output:

with open('filename') as f:
    lines = f.readlines()

or with stripping the newline character:

with open('filename') as f:
    lines = [line.rstrip() for line in f]

Answer 2

This is more explicit than necessary, but does what you want.

with open("file.txt") as file_in:
    lines = []
    for line in file_in:
        lines.append(line)

Answer 3

This will yield an “array” of lines from the file.

lines = tuple(open(filename, 'r'))

open returns a file which can be iterated over. When you iterate over a file, you get the lines from that file. tuple can take an iterator and instantiate a tuple instance for you from the iterator that you give it. lines is a tuple created from the lines of the file.

Answer 4

If you want the \n included:

with open(fname) as f:
    content = f.readlines()

If you do not want \n included:

with open(fname) as f:
    content = f.read().splitlines()
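
A small self-contained check of the difference between the two variants, assuming a temporary three-line file created just for the demonstration:

```python
import os
import tempfile

# Create a throwaway text file with three newline-terminated lines.
fd, fname = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("line 1\nline 2\nline 3\n")

with open(fname) as f:
    with_newlines = f.readlines()          # keeps the trailing \n

with open(fname) as f:
    without_newlines = f.read().splitlines()  # drops the line endings

os.remove(fname)

print(with_newlines)     # ['line 1\n', 'line 2\n', 'line 3\n']
print(without_newlines)  # ['line 1', 'line 2', 'line 3']
```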

Answer 5

According to Python’s Methods of File Objects, the simplest way to convert a text file into a list is:

with open('file.txt') as f:
    my_list = list(f)

If you just need to iterate over the text file lines, you can use:

with open('file.txt') as f:
    for line in f:
       ...

Old answer:

Using with and readlines() :

with open('file.txt') as f:
    lines = f.readlines()

If you don’t care about closing the file, this one-liner works:

lines = open('file.txt').readlines()

The traditional way:

f = open('file.txt') # Open file in read mode
lines = f.read().split("\n") # Create a list containing all lines
f.close() # Close file

Answer 6

You could simply do the following, as has been suggested:

with open('/your/path/file') as f:
    my_lines = f.readlines()

Note that this approach has 2 downsides:

1) You store all the lines in memory. In the general case, this is a very bad idea. The file could be very large, and you could run out of memory. Even if it’s not large, it is simply a waste of memory.

2) This does not allow processing of each line as you read them. So if you process your lines after this, it is not efficient (requires two passes rather than one).

A better approach for the general case would be the following:

with open('/your/path/file') as f:
    for line in f:
        process(line)

Where you define your process function any way you want. For example:

def process(line):
    if 'save the world' in line.lower():
        superman.save_the_world()

(The implementation of the Superman class is left as an exercise for you).

This will work nicely for any file size and you go through your file in just 1 pass. This is typically how generic parsers will work.
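
The single-pass idea can be sketched with a generator, so that only one line is held in memory at a time. The function name matching_lines and the sample file contents are assumptions for illustration:

```python
import os
import tempfile


def matching_lines(path, needle):
    """Yield the lines that contain needle (case-insensitive), one at a time."""
    with open(path) as f:
        for line in f:
            if needle in line.lower():
                yield line.rstrip("\n")


# Demo on a throwaway file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("Nothing here\nPlease SAVE THE WORLD\nsave the world again\n")

hits = list(matching_lines(path, "save the world"))
os.remove(path)
```

Because the generator reads lazily, the caller decides whether to collect the results in a list or process them one by one.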

Answer 7

Data into list

Assume that we have a text file with our data like in the following lines,

Text file content:

line 1
line 2
line 3

Open a command prompt in the same directory (right-click and choose cmd or PowerShell).
Run python and type the following in the interpreter:

The Python script:

>>> with open("myfile.txt", encoding="utf-8") as file:
...     x = [l.strip() for l in file]
>>> x
['line 1', 'line 2', 'line 3']

Using append:

x = []
with open("myfile.txt") as file:
    for l in file:
        x.append(l.strip())

Or:

>>> x = open("myfile.txt").read().splitlines()
>>> x
['line 1', 'line 2', 'line 3']

Or:

>>> x = open("myfile.txt").readlines()
>>> x
['line 1\n', 'line 2\n', 'line 3\n']

Or:

>>> y = [x.rstrip() for x in open("myfile.txt")]
>>> y
['line 1', 'line 2', 'line 3']

Or, as a script:

with open('testodiprova.txt', 'r', encoding='utf-8') as file:
    lines = file.read().splitlines()
print(lines)

with open('testodiprova.txt', 'r', encoding='utf-8') as file:
    lines = file.readlines()
print(lines)

Answer 8

To read a file into a list you need to do three things:

Open the file
Read the file
Store the contents as list

Fortunately Python makes it very easy to do these things so the shortest way to read a file into a list is:

lst = list(open(filename))

However I’ll add some more explanation.

Opening the file

I assume that you want to open a specific file and you don’t deal directly with a file-handle (or a file-like-handle). The most commonly used function to open a file in Python is open; in Python 2.7 it takes one mandatory argument and two optional ones:

Filename
Mode
Buffering (I’ll ignore this argument in this answer)

The filename should be a string that represents the path to the file. For example:

open('afile')   # opens the file named afile in the current working directory
open('adir/afile')            # relative path (relative to the current working directory)
open('C:/users/aname/afile')  # absolute path (windows)
open('/usr/local/afile')      # absolute path (linux)

Note that the file extension needs to be specified. This is especially important for Windows users because file extensions like .txt or .doc, etc. are hidden by default when viewed in the explorer.

The second argument is the mode; it is r by default, which means “read-only”. That’s exactly what you need in your case.

But in case you actually want to create a file and/or write to a file you’ll need a different argument here. There is an excellent answer if you want an overview.

For reading a file you can omit the mode or pass it in explicitly:

open(filename)
open(filename, 'r')

Both will open the file in read-only mode. In case you want to read in a binary file on Windows you need to use the mode rb:

open(filename, 'rb')

On other platforms the 'b' (binary mode) is simply ignored in Python 2. In Python 3 it makes the file return bytes instead of str on every platform.

Now that I’ve shown how to open the file, let’s talk about the fact that you always need to close it again. Otherwise it will keep an open file-handle to the file until the process exits (or Python garbage-collects the file-handle).

While you could use:

f = open(filename)
# ... do stuff with f
f.close()

That will fail to close the file when something between open and close throws an exception. You could avoid that by using a try and finally:

f = open(filename)
# nothing in between!
try:
    pass  # do stuff with f
finally:
    f.close()

However Python provides context managers that have a prettier syntax (but for open it’s almost identical to the try and finally above):

with open(filename) as f:
    pass  # do stuff with f
# The file is always closed after the with-scope ends.

The last approach is the recommended approach to open a file in Python!

Reading the file

Okay, you’ve opened the file, now how to read it?

The open function returns a file object, which supports Python’s iteration protocol. Each iteration will give you a line:

with open(filename) as f:
    for line in f:
        print(line)

This will print each line of the file. Note, however, that each line will contain a newline character \n at the end (you might want to check if your Python is built with universal newlines support; otherwise you could also have \r\n on Windows or \r on Mac as newlines). If you don’t want that, you could simply remove the last character (or the last two characters on Windows):

with open(filename) as f:
    for line in f:
        print(line[:-1])

But the last line doesn’t necessarily have a trailing newline, so one shouldn’t use that. One could check if it ends with a trailing newline and if so remove it:

with open(filename) as f:
    for line in f:
        if line.endswith('\n'):
            line = line[:-1]
        print(line)

But you could simply remove all whitespace (including the \n character) from the end of the string. This will also remove all other trailing whitespace, so you have to be careful if it is important:

with open(filename) as f:
    for line in f:
        print(line.rstrip())

However if the lines end with \r\n (Windows “newlines”) that .rstrip() will also take care of the \r!

Store the contents as list

Now that you know how to open the file and read it, it’s time to store the contents in a list. The simplest option would be to use the list function:

with open(filename) as f:
    lst = list(f)

In case you want to strip the trailing newlines you could use a list comprehension instead:

with open(filename) as f:
    lst = [line.rstrip() for line in f]

Or even simpler: The .readlines() method of the file object by default returns a list of the lines:

with open(filename) as f:
    lst = f.readlines()

This will also include the trailing newline characters. If you don’t want them, I would recommend the [line.rstrip() for line in f] approach because it avoids keeping two lists containing all the lines in memory.

There’s an additional option to get the desired output, however it’s rather “suboptimal”: read the complete file in a string and then split on newlines:

with open(filename) as f:
    lst = f.read().split('\n')

or:

with open(filename) as f:
    lst = f.read().splitlines()

These take care of the trailing newlines automatically because the split character isn’t included. However they are not ideal because you keep the file as string and as a list of lines in memory!
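
One detail worth checking before choosing between the two: when the text ends with a newline, split('\n') leaves an empty string at the end of the list, while splitlines() does not. A minimal illustration on an in-memory string (the sample text is made up):

```python
text = "line 1\nline 2\nline 3\n"

by_split = text.split("\n")       # splits on every \n, including the final one
by_splitlines = text.splitlines() # treats the final \n as a line terminator

print(by_split)       # ['line 1', 'line 2', 'line 3', '']
print(by_splitlines)  # ['line 1', 'line 2', 'line 3']
```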

Summary

Use with open(...) as f when opening files because you don’t need to take care of closing the file yourself and it closes the file even if some exception happens.
file objects support the iteration protocol so reading a file line-by-line is as simple as for line in the_file_object:.
Always browse the documentation for the available functions/classes. Most of the time there’s a perfect match for the task or at least one or two good ones. The obvious choice in this case would be readlines() but if you want to process the lines before storing them in the list I would recommend a simple list-comprehension.

Answer 9

Clean and Pythonic Way of Reading the Lines of a File Into a List

First and foremost, you should focus on opening your file and reading its contents in an efficient and pythonic way. Here is an example of the way I personally DO NOT prefer:

infile = open('my_file.txt', 'r')  # Open the file for reading.

data = infile.read()  # Read the contents of the file.

infile.close()  # Close the file since we're done using it.

Instead, I prefer the below method of opening files for both reading and writing as it is very clean, and does not require an extra step of closing the file once you are done using it. In the statement below, we’re opening the file for reading, and assigning it to the variable ‘infile.’ Once the code within this statement has finished running, the file will be automatically closed.

# Open the file for reading.
with open('my_file.txt', 'r') as infile:

    data = infile.read()  # Read the contents of the file into memory.

Now we need to focus on bringing this data into a Python List because they are iterable, efficient, and flexible. In your case, the desired goal is to bring each line of the text file into a separate element. To accomplish this, we will use the splitlines() method as follows:

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()

The Final Product:

# Open the file for reading.
with open('my_file.txt', 'r') as infile:

    data = infile.read()  # Read the contents of the file into memory.

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()

Testing Our Code:

Contents of the text file:

     A fost odatã ca-n povesti,
     A fost ca niciodatã,
     Din rude mãri împãrãtesti,
     O prea frumoasã fatã.

Print statements for testing purposes:

    print my_list  # Print the list.

    # Print each line in the list.
    for line in my_list:
        print line

    # Print the fourth element in this list.
    print my_list[3]

Output (different-looking because of unicode characters):

     ['A fost odat\xc3\xa3 ca-n povesti,', 'A fost ca niciodat\xc3\xa3,',
     'Din rude m\xc3\xa3ri \xc3\xaemp\xc3\xa3r\xc3\xa3testi,', 'O prea
     frumoas\xc3\xa3 fat\xc3\xa3.']

     A fost odatã ca-n povesti, A fost ca niciodatã, Din rude mãri
     împãrãtesti, O prea frumoasã fatã.

     O prea frumoasã fatã.

Answer 10

Introduced in Python 3.4, pathlib has a really convenient method for reading in text from files (read_text, available since Python 3.5), as follows:

from pathlib import Path
p = Path('my_text_file')
lines = p.read_text().splitlines()

(The splitlines call is what turns it from a string containing the whole contents of the file to a list of lines in the file).

pathlib has a lot of handy conveniences in it. read_text is nice and concise, and you don’t have to worry about opening and closing the file. If all you need to do with the file is read it all in one go, it’s a good choice.
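
A runnable variant of the same idea, assuming a temporary file and an explicit UTF-8 encoding (both are choices made for this sketch, not part of the answer above):

```python
import tempfile
from pathlib import Path

# Create a throwaway file to read back.
tmp_dir = Path(tempfile.mkdtemp())
p = tmp_dir / "my_text_file.txt"
p.write_text("first\nsecond\nthird\n", encoding="utf-8")

# read_text opens, reads and closes the file in one call.
lines = p.read_text(encoding="utf-8").splitlines()

# Clean up.
p.unlink()
tmp_dir.rmdir()
```

Passing encoding explicitly avoids depending on the platform's default text encoding.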

Answer 11

Here’s one more option, using a list comprehension on the file:

lines = [line.rstrip() for line in open('file.txt')]

This should be a more efficient way, as most of the work is done inside the Python interpreter.

Answer 12

f = open("your_file.txt",'r')
out = f.readlines() # will append in the list out

Now variable out is a list (array) of what you want. You could either do:

for line in out:
    print (line)

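Or, without calling readlines() first, iterate over the file object directly:

for line in f:
    print (line)

You’ll get the same results. Note that a file object can only be read through once: if readlines() has already consumed the file, this second loop prints nothing unless you reopen the file or call f.seek(0) first.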
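
A short check of what happens when a file object is iterated after readlines() has already been called on it. The temporary two-line file is an assumption for the demonstration:

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("a\nb\n")

with open(path) as f:
    out = f.readlines()    # consumes the whole file
    second_pass = list(f)  # the file position is at the end, so nothing is left
    f.seek(0)              # rewind to the beginning
    third_pass = list(f)   # now the lines can be read again

os.remove(path)
```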

Answer 13

Read and write text files with Python 2 and Python 3; it works with Unicode

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Define data
lines = ['     A first string  ',
         'A Unicode sample: €',
         'German: äöüß']

# Write text file
with open('file.txt', 'w') as fp:
    fp.write('\n'.join(lines))

# Read text file
with open('file.txt', 'r') as fp:
    read_lines = fp.readlines()
    read_lines = [line.rstrip('\n') for line in read_lines]

print(lines == read_lines)

Things to notice:

with is a so-called context manager. It makes sure that the opened file is closed again.
All solutions here that simply call .strip() or .rstrip() will fail to reproduce the lines, because they also strip the other whitespace.
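
The whitespace point can be seen on a few in-memory strings (the sample lines are made up for illustration):

```python
raw_lines = ["     A first string  \n", "tabbed\t\n", "last line"]

# rstrip() with no argument removes every kind of trailing whitespace.
stripped_all = [line.rstrip() for line in raw_lines]

# rstrip('\n') removes only the trailing newline and keeps spaces and tabs.
stripped_newline = [line.rstrip("\n") for line in raw_lines]
```

Here stripped_all loses the two trailing spaces and the tab, while stripped_newline keeps them.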

Common file endings

.txt

More advanced file writing/reading

CSV: Super simple format (read & write)
JSON: Nice for writing human-readable data; VERY commonly used (read & write)
YAML: YAML is a superset of JSON, but easier to read (read & write, comparison of JSON and YAML)
pickle: A Python serialization format (read & write)
MessagePack (Python package): More compact representation (read & write)
HDF5 (Python package): Nice for matrices (read & write)
XML: exists too *sigh* (read & write)

For your application, the following might be important:

Support by other programming languages
Reading/writing performance
Compactness (file size)

See also: Comparison of data serialization formats

In case you are looking for a way to make configuration files, you might want to read my short article Configuration files in Python.

Answer 14

Another option is numpy.genfromtxt, for example:

import numpy as np
data = np.genfromtxt("yourfile.dat",delimiter="\n")

This will make data a NumPy array with as many rows as are in your file.

Answer 15

If you’d like to read a file from the command line or from stdin, you can also use the fileinput module:

# reader.py
import fileinput

content = []
for line in fileinput.input():
    content.append(line.strip())

fileinput.close()

Pass files to it like so:

$ python reader.py textfile.txt

Read more here: http://docs.python.org/2/library/fileinput.html

Answer 16

The simplest way to do it

A simple way is to:

Read the whole file as a string
Split the string line by line

In one line, that would give:

lines = open('C:/path/file.txt').read().splitlines()

However, this is a quite inefficient way, as it will store 2 versions of the content in memory (probably not a big issue for small files, but still). [Thanks Mark Amery].

There are 2 easier ways:

Using the file as an iterator

lines = list(open('C:/path/file.txt'))
# ... or if you want to have a list without EOL characters
lines = [l.rstrip() for l in open('C:/path/file.txt')]

If you are using Python 3.4 or above (3.5 for read_text), it is better to use pathlib to create a path object for your file, which you can reuse for other operations in your program:

from pathlib import Path
file_path = Path("C:/path/file.txt")
lines = file_path.read_text().splitlines()
# ... or ...
lines = [l.rstrip() for l in file_path.open()]

Answer 17

Just use the splitlines() method. Here is an example.

inp = "file.txt"
with open(inp) as data:
    dat = data.read()
lst = dat.splitlines()
print(lst)  # Python 3; in Python 2 use: print lst

In the output you will have the list of lines.

Answer 18

If you are faced with a very large file and want to read it faster (imagine you are in a Topcoder/Hackerrank coding competition), you can read a considerably bigger chunk of lines into a memory buffer at one time, rather than iterating line by line at the file level.

buffersize = 2**16
with open(path) as f: 
    while True:
        lines_buffer = f.readlines(buffersize)
        if not lines_buffer:
            break
        for line in lines_buffer:
            process(line)
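
The loop above leaves path and process undefined. A minimal self-contained sketch of the same idea follows, assuming a throwaway sample file and a hypothetical read_in_chunks helper (neither is part of the original answer); the small buffersize is chosen only so that several readlines() calls are needed.

```python
import os
import tempfile


def read_in_chunks(path, buffersize=2**16):
    """Yield lines from path, fetching a batch of whole lines per readlines() call."""
    with open(path) as f:
        while True:
            # readlines(hint) stops once the lines read so far exceed the hint size
            lines_buffer = f.readlines(buffersize)
            if not lines_buffer:
                break
            for line in lines_buffer:
                yield line


# Build a throwaway sample file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join("line %d\n" % i for i in range(1000)))
    sample_path = tmp.name

# Stand-in for process(line): strip the newline and collect the result.
processed = [line.rstrip("\n") for line in read_in_chunks(sample_path, buffersize=256)]
os.remove(sample_path)
print(len(processed), processed[0], processed[-1])
```

Whether this is actually faster than plain iteration depends on the Python version and the file, so it is worth timing both on your own data.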

Answer 19

The easiest ways to do that with some additional benefits are:

lines = list(open('filename'))

lines = tuple(open('filename'))

lines = set(open('filename'))

In the case of set, remember that the line order is not preserved and duplicated lines are dropped.

Below I added an important supplement from @MarkAmery:

Since you’re not calling .close on the file object nor using a with statement, in some Python implementations the file may not get closed after reading and your process will leak an open file handle.

In CPython (the normal Python implementation that most people use), this isn’t a problem since the file object will get immediately garbage-collected and this will close the file, but it’s nonetheless generally considered best practice to do something like:

with open('filename') as f: lines = list(f)

to ensure that the file gets closed regardless of what Python implementation you’re using.
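
As a small check of that point, the sketch below (using a temporary sample file, which is an assumption made only for the demo) reads the lines inside a with block and then inspects the closed attribute of the file object afterwards.

```python
import os
import tempfile

# Create a small sample file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("foo\nbar\nbaz\n")
    sample_path = tmp.name

with open(sample_path) as f:
    lines = list(f)

# The with block has exited, so the handle was closed deterministically.
was_closed = f.closed
os.remove(sample_path)
print(lines, was_closed)
```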

Answer 20

Use this:

import pandas as pd
data = pd.read_csv(filename) # You can also add parameters such as header, sep, etc.
array = data.values

data is a DataFrame, and its values attribute returns an ndarray. You can also get a list by using array.tolist().

Answer 21

Outline and Summary

With a filename, handling the file from a Path(filename) object, or directly with open(filename) as f, do one of the following:

list(fileinput.input(filename))
using with path.open() as f, call f.readlines()
list(f)
path.read_text().splitlines()
path.read_text().splitlines(keepends=True)
iterate over fileinput.input or f and list.append each line one at a time
pass f to a bound list.extend method
use f in a list comprehension

I explain the use-case for each below.

In Python, how do I read a file line-by-line?

This is an excellent question. First, let’s create some sample data:

from pathlib import Path
Path('filename').write_text('foo\nbar\nbaz')

File objects are lazy iterators, so just iterate over it.

filename = 'filename'
with open(filename) as f:
    for line in f:
        line # do something with the line

Alternatively, if you have multiple files, use fileinput.input, another lazy iterator. With just one file:

import fileinput

for line in fileinput.input(filename): 
    line # process the line

or for multiple files, pass it a list of filenames:

for line in fileinput.input([filename]*2): 
    line # process the line

Again, f and fileinput.input above both are (or return) lazy iterators. You can only use an iterator once, so to provide functional code while avoiding verbosity I’ll use the slightly terser fileinput.input(filename) where appropriate from here on.

In Python, how do I read a file line-by-line into a list?

Ah but you want it in a list for some reason? I’d avoid that if possible. But if you insist… just pass the result of fileinput.input(filename) to list:

list(fileinput.input(filename))

Another direct answer is to call f.readlines, which returns the contents of the file (up to an optional hint number of characters, so you could break this up into multiple lists that way).

You can get to this file object two ways. One way is to pass the filename to the open builtin:

filename = 'filename'

with open(filename) as f:
    f.readlines()

or using the new Path object from the pathlib module (which I have become quite fond of, and will use from here on):

from pathlib import Path

path = Path(filename)

with path.open() as f:
    f.readlines()

list will also consume the file iterator and return a list – a quite direct method as well:

with path.open() as f:
    list(f)

If you don’t mind reading the entire text into memory as a single string before splitting it, you can do this as a one-liner with the Path object and the splitlines() string method. By default, splitlines removes the newlines:

path.read_text().splitlines()

If you want to keep the newlines, pass keepends=True:

path.read_text().splitlines(keepends=True)

I want to read the file line by line and append each line to the end of the list.

Now this is a bit silly to ask for, given that we’ve demonstrated the end result easily with several methods. But you might need to filter or operate on the lines as you make your list, so let’s humor this request.

Using list.append would allow you to filter or operate on each line before you append it:

line_list = []
for line in fileinput.input(filename):
    line_list.append(line)

line_list

Using list.extend would be a bit more direct, and perhaps useful if you have a preexisting list:

line_list = []
line_list.extend(fileinput.input(filename))
line_list

Or more idiomatically, we could instead use a list comprehension, and map and filter inside it if desirable:

[line for line in fileinput.input(filename)]

Or even more directly, to close the circle, just pass it to list to create a new list directly without operating on the lines:

list(fileinput.input(filename))

Conclusion

You’ve seen many ways to get lines from a file into a list, but I’d recommend you avoid materializing large quantities of data into a list and instead use Python’s lazy iteration to process the data if possible.

That is, prefer fileinput.input or with path.open() as f.
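
To illustrate that recommendation, here is a minimal sketch of processing a file lazily without ever building the full list. The non_empty_lines generator and the sample data are assumptions made for the example, not part of the answer above.

```python
import os
import tempfile


def non_empty_lines(path):
    """Lazily yield stripped, non-empty lines, one at a time."""
    with open(path) as f:
        for line in f:
            stripped = line.strip()
            if stripped:
                yield stripped


# Sample data similar to the 'foo\nbar\nbaz' file used above, plus blank lines.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("foo\n\nbar\n   \nbaz")
    sample_path = tmp.name

# Both consumers walk the file without materializing a list of lines.
count = sum(1 for _ in non_empty_lines(sample_path))
longest = max(non_empty_lines(sample_path), key=len)
os.remove(sample_path)
print(count, longest)
```

Each call to non_empty_lines reopens the file, which sidesteps the "an iterator can only be used once" limitation mentioned earlier.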

Answer 22

In case there are also empty lines in the document, I like to read in the content and pass it through filter to drop empty string elements:

with open(myFile, "r") as f:
    excludeFileContent = list(filter(None, f.read().splitlines()))

Answer 23

You could also use the loadtxt command in NumPy. This checks for fewer conditions than genfromtxt, so it may be faster.

import numpy
data = numpy.loadtxt(filename, delimiter="\n")

Answer 24

I like to use the following. Reading the lines immediately.

contents = []
for line in open(filepath, 'r').readlines():
    contents.append(line.strip())

Or using list comprehension:

contents = [line.strip() for line in open(filepath, 'r').readlines()]

Answer 25

I would try one of the below mentioned methods. The example file that I use has the name dummy.txt. You can find the file here. I presume, that the file is in the same directory as the code (you can change fpath to include the proper file name and folder path.)

In both the below mentioned examples, the list that you want is given by lst.

1.> First method:

fpath = 'dummy.txt'
with open(fpath, "r") as f: lst = [line.rstrip('\n \t') for line in f]

print(lst)
>>>['THIS IS LINE1.', 'THIS IS LINE2.', 'THIS IS LINE3.', 'THIS IS LINE4.']

2.> In the second method, one can use the csv.reader function from the csv module in the Python Standard Library:

import csv
fpath = 'dummy.txt'
with open(fpath) as csv_file:
    # csv requires a one-character delimiter; a tab is assumed here
    csv_reader = csv.reader(csv_file, delimiter='\t')
    lst = [row[0] for row in csv_reader]

print(lst)
>>>['THIS IS LINE1.', 'THIS IS LINE2.', 'THIS IS LINE3.', 'THIS IS LINE4.']

You can use either of the two methods. Time taken for the creation of lst is almost equal in the two methods.

Answer 26

Here is a Python 3 helper class that I use to simplify file I/O:

import os

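# handle files using a callback method, prevents repetition
# (named _FileIO__file_handler so that __file_handler, which Python name-mangles inside the FileIO class, resolves to it)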
def _FileIO__file_handler(file_path, mode, callback = lambda f: None):
  f = open(file_path, mode)
  try:
    return callback(f)
  except Exception as e:
    raise IOError("Failed to %s file" % ["write to", "read from"][mode.lower() in "r rb r+".split(" ")])
  finally:
    f.close()


class FileIO:
  # return the contents of a file
  def read(file_path, mode = "r"):
    return __file_handler(file_path, mode, lambda rf: rf.read())

  # get the lines of a file
  def lines(file_path, mode = "r", filter_fn = lambda line: len(line) > 0):
    return [line for line in FileIO.read(file_path, mode).strip().split("\n") if filter_fn(line)]

  # create or update a file (NOTE: can also be used to replace a file's original content)
  def write(file_path, new_content, mode = "w"):
    return __file_handler(file_path, mode, lambda wf: wf.write(new_content))

  # delete a file (if it exists)
  def delete(file_path):
    return os.remove(file_path) if os.path.isfile(file_path) else None

You would then use the FileIO.lines function, like this:

file_ext_lines = FileIO.lines("./path/to/file.ext")
for i, line in enumerate(file_ext_lines):
  print("Line {}: {}".format(i + 1, line))

Remember that the mode ("r" by default) and filter_fn (checks for empty lines by default) parameters are optional.

You could even remove the read, write and delete methods and just leave the FileIO.lines, or even turn it into a separate method called read_lines.
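
A minimal sketch of what such a standalone read_lines function might look like is shown below. The body is an assumption based on the description above (it uses a with block and splitlines() rather than the class helper), and the sample file exists only to make the example runnable.

```python
import os
import tempfile


def read_lines(file_path, mode="r", filter_fn=lambda line: len(line) > 0):
    """Return the lines of file_path for which filter_fn is true (empty lines are dropped by default)."""
    with open(file_path, mode) as f:
        return [line for line in f.read().splitlines() if filter_fn(line)]


# Throwaway sample file containing one empty line.
with tempfile.NamedTemporaryFile("w", suffix=".ext", delete=False) as tmp:
    tmp.write("first\n\nsecond\nthird\n")
    sample_path = tmp.name

file_ext_lines = read_lines(sample_path)
os.remove(sample_path)
for i, line in enumerate(file_ext_lines):
    print("Line {}: {}".format(i + 1, line))
```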

Answer 27

Command line version

#!/bin/python3
import os
import sys
abspath = os.path.abspath(__file__)
dname = os.path.dirname(abspath)
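# join with the proper path separator; the input file is looked up next to this script
filename = os.path.join(dname, sys.argv[1])
with open(filename) as f:
    arr = f.read().split("\n")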
print(arr)

Run with:

python3 somefile.py input_file_name.txt

Knowledge Q&A

How do you append to a file in Python?

July 24, 2021 Python实用宝典

Question: How do you append to a file in Python?

How do you append to the file instead of overwriting it? Is there a special function that appends to the file?

Answer 0

with open("test.txt", "a") as myfile:
    myfile.write("appended text")

Answer 1

You need to open the file in append mode, by setting “a” or “ab” as the mode. See open().

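When you open with “a” mode, the write position will always be at the end of the file (an append). You can open with “a+” to also allow reading, so you can seek backwards and read (but all writes will still go to the end of the file!).

Example (a Python 2 session; in Python 3, binary modes need bytes such as b'test', and write() also echoes the number of bytes written):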

>>> with open('test1','wb') as f:
        f.write('test')
>>> with open('test1','ab') as f:
        f.write('koko')
>>> with open('test1','rb') as f:
        f.read()
'testkoko'

Note: Using ‘a’ is not the same as opening with ‘w’ and seeking to the end of the file – consider what might happen if another program opened the file and started writing between the seek and the write. On some operating systems, opening the file with ‘a’ guarantees that all your following writes will be appended atomically to the end of the file (even as the file grows by other writes).

A few more details about how the “a” mode operates (tested on Linux only). Even if you seek back, every write will append to the end of the file:

>>> f = open('test','a+') # Not using 'with' just to simplify the example REPL session
>>> f.write('hi')
>>> f.seek(0)
>>> f.read()
'hi'
>>> f.seek(0)
>>> f.write('bye') # Will still append despite the seek(0)!
>>> f.seek(0)
>>> f.read()
'hibye'

In fact, the fopen manpage states:

Opening a file in append mode (a as the first character of mode) causes all subsequent write operations to this stream to occur at end-of-file, as if preceded by the call:
fseek(stream, 0, SEEK_END);
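
The REPL session above can also be written as a small Python 3 script. This is only a sketch under the same caveat as the answer (the behavior was observed on Linux); the temporary directory and variable names are assumptions for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test")
    with open(path, "a+") as f:
        f.write("hi")
        f.seek(0)
        first_read = f.read()
        f.seek(0)
        f.write("bye")  # still appended at the end, despite the seek(0)
        f.seek(0)
        second_read = f.read()

print(repr(first_read), repr(second_read))
```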

Old simplified answer (not using `with`):

Example: (in a real program use with to close the file – see the documentation)

>>> open("test","wb").write("test")
>>> open("test","a+b").write("koko")
>>> open("test","rb").read()
'testkoko'

Answer 2

I always do this,

f = open('filename.txt', 'a')
f.write("stuff")
f.close()

It’s simple, but very useful.

Answer 3

You probably want to pass "a" as the mode argument. See the docs for open().

with open("foo", "a") as f:
    f.write("cool beans...")

There are other permutations of the mode argument for updating (+), truncating (w) and binary (b) mode but starting with just "a" is your best bet.

Answer 4

Python has many variations on the three main modes, which are:

'w'   write text
'r'   read text
'a'   append text

So to append to a file it’s as easy as:

f = open('filename.txt', 'a') 
f.write('whatever you want to write here (in append mode) here.')

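Then there are the modes that combine reading and writing, which can save you a few lines of code:

'r+'  read + write text (the file must exist; it is not truncated)
'w+'  read + write text (the file is created or truncated)
'a+'  append + read text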

Finally, there are the modes of reading/writing in binary format:

'rb'  read binary
'wb'  write binary
'ab'  append binary
'rb+' read + write binary
'wb+' read + write binary
'ab+' append + read binary
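
The practical difference between the three "+" text modes is easiest to see by writing through each of them to a file that already has content. The sketch below does that with a hypothetical content_after helper and a temporary file, both of which are assumptions introduced for the demo.

```python
import os
import tempfile


def content_after(mode, initial="old text"):
    """Write 'NEW' through a handle opened with mode and return the resulting file content."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo.txt")
        with open(path, "w") as f:
            f.write(initial)
        with open(path, mode) as f:
            f.write("NEW")
        with open(path) as f:
            return f.read()


results = {mode: content_after(mode) for mode in ("r+", "w+", "a+")}
for mode, content in results.items():
    print(mode, repr(content))
```

With 'r+' the write starts at the beginning and overwrites in place, 'w+' discards the old content first, and 'a+' adds to the end.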

Answer 5

When we use the line open(filename, "a"), the a indicates append mode, which means extra data can be added to the end of the existing file.

You can use the following lines to append text to your file:

def FileSave(filename,content):
    with open(filename, "a") as myfile:
        myfile.write(content)

FileSave("test.txt","test1 \n")
FileSave("test.txt","test2 \n")

Answer 6

You can also do it with print instead of write:

with open('test.txt', 'a') as f:
    print('appended text', file=f)

If test.txt doesn’t exist, it will be created…

Answer 7

You can also open the file in r+ mode and then set the file position to the end of the file.

import os

with open('text.txt', 'r+') as f:
    f.seek(0, os.SEEK_END)
    f.write("text to add")

Opening the file in r+ mode will let you write to other file positions besides the end, while a and a+ force writing to the end.

Answer 8

If you want to append to a file:

with open("test.txt", "a") as myfile:
    myfile.write("append me")

We declared the variable myfile to refer to the opened file named test.txt. open takes 2 arguments: the file that we want to open and a string that represents the kind of permission or operation we want on the file.

Here are the file mode options:

Mode    Description

'r'     This is the default mode. It opens the file for reading.
'w'     Opens the file for writing. If the file does not exist, it creates a new file. If the file exists, it truncates the file.
'x'     Creates a new file. If the file already exists, the operation fails.
'a'     Opens the file in append mode. If the file does not exist, it creates a new file.
't'     This is the default mode. It opens in text mode.
'b'     Opens in binary mode.
'+'     Opens a file for reading and writing (updating).

Answer 9

The 'a' parameter signifies append mode. If you don’t want to use with open each time, you can easily write a function to do it for you:

def append(file, txt='\nFunction Successfully Executed'):
    with open(file, 'a') as f:
        f.write(txt)

If you want to write somewhere other than the end, you can open the file with 'r+' and seek to the position you need (the example below seeks to the end)^†:

import os

with open(file, 'r+') as f:
    f.seek(0, os.SEEK_END)
    f.write("text to add")

Finally, the 'w+' parameter grants even more freedom. Specifically, it allows you to create the file if it doesn’t exist, as well as empty the contents of a file that currently exists.

^† Credit for this function goes to @Primusa

Answer 10

The simplest way to append more text to the end of a file would be to use:

with open('/path/to/file', 'a+') as file:
    file.write("Additions to file")

The a+ in the open(...) statement instructs Python to open the file in append mode and allows both read and write access.

It is always good practice to close any files you have opened once you are done with them. The with statement does this automatically when the block exits, so no explicit file.close() is needed here.

Answer 11

Here’s my script, which basically counts the number of lines, then appends, then counts them again so you have evidence it worked.

short_path = "../file_to_be_appended"
long_path = "../file_to_be_appended_to"


def count_lines(path):
    # Count the lines in a file without loading it all into memory
    with open(path, 'r') as f:
        return sum(1 for _ in f)


# Count how many lines are originally in the file
print("%s has %i lines initially" % (long_path, count_lines(long_path)))

# Append every line of the short file to the long file
written = 0
with open(short_path, 'r') as short, open(long_path, 'a') as long_file:
    for line in short:
        long_file.write(line)
        written += 1
print("Done! Wrote %i lines" % written)

# Finally, count the lines again to confirm the append worked
print("%s has %i lines after appending new lines" % (long_path, count_lines(long_path)))

Knowledge Q&A

How do I check whether a file exists without exceptions?

July 19, 2021 Python实用宝典

Question: How do I check whether a file exists without exceptions?

How do I check if a file exists or not, without using the try statement?

Answer 0

If the reason you’re checking is so you can do something like if file_exists: open_it(), it’s safer to use a try around the attempt to open it. Checking and then opening risks the file being deleted or moved or something between when you check and when you try to open it.
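
As a sketch of that safer pattern, the hypothetical read_if_present helper below simply tries to open the file and treats a missing file as a normal outcome. The helper name, the None return value, and the temporary files are assumptions made for the example.

```python
import os
import tempfile


def read_if_present(path):
    """Return the file's text, or None if it does not exist at the moment of opening."""
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return None


with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "present.txt")
    with open(existing, "w") as f:
        f.write("hello")
    found = read_if_present(existing)
    missing = read_if_present(os.path.join(tmp_dir, "absent.txt"))

print(found, missing)
```

Because the check and the open are a single operation here, there is no window in which the file can disappear between them.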

If you’re not planning to open the file immediately, you can use os.path.isfile

Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.

import os.path
os.path.isfile(fname)

if you need to be sure it’s a file.

Starting with Python 3.4, the pathlib module offers an object-oriented approach (backported to pathlib2 in Python 2.7):

from pathlib import Path

my_file = Path("/path/to/file")
if my_file.is_file():
    # file exists

To check a directory, do:

if my_file.is_dir():
    # directory exists

To check whether a Path object exists independently of whether it is a file or directory, use exists():

if my_file.exists():
    # path exists

You can also use resolve(strict=True) in a try block:

try:
    my_abs_path = my_file.resolve(strict=True)
except FileNotFoundError:
    # doesn't exist
else:
    # exists

Answer 1

You have the os.path.exists function:

import os.path
os.path.exists(file_path)

This returns True for both files and directories but you can instead use

os.path.isfile(file_path)

to test if it’s a file specifically. It follows symlinks.

Answer 2

Unlike isfile(), exists() will return True for directories. So depending on if you want only plain files or also directories, you’ll use isfile() or exists(). Here is some simple REPL output:

>>> os.path.isfile("/etc/password.txt")
True
>>> os.path.isfile("/etc")
False
>>> os.path.isfile("/does/not/exist")
False
>>> os.path.exists("/etc/password.txt")
True
>>> os.path.exists("/etc")
True
>>> os.path.exists("/does/not/exist")
False
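
The same comparison can be reproduced on any machine without relying on particular system paths. The sketch below builds a temporary directory containing one file (an assumption for the demo) and records what isfile() and exists() report for a file, a directory, and a missing path.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "plain.txt")
    with open(file_path, "w") as f:
        f.write("data")
    missing_path = os.path.join(tmp_dir, "does_not_exist")

    # Each entry is (isfile result, exists result)
    checks = {
        "file": (os.path.isfile(file_path), os.path.exists(file_path)),
        "directory": (os.path.isfile(tmp_dir), os.path.exists(tmp_dir)),
        "missing": (os.path.isfile(missing_path), os.path.exists(missing_path)),
    }

for name, (is_file, exists) in checks.items():
    print(name, is_file, exists)
```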

Answer 3

import os.path

if os.path.isfile(filepath):

Answer 4

Use os.path.isfile() with os.access():

import os

PATH = './file.txt'
if os.path.isfile(PATH) and os.access(PATH, os.R_OK):
    print("File exists and is readable")
else:
    print("Either the file is missing or not readable")

Answer 5

import os
os.path.exists(path) # Returns whether the path (directory or file) exists or not
os.path.isfile(path) # Returns whether the file exists or not

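Answer 6

Although almost every possible way has been listed in (at least one of) the existing answers (e.g. Python 3.4 specific stuff was added), I'll try to group everything together.

Note: every piece of Python standard library code that I'm going to post belongs to version 3.5.3.

Problem statement:

Check the existence of a file (arguably also of a folder, a "special" file)
Don't use try / except / else / finally blocks

Possible solutions:

[Python 3]: os.path.exists(path) (also check other function family members like os.path.isfile, os.path.isdir, os.path.lexists for slightly different behaviors)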
```
os.path.exists(path)
```
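Return True if path refers to an existing path or an open file descriptor. Returns False for broken symbolic links. On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.

All good, but if following the import tree:
- os.path - posixpath.py (ntpath.py)
  - genericpath.py, line ~#20+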
```
def exists(path):
    """Test whether a path exists.  Returns False for broken symbolic links"""
    try:
        st = os.stat(path)
    except os.error:
        return False
    return True
```
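it's just a try / except block around [Python 3]: os.stat(path, *, dir_fd=None, follow_symlinks=True). So your code is try / except free, but lower in the frame stack there is (at least) one such block. This also applies to other functions (including os.path.isfile).

1.1. [Python 3]: Path.is_file()
- It's a fancier (and more Pythonic) way of handling paths, but
- Under the hood, it does exactly the same thing (pathlib.py, line ~#1330):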
```
def is_file(self):
    """
    Whether this path is a regular file (also True for symlinks pointing
    to regular files).
    """
    try:
        return S_ISREG(self.stat().st_mode)
    except OSError as e:
        if e.errno not in (ENOENT, ENOTDIR):
            raise
        # Path doesn't exist or is a broken symlink
        # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
        return False
```

[Python 3]：使用语句上下文管理器。要么：

创建一个：

class Swallow:  # Dummy example
    swallowed_exceptions = (FileNotFoundError,)

    def __enter__(self):
        print("Entering...")

    def __exit__(self, exc_type, exc_value, exc_traceback):
        print("Exiting:", exc_type, exc_value, exc_traceback)
        return exc_type in Swallow.swallowed_exceptions  # only swallow FileNotFoundError (not e.g. TypeError - if the user passes a wrong argument like None or float or ...)

而它的用法-我会复制os.path.isfile行为（请注意，这只是为了演示的目的，也不会尝试写这样的代码制作）：

import os
import stat


def isfile_seaman(path):  # Dummy func
    result = False
    with Swallow():
        result = stat.S_ISREG(os.stat(path).st_mode)
    return result

使用[Python 3]：contextlib。抑制（*exceptions） -这是具体地用于选择性地抑制异常设计

但是，它们似乎是try / except / else / finally块的包装，如[Python 3]：with语句指出：

这使得普通试试 …… 除非 …… 终于被封装为方便重复使用的使用模式。

文件系统遍历功能（并在结果中搜索匹配项）
- [Python 3]：操作系统。listdir（path =’。’）（或在Python v 3.5 + 上的[Python 3]：操作系统scandir（path =’。’），向后移植：[PyPI]：scandir）
  - 在引擎盖下，两者都使用：
    - Nix：[man7]：OPENDIR（3） / [man7]：READDIR（3） / [man7]：CLOSEDIR（3）
    - Win：[MS.Docs]：FindFirstFileW函数 / [MS.Docs]：FindNextFileW函数 / [MS.Docs]：FindClose函数
    通过[GitHub]：python / cpython-（master）cpython / Modules / posixmodule.c
  使用scandir（）而不是listdir（）可以显着提高还需要文件类型或文件属性信息的代码的性能，因为如果操作系统在扫描目录时提供了os.DirEntry对象，则该信息会公开。所有的os.DirEntry方法都可以执行系统调用，但是is_dir（）和is_file（）通常只需要系统调用即可进行符号链接。os.DirEntry.stat（）在Unix上始终需要系统调用，而在Windows上只需要一个系统调用即可。
- [Python 3]：操作系统。步行（top，topdown = True，onerror = None，followlinks = False）
  - 它使用os.listdir（os.scandir可用时）
- [Python 3]：glob。iglob（路径名，*，递归=假）（或它的前身：glob.glob）
  - 本身似乎没有遍历功能（至少在某些情况下），但它仍然使用os.listdir
由于这些遍历文件夹，（在大多数情况下）它们对于我们的问题效率不高（有一些exceptions，例如非通配glob bing-如@ShadowRanger所指出的），所以我不再坚持使用它们。更不用说在某些情况下，可能需要处理文件名。
[Python 3]：操作系统。访问（路径，模式，*，dir_fd =无，effective_ids =假follow_symlinks =真）的行为是接近os.path.exists（实际上这是2个，主要是因为更宽，^第二参数）
- 用户权限可能会限制文件“可见性”，如doc所述：
  
  …测试调用用户是否具有对path的指定访问权限。模式应该为F_OK以测试路径的存在…
os.access("/tmp", os.F_OK)

自从我也工作Ç，我用这个方法，以及因为引擎盖下，它调用本地API小号（同样，通过“$ {} PYTHON_SRC_DIR /Modules/posixmodule.c”），但它也开辟了可能的栅极用户errors，它不像其他变体那样像Python ic。因此，正如@AaronHall正确指出的那样，除非您知道自己在做什么，否则不要使用它：
- Nix：[man7]：ACCESS（2）（!!!请注意有关其使用可能会引入的安全漏洞的说明！！！）
- 赢：[MS.Docs]：GetFileAttributesW函数
注意：也可以通过[Python 3]调用本地API ：ctypes -Python的外部函数库，但是在大多数情况下，它更为复杂。

（赢特定于）：由于vcruntime *（msvcr *）. dll导出[MS.Docs]：_access，_waccess函数家族，因此下面是一个示例：
```
Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, ctypes
>>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe", os.F_OK)
0
>>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe.notexist", os.F_OK)
-1
```
注意事项：
- 尽管这不是一个好习惯，但我os.F_OK在通话中使用了，但这只是为了清楚起见（其值为0）
- 我正在使用_waccess，以便相同的代码可在Python3和Python2上使用（尽管它们之间存在与Unicode相关的区别）
- 尽管这是针对非常特定的领域，但之前的任何答案都未提及
的LNX（Ubtu（16 64）以及）对应物：
```
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, ctypes
>>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp", os.F_OK)
0
>>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp.notexist", os.F_OK)
-1
```
注意事项：
- 而是硬编码libc的路径（“ /lib/x86_64-linux-gnu/libc.so.6”），该路径在整个系统之间可能（而且很可能会有所不同），可以将None（或空字符串）传递给CDLL构造函数（ctypes.CDLL(None).access(b"/tmp", os.F_OK)）。根据[man7]：DLOPEN（3）：
  
  如果filename为NULL，则返回的句柄用于主程序。当给 dlsym（）时，此句柄将在主程序中搜索符号，然后在程序启动时加载所有共享对象，然后在dlopen（）中加载带有标志RTLD_GLOBAL的所有共享对象。
  - 主（当前）程序（ python）与libc链接，因此其符号（包括访问）将加载）
  - 必须小心处理，因为像main，Py_Main这样的函数和（所有）其他功能都可用。打电话给他们可能会造成灾难性的影响（对当前程序）
  - 这也不适用于Win（但是这没什么大不了的，因为msvcrt.dll位于“％SystemRoot％\ System32”中，默认情况下位于％PATH％中）。我想进一步介绍并在Win上复制此行为（并提交补丁），但事实证明，[MS.Docs]：GetProcAddress函数仅“看到” 导出的符号，因此除非有人在主可执行文件中声明该函数如__declspec(dllexport)（为什么地球上普通的人会做的？），主程序加载，但几乎无法使用
安装一些具有文件系统功能的第三方模块

最有可能的，将依赖于上述方法之一（可能需要进行一些自定义）。
一个示例是（再次，特定于Win）[GitHub]：mhammond / pywin32-Windows的Python（pywin32）扩展，它是WINAPI的Python包装器。

但是，由于这更像是一种解决方法，所以我在这里停止。

另一个（lam）解决方法（gainarie）是（我喜欢这样称呼）sysadmin方法：使用Python作为包装器执行Shell命令

获胜：

(py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe\" > nul 2>&1'))"
0

(py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe.notexist\" > nul 2>&1'))"
1

尼克斯（Lnx（Ubtu））：

[cfati@cfati-ubtu16x64-0:~]> python3 -c "import os; print(os.system('ls \"/tmp\" > /dev/null 2>&1'))"
0
[cfati@cfati-ubtu16x64-0:~]> python3 -c "import os; print(os.system('ls \"/tmp.notexist\" > /dev/null 2>&1'))"
512

底线：

不要使用试 / 除外 / 其它 / 最后块，因为它们可以防止您遇到一系列令人讨厌的问题。我可以想到的一个反例是性能：此类块非常昂贵，因此请不要将它们放在应该每秒运行数十万次的代码中（但是（在大多数情况下，由于它涉及磁盘访问，事实并非如此）。

最后说明：

我将尝试使其保持最新状态，欢迎提出任何建议，我将结合所有有用的内容，使之成为答案

Although almost every possible way has been listed in (at least one of) the existing answers (e.g. Python 3.4 specific stuff was added), I’ll try to group everything together.

Note: every piece of Python standard library code that I’m going to post, belongs to version 3.5.3.

Problem statement:

Check file (arguable: also folder (“special” file) ?) existence
Don’t use try / except / else / finally blocks

Possible solutions:

[Python 3]: os.path.exists(path) (also check other function family members like os.path.isfile, os.path.isdir, os.path.lexists for slightly different behaviors)
```
os.path.exists(path)
```
Return True if path refers to an existing path or an open file descriptor. Returns False for broken symbolic links. On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.

All good, but if following the import tree:
- os.path – posixpath.py (ntpath.py)
  - genericpath.py, line ~#20+
```
def exists(path):
    """Test whether a path exists.  Returns False for broken symbolic links"""
    try:
        st = os.stat(path)
    except os.error:
        return False
    return True
```
it’s just a try / except block around [Python 3]: os.stat(path, *, dir_fd=None, follow_symlinks=True). So, your code is try / except free, but lower in the framestack there’s (at least) one such block. This also applies to other funcs (including os.path.isfile).

1.1. [Python 3]: Path.is_file()
- It’s a fancier (and more pythonic) way of handling paths, but
- Under the hood, it does exactly the same thing (pathlib.py, line ~#1330):
```
def is_file(self):
    """
    Whether this path is a regular file (also True for symlinks pointing
    to regular files).
    """
    try:
        return S_ISREG(self.stat().st_mode)
    except OSError as e:
        if e.errno not in (ENOENT, ENOTDIR):
            raise
        # Path doesn't exist or is a broken symlink
        # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
        return False
```

[Python 3]: With Statement Context Managers. Either:

Create one:

class Swallow:  # Dummy example
    swallowed_exceptions = (FileNotFoundError,)

    def __enter__(self):
        print("Entering...")

    def __exit__(self, exc_type, exc_value, exc_traceback):
        print("Exiting:", exc_type, exc_value, exc_traceback)
        return exc_type in Swallow.swallowed_exceptions  # only swallow FileNotFoundError (not e.g. TypeError - if the user passes a wrong argument like None or float or ...)

And its usage – I’ll replicate the os.path.isfile behavior (note that this is just for demonstrating purposes, do not attempt to write such code for production):

import os
import stat


def isfile_seaman(path):  # Dummy func
    result = False
    with Swallow():
        result = stat.S_ISREG(os.stat(path).st_mode)
    return result

Use [Python 3]: contextlib.suppress(*exceptions) – which was specifically designed for selectively suppressing exceptions

But, they seem to be wrappers over try / except / else / finally blocks, as [Python 3]: The with statement states:

This allows common try…except…finally usage patterns to be encapsulated for convenient reuse.
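As a minimal sketch of the contextlib.suppress option (the helper name isfile_suppress is made up for illustration and is not part of the standard library), the os.path.isfile behavior could be replicated without writing a try statement yourself:

```python
import os
import stat
import tempfile
from contextlib import suppress


def isfile_suppress(path):  # hypothetical helper, for illustration only
    """Return True if path is a regular file, swallowing only OSError."""
    with suppress(OSError):
        return stat.S_ISREG(os.stat(path).st_mode)
    # Reached only when os.stat() raised an OSError that was suppressed
    return False


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "present.txt")
    with open(file_path, "w") as f:
        f.write("data")
    print(isfile_suppress(file_path))                      # regular file
    print(isfile_suppress(tmp_dir))                        # directory
    print(isfile_suppress(os.path.join(tmp_dir, "nope")))  # missing path
```

Only OSError is suppressed here, so a wrong argument type such as None still raises TypeError instead of being hidden.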

Filesystem traversal functions (and search the results for matching item(s))
- [Python 3]: os.listdir(path=’.’) (or [Python 3]: os.scandir(path=’.’) on Python v3.5+, backport: [PyPI]: scandir)
  - Under the hood, both use:
    - Nix: [man7]: OPENDIR(3) / [man7]: READDIR(3) / [man7]: CLOSEDIR(3)
    - Win: [MS.Docs]: FindFirstFileW function / [MS.Docs]: FindNextFileW function / [MS.Docs]: FindClose function
    via [GitHub]: python/cpython – (master) cpython/Modules/posixmodule.c
  Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.
- [Python 3]: os.walk(top, topdown=True, onerror=None, followlinks=False)
  - It uses os.listdir (os.scandir when available)
- [Python 3]: glob.iglob(pathname, *, recursive=False) (or its predecessor: glob.glob)
  - Doesn’t seem a traversing function per se (at least in some cases), but it still uses os.listdir
Since these iterate over folders, (in most of the cases) they are inefficient for our problem (there are exceptions, like non wildcarded globbing – as @ShadowRanger pointed out), so I’m not going to insist on them. Not to mention that in some cases, filename processing might be required.
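For completeness, here is a rough sketch of the traversal approach, assuming Python 3.6+ (where os.scandir can be used as a context manager). The helper name exists_via_scandir is hypothetical. It scans the whole directory, which is exactly why it is inefficient for a single existence check, and it raises an OSError if the directory itself is missing:

```python
import os
import tempfile


def exists_via_scandir(directory, name):  # hypothetical helper, for illustration only
    """Return True if `directory` contains a regular file called `name`."""
    with os.scandir(directory) as entries:  # context manager support needs Python 3.6+
        # DirEntry.is_file() usually needs no extra system call (except for symlinks)
        return any(entry.name == name and entry.is_file() for entry in entries)


with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "a.txt"), "w") as f:
        f.write("x")
    found = exists_via_scandir(folder, "a.txt")
    not_found = exists_via_scandir(folder, "b.txt")
    print(found, not_found)
```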
[Python 3]: os.access(path, mode, *, dir_fd=None, effective_ids=False, follow_symlinks=True), whose behavior is close to os.path.exists (actually it’s wider, mainly because of the 2nd argument)
- user permissions might restrict the file “visibility” as the doc states:
  
  …test if the invoking user has the specified access to path. mode should be F_OK to test the existence of path…
os.access("/tmp", os.F_OK)

Since I also work in C, I use this method as well because under the hood, it calls native APIs (again, via “${PYTHON_SRC_DIR}/Modules/posixmodule.c”), but it also opens a gate for possible user errors, and it’s not as Pythonic as other variants. So, as @AaronHall rightly pointed out, don’t use it unless you know what you’re doing:
- Nix: [man7]: ACCESS(2) (!!! pay attention to the note about the security hole its usage might introduce !!!)
- Win: [MS.Docs]: GetFileAttributesW function
Note: calling native APIs is also possible via [Python 3]: ctypes – A foreign function library for Python, but in most cases it’s more complicated.

(Win specific): Since vcruntime* (msvcr*) .dll exports a [MS.Docs]: _access, _waccess function family as well, here’s an example:
```
Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, ctypes
>>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe", os.F_OK)
0
>>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe.notexist", os.F_OK)
-1
```
Notes:
- Although it’s not a good practice, I’m using os.F_OK in the call, but that’s just for clarity (its value is 0)
- I’m using _waccess so that the same code works on Python3 and Python2 (in spite of unicode related differences between them)
- Although this targets a very specific area, it was not mentioned in any of the previous answers
The Linux (Ubuntu 16 x64) counterpart as well:
```
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, ctypes
>>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp", os.F_OK)
0
>>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp.notexist", os.F_OK)
-1
```
Notes:
- Instead of hardcoding libc’s path (“/lib/x86_64-linux-gnu/libc.so.6”), which may (and most likely will) vary across systems, None (or the empty string) can be passed to the CDLL constructor (ctypes.CDLL(None).access(b"/tmp", os.F_OK)). According to [man7]: DLOPEN(3):
  
  If filename is NULL, then the returned handle is for the main program. When given to dlsym(), this handle causes a search for a symbol in the main program, followed by all shared objects loaded at program startup, and then all shared objects loaded by dlopen() with the flag RTLD_GLOBAL.
  - Main (current) program (python) is linked against libc, so its symbols (including access) will be loaded
  - This has to be handled with care, since functions like main, Py_Main and (all the) others are available; calling them could have disastrous effects (on the current program)
  - This also doesn’t apply to Win (but that’s not such a big deal, since msvcrt.dll is located in “%SystemRoot%\System32”, which is in %PATH% by default). I wanted to take things further and replicate this behavior on Win (and submit a patch), but as it turns out, [MS.Docs]: GetProcAddress function only “sees” exported symbols, so unless someone declares the functions in the main executable as __declspec(dllexport) (why on Earth would a regular person do that?), the main program is loadable but pretty much unusable
Install some third-party module with filesystem capabilities

Most likely, will rely on one of the ways above (maybe with slight customizations).
One example would be (again, Win specific) [GitHub]: mhammond/pywin32 – Python for Windows (pywin32) Extensions, which is a Python wrapper over WINAPIs.

But, since this is more like a workaround, I’m stopping here.

Another (lame) workaround (gainarie) is (as I like to call it,) the sysadmin approach: use Python as a wrapper to execute shell commands

Win:

(py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe\" > nul 2>&1'))"
0

(py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe.notexist\" > nul 2>&1'))"
1

Nix (Linux (Ubuntu)):

[cfati@cfati-ubtu16x64-0:~]> python3 -c "import os; print(os.system('ls \"/tmp\" > /dev/null 2>&1'))"
0
[cfati@cfati-ubtu16x64-0:~]> python3 -c "import os; print(os.system('ls \"/tmp.notexist\" > /dev/null 2>&1'))"
512

Bottom line:

Do use try / except / else / finally blocks, because they can prevent you from running into a series of nasty problems. A counter-example that I can think of is performance: such blocks are costly, so try not to place them in code that is supposed to run hundreds of thousands of times per second (but since, in most cases, it involves disk access, that won’t be the case).

Final note(s):

I will try to keep this up to date. Any suggestions are welcome, and I will incorporate anything useful that comes up into the answer.

Answer 7

This is the simplest way to check if a file exists. Just because the file existed when you checked doesn’t guarantee that it will be there when you need to open it.

import os
fname = "foo.txt"
if os.path.isfile(fname):
    print("file does exist at this time")
else:
    print("no such file exists at this time")

Answer 8

Python 3.4+ has an object-oriented path module: pathlib. Using this new module, you can check whether a file exists like this:

import pathlib
p = pathlib.Path('path/to/file')
if p.is_file():  # or p.is_dir() to see if it is a directory
    pass  # do stuff

You can (and usually should) still use a try/except block when opening files:

try:
    with p.open() as f:
        pass  # do awesome stuff
except OSError:
    print('Well darn.')

The pathlib module has lots of cool stuff in it: convenient globbing, checking file’s owner, easier path joining, etc. It’s worth checking out. If you’re on an older Python (version 2.6 or later), you can still install pathlib with pip:

# installs pathlib2 on older Python versions
# the original third-party module, pathlib, is no longer maintained.
pip install pathlib2

Then import it as follows:

# Older Python versions
import pathlib2 as pathlib
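
Two of the pathlib conveniences mentioned above, path joining and globbing, can be sketched as follows. This assumes Python 3.5+ (for Path.write_text), and the file names are made up for illustration:

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    # The / operator joins path components
    notes = base / "notes.txt"
    notes.write_text("hello")
    (base / "data.csv").write_text("a,b\n")

    # glob() yields the entries matching a pattern
    txt_names = sorted(p.name for p in base.glob("*.txt"))
    is_regular_file = notes.is_file()
    print(txt_names, is_regular_file, notes.suffix)
```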

Answer 9

Prefer the try statement. It’s considered better style and avoids race conditions.

Don’t take my word for it. There’s plenty of support for this theory. Here’s a couple:

Style: Section “Handling unusual conditions” of http://allendowney.com/sd/notes/notes11.txt
Avoiding Race Conditions

Answer 10

How do I check whether a file exists, using Python, without using a try statement?

Now available since Python 3.4, import and instantiate a Path object with the file name, and check the is_file method (note that this returns True for symlinks pointing to regular files as well):

>>> from pathlib import Path
>>> Path('/').is_file()
False
>>> Path('/initrd.img').is_file()
True
>>> Path('/doesnotexist').is_file()
False

If you’re on Python 2, you can backport the pathlib module from pypi, pathlib2, or otherwise check isfile from the os.path module:

>>> import os
>>> os.path.isfile('/')
False
>>> os.path.isfile('/initrd.img')
True
>>> os.path.isfile('/doesnotexist')
False

Now the above is probably the best pragmatic direct answer here, but there’s the possibility of a race condition (depending on what you’re trying to accomplish), and the underlying implementation uses a try. Then again, Python uses try everywhere in its implementation.

Because Python uses try everywhere, there’s really no reason to avoid an implementation that uses it.

But the rest of this answer attempts to consider these caveats.

Longer, much more pedantic answer

Available since Python 3.4, use the new Path object in pathlib. Note that .exists is not quite right, because directories are not files (except in the unix sense that everything is a file).

>>> from pathlib import Path
>>> root = Path('/')
>>> root.exists()
True

So we need to use is_file:

>>> root.is_file()
False

Here’s the help on is_file:

is_file(self)
    Whether this path is a regular file (also True for symlinks pointing
    to regular files).

So let’s get a file that we know is a file:

>>> import tempfile
>>> file = tempfile.NamedTemporaryFile()
>>> filepathobj = Path(file.name)
>>> filepathobj.is_file()
True
>>> filepathobj.exists()
True

By default, NamedTemporaryFile deletes the file when closed (and will automatically close when no more references exist to it).

>>> del file
>>> filepathobj.exists()
False
>>> filepathobj.is_file()
False

If you dig into the implementation, though, you’ll see that is_file uses try:

def is_file(self):
    """
    Whether this path is a regular file (also True for symlinks pointing
    to regular files).
    """
    try:
        return S_ISREG(self.stat().st_mode)
    except OSError as e:
        if e.errno not in (ENOENT, ENOTDIR):
            raise
        # Path doesn't exist or is a broken symlink
        # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
        return False

Race Conditions: Why we like try

We like try because it avoids race conditions. With try, you simply attempt to read your file, expecting it to be there, and if not, you catch the exception and perform whatever fallback behavior makes sense.

If you check that a file exists before attempting to read it, and it might be deleted in the meantime (by another of your threads or processes, or by another program that knows about the file), you risk a race condition: after the check, you are racing to open the file before its condition (its existence) changes.

Race conditions are very hard to debug because there’s a very small window in which they can cause your program to fail.

But if this is your motivation, you can get the value of a try statement by using the suppress context manager.

Avoiding race conditions without a try statement: `suppress`

Python 3.4 gives us the suppress context manager (previously the ignore context manager), which does semantically exactly the same thing in fewer lines, while also (at least superficially) meeting the original ask to avoid a try statement:

from contextlib import suppress
from pathlib import Path

Usage:

>>> with suppress(OSError), Path('doesnotexist').open() as f:
...     for line in f:
...         print(line)
... 
>>>
>>> with suppress(OSError):
...     Path('doesnotexist').unlink()
... 
>>>

For earlier Pythons, you could roll your own suppress, but without a try it will be more verbose than with one. I do believe this is actually the only answer that doesn’t use try at any level in the Python code and that can be applied prior to Python 3.4, because it uses a context manager instead:

class suppress(object):
    def __init__(self, *exceptions):
        self.exceptions = exceptions
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            return issubclass(exc_type, self.exceptions)

Perhaps easier with a try:

from contextlib import contextmanager

@contextmanager
def suppress(*exceptions):
    try:
        yield
    except exceptions:
        pass

Other options that don’t meet the ask for “without try”:

isfile

import os
os.path.isfile(path)

from the docs:

os.path.isfile(path)

Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.

But if you examine the source of this function, you’ll see it actually does use a try statement:

# This follows symbolic links, so both islink() and isdir() can be true
# for the same path on systems that support symlinks
def isfile(path):
    """Test whether a path is a regular file"""
    try:
        st = os.stat(path)
    except os.error:
        return False
    return stat.S_ISREG(st.st_mode)

>>> OSError is os.error
True

All it’s doing is using the given path to see if it can get stats on it, catching OSError and then checking if it’s a file if it didn’t raise the exception.

If you intend to do something with the file, I would suggest directly attempting it with a try-except to avoid a race condition:

try:
    with open(path) as f:
        f.read()
except OSError:
    pass
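
To make the race concrete, here is a small sketch in which the file disappears between the check and the open. The deletion by "another process" is only simulated with an explicit os.remove, and the temporary file is created just for this demonstration:

```python
import os
import tempfile

# Create a file, then simulate another process deleting it
# between the existence check and the open() call.
fd, path = tempfile.mkstemp()
os.close(fd)

checked = os.path.isfile(path)   # the check passes...
os.remove(path)                  # ...but the file is removed before we open it

try:
    with open(path) as f:
        content = f.read()
except OSError:
    # The earlier check gave no guarantee; only the open() attempt is reliable
    content = None

print(checked, content)  # prints: True None
```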

os.access

os.access is available on Unix and Windows, but to use it you must pass flags, and it does not differentiate between files and directories. It is more often used to test whether the real invoking user has access in an elevated privilege environment:

import os
os.access(path, os.F_OK)

It also suffers from the same race condition problems as isfile. From the docs:

Note: Using access() to check if a user is authorized to e.g. open a file before actually doing so using open() creates a security hole, because the user might exploit the short time interval between checking and opening the file to manipulate it. It’s preferable to use EAFP techniques. For example:
if os.access("myfile", os.R_OK):
    with open("myfile") as fp:
        return fp.read()
return "some default data"
is better written as:
try:
    fp = open("myfile")
except IOError as e:
    if e.errno == errno.EACCES:
        return "some default data"
    # Not a permission error.
    raise
else:
    with fp:
        return fp.read()

Avoid using os.access. It is a low level function that has more opportunities for user error than the higher level objects and functions discussed above.

Criticism of another answer:

Another answer says this about os.access:

Personally, I prefer this one because under the hood, it calls native APIs (via “${PYTHON_SRC_DIR}/Modules/posixmodule.c”), but it also opens a gate for possible user errors, and it’s not as Pythonic as other variants:

This answer says it prefers a non-Pythonic, error-prone method, with no justification. It seems to encourage users to use low-level APIs without understanding them.

It also creates a context manager which, by unconditionally returning True, allows all Exceptions (including KeyboardInterrupt and SystemExit!) to pass silently, which is a good way to hide bugs.

This seems to encourage users to adopt poor practices.

Answer 11

import os
# Your path here, e.g. "C:\Program Files\text.txt"
# In a Python string literal, escape the backslashes: "C:\\Program Files\\text.txt"
if os.path.exists("C:\\..."):
    print("File found!")
else:
    print("File not found!")

Importing os makes it easier to navigate and perform standard actions with your operating system.

For reference also see How to check whether a file exists using Python?

If you need high-level operations, use shutil.

Answer 12

Testing for files and folders with os.path.isfile(), os.path.isdir() and os.path.exists()

Assuming that the “path” is a valid path, this table shows what is returned by each function for files and folders:

                     file     folder
os.path.isfile()     True     False
os.path.isdir()      False    True
os.path.exists()     True     True
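
The return values of the three functions can be checked directly with a short sketch. The temporary folder and file names below are made up, and a missing path is included for comparison:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, "example.txt")
    with open(file_path, "w") as f:
        f.write("hello")
    missing = os.path.join(folder, "missing")

    results = {}
    for label, path in [("file", file_path), ("folder", folder), ("missing", missing)]:
        # Tuple order: (isfile, isdir, exists)
        results[label] = (os.path.isfile(path), os.path.isdir(path), os.path.exists(path))
        print(label, results[label])
```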

You can also test if a file is a certain type of file using os.path.splitext() to get the extension (if you don’t already know it)

>>> import os
>>> path = "path to a word document"
>>> os.path.isfile(path)
True
>>> os.path.splitext(path)[1] == ".docx" # test if the extension is .docx
True

Answer 13

In 2016 the best way is still using os.path.isfile:

>>> os.path.isfile('/path/to/some/file.txt')

Or in Python 3 you can use pathlib:

import pathlib
path = pathlib.Path('/path/to/some/file.txt')
if path.is_file():
    ...

Answer 14

There doesn’t seem to be a meaningful functional difference between try/except and isfile(), so use whichever makes sense.

If you want to read a file, if it exists, do

try:
    f = open(filepath)
except IOError:
    print('Oh dear.')

But if you just wanted to rename a file if it exists, and therefore don’t need to open it, do

if os.path.isfile(filepath):
    os.rename(filepath, filepath + '.old')

If you want to write to a file, if it doesn’t exist, do

# Python 2 (the file can still appear between the check and the open)
if not os.path.isfile(filepath):
    f = open(filepath, 'w')

# Python 3: 'x' opens for exclusive creation, failing if the file already exists
try:
    f = open(filepath, 'x')
except FileExistsError:
    print('file already exists')

If you need file locking, that’s a different matter.
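
As a minimal sketch of the write-only-if-absent case, the exclusive-creation mode can be wrapped in a small function. The name create_if_absent and the file name notes.txt are hypothetical; the only behaviour relied on is that open(..., "x") raises FileExistsError when the file is already there (Python 3.3 and above).

```python
import os
import tempfile

def create_if_absent(filepath, text):
    """Hypothetical helper: write text to filepath only if it does not exist yet."""
    try:
        # "x" creates the file and fails if it already exists, in one step.
        with open(filepath, "x") as f:
            f.write(text)
    except FileExistsError:
        return False
    return True

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "notes.txt")
    first = create_if_absent(target, "original")
    second = create_if_absent(target, "overwritten?")
    with open(target) as f:
        content = f.read()

print(first, second, content)
```

The second call leaves the file untouched, so there is no window between a separate existence check and the write.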

Answer 15

You could try this (safer):

try:
    # http://effbot.org/zone/python-with-statement.htm
    # 'with' closes the file automatically, even if an error occurs
    with open('whatever.txt') as fh:
        pass  # Do something with 'fh'
except IOError as e:
    print("({})".format(e))

The output would be:

([Errno 2] No such file or directory: 'whatever.txt')

Then, depending on the result, your program can just keep running from there or you can code to stop it if you want.

Answer 16

Although I always recommend using try and except statements, here are a few possibilities for you (my personal favourite is using os.access):

Try opening the file:

Opening the file will always verify the existence of the file. You can make a function just like so:
```
def File_Existence(filepath):
    f = open(filepath)
    return True
```
If the file cannot be opened, execution stops with an unhandled IOError (OSError in later versions of Python). To catch the exception, you have to use a try/except clause. Of course, you can always use a try/except statement like so (thanks to hsandt for making me think):
```
def File_Existence(filepath):
    try:
        # 'with' closes the file again once the check is done
        with open(filepath):
            pass
    except (IOError, OSError):  # OSError covers later versions of Python
        return False

    return True
```

Use os.path.exists(path):

This will check the existence of what you specify. However, it checks for files and directories so beware about how you use it.

>>> import os.path
>>> os.path.exists("this/is/a/directory")
True
>>> os.path.exists("this/is/a/file.txt")
True
>>> os.path.exists("not/a/directory")
False

Use os.access(path, mode):

This will check whether you have access to the file. It will check for permissions. According to the os module documentation, passing os.F_OK checks the existence of the path. However, using this will create a security hole, as someone can attack your file in the time between checking the permissions and opening the file. You should go directly to opening the file instead of checking its permissions first (EAFP vs LBYL). If you’re not going to open the file afterwards, and are only checking its existence, then you can use this.

Anyway, here:
```
>>> import os
>>> os.access("/is/a/file.txt", os.F_OK)
True
```

I should also mention that there are two reasons why you may be unable to verify the existence of a file: either permission denied or no such file or directory. If you catch an IOError, bind it with IOError as e (as in my first option), and then call print(e.args) so that you can hopefully determine your issue. I hope it helps! :)
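
The two failure cases can also be told apart by exception type rather than by inspecting e.args. This is a sketch assuming Python 3.3 or later, where FileNotFoundError and PermissionError are subclasses of OSError; the helper name why_unreadable and the file names are hypothetical.

```python
import os
import tempfile

def why_unreadable(filepath):
    """Hypothetical helper: return None if the file opens, else a short reason string."""
    try:
        with open(filepath):
            return None
    except FileNotFoundError:
        return "no such file or directory"
    except PermissionError:
        return "permission denied"
    except OSError as e:
        # Anything else, for example trying to open a directory on some platforms.
        return "other error: {}".format(e.strerror)

with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.txt")
    with open(present, "w") as fh:
        fh.write("data")
    reason_present = why_unreadable(present)
    reason_missing = why_unreadable(os.path.join(tmp, "missing.txt"))

print(reason_present, reason_missing)
```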

Answer 17

Date:2017-12-04

Every possible solution has been listed in other answers.

An intuitive, and arguably the most direct, way to check whether a file exists is the following:

import os
# os.path does not expand '~' by itself, so expand it first
os.path.isfile(os.path.expanduser('~/file.md'))  # True if the file exists, else False
# additionally check a dir
os.path.isdir(os.path.expanduser('~/folder'))  # True if the folder exists, else False
# check either a dir or a file
os.path.exists(os.path.expanduser('~/file'))

I made an exhaustive cheatsheet for your reference:

# os.path methods in exhaustive cheatsheet
{'definition': ['dirname',
                'basename',
                'abspath',
                'relpath',
                'commonpath',
                'normpath',
                'realpath'],
 'operation': ['split', 'splitdrive', 'splitext',
               'join', 'normcase'],
 'compare': ['samefile', 'sameopenfile', 'samestat'],
 'condition': ['isdir',
               'isfile',
               'exists',
               'lexists',
               'islink',
               'isabs',
               'ismount'],
 'expand': ['expanduser',
            'expandvars'],
 'stat': ['getatime', 'getctime', 'getmtime',
          'getsize']}

Answer 18

If the file is for opening you could use one of the following techniques:

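import os

# Technique 1: the x-flag (Python 3.3 and above) creates the file and
# raises FileExistsError if it already exists
with open('somefile', 'xt') as f:
    f.write('Hello\n')

# Technique 2: check first, then write
if not os.path.exists('somefile'):
    with open('somefile', 'wt') as f:
        f.write("Hello\n")
else:
    print('File already exists!')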

UPDATE

Just to avoid confusion, and based on the answers I got: the current answer finds either a file or a directory with the given name.

Answer 19

Additionally, os.access():

import os

if os.access("myfile", os.R_OK):
    with open("myfile") as fp:
        data = fp.read()

R_OK, W_OK, and X_OK are the flags for testing read, write, and execute permissions (doc).
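
A short sketch of querying all four flags at once. The helper name describe_access and the file names are hypothetical; os.F_OK, os.R_OK, os.W_OK, and os.X_OK are the real constants from the os module. The result for the executable flag depends on the platform and the file's mode, so it is reported but not relied on here.

```python
import os
import tempfile

def describe_access(path):
    """Hypothetical helper: report which os.access checks pass for path."""
    return {
        "exists": os.access(path, os.F_OK),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        "executable": os.access(path, os.X_OK),
    }

with tempfile.TemporaryDirectory() as folder:
    existing = os.path.join(folder, "myfile")
    with open(existing, "w") as fh:
        fh.write("data")
    existing_report = describe_access(existing)
    missing_report = describe_access(os.path.join(folder, "absent"))

print(existing_report)
print(missing_report)
```

Every check is False for a path that does not exist, which is why os.access can double as an existence test when the file will not be opened afterwards.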

Answer 20

import os

if os.path.isfile(path_to_file):
    try:
        with open(path_to_file):
            pass
    except IOError as e:
        print("Unable to open file")

Raising exceptions is considered to be an acceptable, and Pythonic, approach for flow control in your program. Consider handling missing files with IOErrors. In this situation, an IOError exception will be raised if the file exists but the user does not have read permissions.

SRC: http://www.pfinn.net/python-check-if-file-exists.html

Answer 21

If you imported NumPy already for other purposes then there is no need to import other libraries like pathlib, os, paths, etc.

import numpy as np
np.DataSource().exists("path/to/your/file")

This returns True or False depending on whether the file exists.

Answer 22

You can write Brian’s suggestion without the try:.

from contextlib import suppress

with suppress(IOError), open('filename'):
    process()

suppress was added in Python 3.4. In older releases you can quickly write your own suppress:

from contextlib import contextmanager

@contextmanager
def suppress(*exceptions):
    try:
        yield
    except exceptions:
        pass
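
A sketch of how suppress behaves when the file is missing: the exception is swallowed and execution continues after the with block. The helper name read_or_default and the file names are hypothetical; contextlib.suppress itself is the standard-library context manager (Python 3.4 and above).

```python
import os
import tempfile
from contextlib import suppress

def read_or_default(filename, default=""):
    """Hypothetical helper: return the file's text, or default if it cannot be opened."""
    with suppress(OSError):
        with open(filename) as f:
            return f.read()
    # Reached only when open() raised and suppress swallowed the error.
    return default

with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.txt")
    with open(present, "w") as fh:
        fh.write("contents")
    from_present = read_or_default(present)
    from_missing = read_or_default(os.path.join(tmp, "missing.txt"), default="n/a")

print(from_present, from_missing)
```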

Answer 23

I’m the author of a package that’s been around for about 10 years, and it has a function that addresses this question directly. Basically, if you are on a non-Windows system, it uses Popen to access find. However, if you are on Windows, it replicates find with an efficient filesystem walker.

The code itself does not use a try block… except in determining the operating system and thus steering you to the “Unix”-style find or the hand-built find. Timing tests showed that the try was faster in determining the OS, so I did use one there (but nowhere else).

>>> import pox
>>> pox.find('*python*', type='file', root=pox.homedir(), recurse=False)
['/Users/mmckerns/.python']

And the doc…

>>> print(pox.find.__doc__)
find(patterns[,root,recurse,type]); Get path to a file or directory

    patterns: name or partial name string of items to search for
    root: path string of top-level directory to search
    recurse: if True, recurse down from root directory
    type: item filter; one of {None, file, dir, link, socket, block, char}
    verbose: if True, be a little verbose about the search

    On some OS, recursion can be specified by recursion depth (an integer).
    patterns can be specified with basic pattern matching. Additionally,
    multiple patterns can be specified by splitting patterns with a ';'
    For example:
        >>> find('pox*', root='..')
        ['/Users/foo/pox/pox', '/Users/foo/pox/scripts/pox_launcher.py']

        >>> find('*shutils*;*init*')
        ['/Users/foo/pox/pox/shutils.py', '/Users/foo/pox/pox/__init__.py']

>>>

The implementation, if you care to look, is here: https://github.com/uqfoundation/pox/blob/89f90fb308f285ca7a62eabe2c38acb87e89dad9/pox/shutils.py#L190

Answer 24

Check whether a file or directory exists

You can do this in the following three ways:

Note 1: os.path.isfile is used only for files

import os.path
os.path.isfile(filename) # True if file exists
os.path.isfile(dirname) # False if directory exists

Note 2: os.path.exists is used for both files and directories

import os.path
os.path.exists(filename) # True if file exists
os.path.exists(dirname) #True if directory exists

The pathlib.Path method (included in Python 3.4+, installable with pip for Python 2)

from pathlib import Path
Path(filename).exists()

Answer 25

Adding one more slight variation which isn’t exactly reflected in the other answers.

This will handle the case of the file_path being None or empty string.

def file_exists(file_path):
    if not file_path:
        return False
    elif not os.path.isfile(file_path):
        return False
    else:
        return True

Adding a variant based on suggestion from Shahbaz

def file_exists(file_path):
    if not file_path:
        return False
    else:
        return os.path.isfile(file_path)

Adding a variant based on suggestion from Peter Wood

def file_exists(file_path):
    # bool() so that None or '' yields False rather than the falsy value itself
    return bool(file_path and os.path.isfile(file_path))

Answer 26

Here’s a 1 line Python command for the Linux command line environment. I find this VERY HANDY since I’m not such a hot Bash guy.

python -c "import os.path; print(os.path.isfile('/path_to/file.xxx'))"

I hope this is helpful.

Answer 27

You can use Python’s os module:

>>> import os
>>> os.path.exists("C:\\Users\\####\\Desktop\\test.txt") 
True
>>> os.path.exists("C:\\Users\\####\\Desktop\\test.tx")
False

Answer 28

How do I check whether a file exists, without using the try statement?

In 2016, this is still arguably the easiest way to check both that a path exists and that it is a file:

import os
os.path.isfile('./file.txt')    # Returns True if exists, else False

isfile is actually just a helper method that uses os.stat and stat.S_ISREG(mode) underneath. os.stat is a lower-level method that provides detailed information about files, directories, sockets, buffers, and more. More details are in the os.stat documentation.

Note: However, this approach will not lock the file in any way and therefore your code can become vulnerable to “time of check to time of use” (TOCTTOU) bugs.

So raising exceptions is considered to be an acceptable, and Pythonic, approach for flow control in your program. And one should consider handling missing files with IOErrors rather than if statements (just a suggestion).
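
To illustrate the relationship between isfile, os.stat, and stat.S_ISREG described above, here is a rough sketch of the same check written by hand. The function name isfile_via_stat and the file names are hypothetical, and this is an approximation of the idea rather than the standard library's exact source.

```python
import os
import stat
import tempfile

def isfile_via_stat(path):
    """Sketch: stat the path and test whether it is a regular file."""
    try:
        mode = os.stat(path).st_mode
    except (OSError, ValueError):
        # A missing or otherwise unusable path is simply "not a file".
        return False
    return stat.S_ISREG(mode)

with tempfile.TemporaryDirectory() as tmp:
    regular = os.path.join(tmp, "file.txt")
    with open(regular, "w") as fh:
        fh.write("data")
    on_file = isfile_via_stat(regular)
    on_folder = isfile_via_stat(tmp)
    on_missing = isfile_via_stat(os.path.join(tmp, "missing.txt"))

print(on_file, on_folder, on_missing)
```

Like os.path.isfile, this only reports the state at the moment of the call, so the time-of-check caveat applies equally.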

Answer 29

import os.path

def isReadableFile(file_path, file_name):
    # os.path.join picks the right separator for the current platform
    full_path = os.path.join(file_path, file_name)
    try:
        if not os.path.exists(file_path):
            print("File path is invalid.")
            return False
        elif not os.path.isfile(full_path):
            print("File does not exist.")
            return False
        elif not os.access(full_path, os.R_OK):
            print("File cannot be read.")
            return False
        else:
            print("File can be read.")
            return True
    except OSError as ex:  # IOError is an alias of OSError in Python 3
        print("I/O error({0}): {1}".format(ex.errno, ex.strerror))
    return False
#------------------------------------------------------

path = "/usr/khaled/documents/puzzles"
fileName = "puzzle_1.txt"

isReadableFile(path, fileName)