Delete an entire directory tree; path must point to a directory (but not a symbolic link to a directory). If ignore_errors is true, errors resulting from failed removals will be ignored; if false or omitted, such errors are handled by calling a handler specified by onerror or, if that is omitted, they raise an exception.
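A minimal sketch of the behaviour described above, run against a throwaway directory created with tempfile. The helper name make_tree is made up for illustration.

```python
import os
import shutil
import tempfile

def make_tree():
    """Create a small throwaway tree and return its root (hypothetical helper)."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "sub"))
    with open(os.path.join(root, "sub", "a.txt"), "w") as fh:
        fh.write("x")
    return root

root = make_tree()
shutil.rmtree(root)                      # removes root and everything below it
removed = not os.path.exists(root)

# With ignore_errors=True a failed removal (here: the path is already gone)
# is silently ignored instead of raising.
shutil.rmtree(root, ignore_errors=True)

# Without it, and without an error handler, the same call raises.
try:
    shutil.rmtree(root)
    raised = False
except FileNotFoundError:
    raised = True
```

Note that the directory itself is removed, not just its contents, which is why the answers below avoid calling rmtree on the top-level folder.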
Expanding on mhawke’s answer this is what I’ve implemented. It removes all the content of a folder but not the folder itself. Tested on Linux with files, folders and symbolic links, should work on Windows as well.
import os
import shutil
for root, dirs, files in os.walk('/path/to/folder'):
    for f in files:
        os.unlink(os.path.join(root, f))
    for d in dirs:
        shutil.rmtree(os.path.join(root, d))
Using rmtree and recreating the folder could work, but I have run into errors when deleting and immediately recreating folders on network drives.
The proposed solution using walk does not work as it uses rmtree to remove folders and then may attempt to use os.unlink on the files that were previously in those folders. This causes an error.
The posted glob solution will also attempt to delete non-empty folders, causing errors.
I suggest you use:
import os
import shutil
folder_path = '/path/to/folder'
for file_object in os.listdir(folder_path):
    file_object_path = os.path.join(folder_path, file_object)
    if os.path.isfile(file_object_path) or os.path.islink(file_object_path):
        os.unlink(file_object_path)
    else:
        shutil.rmtree(file_object_path)
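Answer 5
This:
removes all symbolic links (dead links, links to directories, links to files)
removes subdirectories
does not remove the parent directory
Code: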
for filename in os.listdir(dirpath):
    filepath = os.path.join(dirpath, filename)
    try:
        shutil.rmtree(filepath)
    except OSError:
        os.remove(filepath)
Like many other answers, this does not try to adjust permissions to enable removal of files/directories.
Answer 6
As a one-liner:
import os
# Python 2.7
map(os.unlink, (os.path.join(mydir, f) for f in os.listdir(mydir)))
# Python 3+
list(map(os.unlink, (os.path.join(mydir, f) for f in os.listdir(mydir))))
A more robust solution accounting for files and directories as well would be (2.7):
def rm(f):
    if os.path.isdir(f): return os.rmdir(f)
    if os.path.isfile(f): return os.unlink(f)
    # call form of raise works in both Python 2.7 and Python 3
    raise TypeError('must be either file or directory')
map(rm, (os.path.join(mydir, f) for f in os.listdir(mydir)))
Notes: in case someone down voted my answer, I have something to explain here.
Everyone likes short ‘n’ simple answers. However, sometimes the reality is not so simple.
Back to my answer. I know shutil.rmtree() could be used to delete a directory tree. I’ve used it many times in my own projects. But you must realize that the directory itself will also be deleted by shutil.rmtree(). While this might be acceptable for some, it’s not a valid answer for deleting the contents of a folder (without side effects).
I’ll show you an example of the side effects. Suppose that you have a directory with customized owner and mode bits, where there are a lot of contents. Then you delete it with shutil.rmtree() and rebuild it with os.mkdir(). And you’ll get an empty directory with default (inherited) owner and mode bits instead. While you might have the privilege to delete the contents and even the directory, you might not be able to set back the original owner and mode bits on the directory (e.g. you’re not a superuser).
Finally, be patient and read the code. It’s long and ugly (in sight), but proven to be reliable and efficient (in use).
Here’s a long and ugly, but reliable and efficient solution.
It resolves a few problems which are not addressed by the other answerers:
It correctly handles symbolic links, including not calling shutil.rmtree() on a symbolic link (which will pass the os.path.isdir() test if it links to a directory; even the result of os.walk() contains symbolic linked directories as well).
It handles read-only files nicely.
Here’s the code (the only useful function is clear_dir()):
import os
import stat
import shutil
# http://stackoverflow.com/questions/1889597/deleting-directory-in-python
def _remove_readonly(fn, path_, excinfo):
    # Handle read-only files and directories
    if fn is os.rmdir:
        os.chmod(path_, stat.S_IWRITE)
        os.rmdir(path_)
    elif fn is os.remove:
        os.lchmod(path_, stat.S_IWRITE)
        os.remove(path_)
def force_remove_file_or_symlink(path_):
    try:
        os.remove(path_)
    except OSError:
        os.lchmod(path_, stat.S_IWRITE)
        os.remove(path_)
# Code from shutil.rmtree()
def is_regular_dir(path_):
    try:
        mode = os.lstat(path_).st_mode
    except os.error:
        mode = 0
    return stat.S_ISDIR(mode)
def clear_dir(path_):
    if is_regular_dir(path_):
        # Given path is a directory, clear its content
        for name in os.listdir(path_):
            fullpath = os.path.join(path_, name)
            if is_regular_dir(fullpath):
                shutil.rmtree(fullpath, onerror=_remove_readonly)
            else:
                force_remove_file_or_symlink(fullpath)
    else:
        # Given path is a file or a symlink.
        # Raise an exception here to avoid accidentally clearing the content
        # of a symbolic linked directory.
        raise OSError("Cannot call clear_dir() on a symbolic link")
Answer 8
I’m surprised nobody has mentioned the awesome pathlib to do this job.
If you only want to remove files in a directory it can be a oneliner
from pathlib import Path
[f.unlink() for f in Path("/path/to/folder").glob("*") if f.is_file()]
To also recursively remove directories you can write something like this:
from pathlib import Path
from shutil import rmtree
for path in Path("/path/to/folder").glob("**/*"):
    if path.is_file():
        path.unlink()
    elif path.is_dir():
        rmtree(path)
Answer 9
import os
import shutil
# Gather directory contents
contents = [os.path.join(target_dir, i) for i in os.listdir(target_dir)]
# Iterate and remove each item in the appropriate manner
[os.remove(i) if os.path.isfile(i) or os.path.islink(i) else shutil.rmtree(i) for i in contents]
An earlier comment also mentions using os.scandir in Python 3.5+. For example:
import os
import shutil
with os.scandir(target_dir) as entries:
    for entry in entries:
        if entry.is_file() or entry.is_symlink():
            os.remove(entry.path)
        elif entry.is_dir():
            shutil.rmtree(entry.path)
os.listdir() doesn’t distinguish files from directories and you will quickly get into trouble trying to unlink these. There is a good example of using os.walk() to recursively remove a directory here, and hints on how to adapt it to your circumstances.
Answer 11
I used to solve the problem this way:
import shutil
import os
shutil.rmtree(dirpath)
os.mkdir(dirpath)
I know it’s an old thread, but I found something interesting on the official Python site and want to share another idea for removing all the contents of a directory. I had authorization problems when using shutil.rmtree(), and I don’t want to remove the directory and recreate it. The original address is http://docs.python.org/2/library/os.html#os.walk. Hope that helps someone.
import os
def emptydir(top):
    if top == '/' or top == "\\":
        return
    else:
        # Walk bottom-up so every directory is already empty when rmdir is called
        for root, dirs, files in os.walk(top, topdown=False):
            for name in files:
                os.remove(os.path.join(root, name))
            for name in dirs:
                os.rmdir(os.path.join(root, name))
Answer 14
To delete all files in a directory and its subdirectories without deleting the folders themselves, just do this:
import os
mypath = "my_folder"  # Enter your path here
for root, dirs, files in os.walk(mypath):
    for file in files:
        os.remove(os.path.join(root, file))
I resolved the issue with rmtree followed by makedirs by adding time.sleep() in between:
import os
import shutil
import time
if os.path.isdir(folder_location):
    shutil.rmtree(folder_location)
time.sleep(.5)
os.makedirs(folder_location, 0o777)
Answer 20
Answer for a limited, specific situation:
assuming you want to delete the files while maintaining the subfolder tree, you could use a recursive algorithm:
import os
def recursively_remove_files(f):
    if os.path.isfile(f):
        os.unlink(f)
    elif os.path.isdir(f):
        for fi in os.listdir(f):
            recursively_remove_files(os.path.join(f, fi))
recursively_remove_files(my_directory)
Maybe slightly off-topic, but I think many would find it useful
Answer 21
Assuming temp_dir is the directory to be cleared, a one-liner using os would be:
_ = [os.remove(os.path.join(temp_dir, i)) for i in os.listdir(temp_dir)]
Use the method below to remove the contents of a directory, not the directory itself:
import os
import shutil
def remove_contents(path):
    for c in os.listdir(path):
        full_path = os.path.join(path, c)
        if os.path.isfile(full_path):
            os.remove(full_path)
        else:
            shutil.rmtree(full_path)
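Answer 23
The simplest way to delete all the files in a folder:
import os
files = os.listdir(yourFilePath)
for f in files:
    # os.path.join adds the path separator that plain concatenation would miss
    os.remove(os.path.join(yourFilePath, f))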
Getting some sort of modification date in a cross-platform way is easy – just call os.path.getmtime(path) and you’ll get the Unix timestamp of when the file at path was last modified.
Getting file creation dates, on the other hand, is fiddly and platform-dependent, differing even between the three big OSes:
On Mac, as well as some other Unix-based OSes, you can use the .st_birthtime attribute of the result of a call to os.stat().
On Linux, this is currently impossible, at least without writing a C extension for Python. Although some file systems commonly used with Linux do store creation dates (for example, ext4 stores them in st_crtime) , the Linux kernel offers no way of accessing them; in particular, the structs it returns from stat() calls in C, as of the latest kernel version, don’t contain any creation date fields. You can also see that the identifier st_crtime doesn’t currently feature anywhere in the Python source. At least if you’re on ext4, the data is attached to the inodes in the file system, but there’s no convenient way of accessing it.
The next-best thing on Linux is to access the file’s mtime, through either os.path.getmtime() or the .st_mtime attribute of an os.stat() result. This will give you the last time the file’s content was modified, which may be adequate for some use cases.
Putting this all together, cross-platform code should look something like this…
import os
import platform
def creation_date(path_to_file):
    """
    Try to get the date that a file was created, falling back to when it was
    last modified if that isn't possible.
    See http://stackoverflow.com/a/39501288/1709587 for explanation.
    """
    if platform.system() == 'Windows':
        return os.path.getctime(path_to_file)
    else:
        stat = os.stat(path_to_file)
        try:
            return stat.st_birthtime
        except AttributeError:
            # We're probably on Linux. No easy way to get creation dates here,
            # so we'll settle for when its content was last modified.
            return stat.st_mtime
Note: ctime() does not refer to creation time on *nix systems, but rather the last time the inode data changed. (thanks to kojiro for making that fact more clear in the comments by providing a link to an interesting blog post)
edit: In newer code you should probably use os.path.getmtime() (thanks Christian Oudard)
but note that it returns a floating point value of time_t with fraction seconds (if your OS supports it)
getmtime(path) Return the time of last modification of path. The return value is a number giving the
number of seconds since the epoch (see the time module). Raise os.error if the file does
not exist or is inaccessible. New in version 1.5.2. Changed in version 2.3: If
os.stat_float_times() returns True, the result is a floating point number.
stat(path) Perform a stat() system call on the given path. The return value is an object whose
attributes correspond to the members of the stat structure, namely: st_mode (protection
bits), st_ino (inode number), st_dev (device), st_nlink (number of hard links), st_uid
(user ID of owner), st_gid (group ID of owner), st_size (size of file, in bytes),
st_atime (time of most recent access), st_mtime (time of most recent content
modification), st_ctime (platform dependent; time of most recent metadata change on Unix, or the time of creation on Windows):
In Python 3.4 and above, you can use the object oriented pathlib module interface which includes wrappers for much of the os module. Here is an example of getting the file stats.
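As a rough illustration of the pathlib interface, here is a small sketch that creates a temporary file and reads its stats. The file name somefile.txt and the variable names are placeholders, not taken from the original answer.

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "somefile.txt"    # placeholder file name
    path.write_text("hello")

    info = path.stat()                   # same kind of result as os.stat(path)
    size = info.st_size                  # size in bytes
    modified = datetime.fromtimestamp(info.st_mtime)

    print("Size:", size)
    print("Date modified:", modified)
```

Path.stat() returns the same result object as os.stat(), so the st_mtime and st_ctime caveats discussed here apply unchanged.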
os.stat returns a named tuple with st_mtime and st_ctime attributes. The modification time is st_mtime on both platforms; unfortunately, on Windows, ctime means “creation time”, whereas on POSIX it means “change time”. I’m not aware of any way to get the creation time on POSIX platforms.
somefile.txt
Modified
1429613446
1429613446.0
1429613446.0
Created
1517491049
1517491049.28306
1517491049.28306
Date modified: Tue Apr 21 11:50:46 2015
Date modified: 2015-04-21 11:50:46
Date modified: 21/04/2015 11:50:46
Date created: Thu Feb 1 13:17:29 2018
Date created: 2018-02-01 13:17:29.283060
Date created: 01/02/2018 13:17:29
Answer 8
>>> import os
>>> os.stat('feedparser.py').st_mtime
1136961142.0
>>> os.stat('feedparser.py').st_ctime
1222664012.233
>>>
It may be worth taking a look at the crtime library, which implements cross-platform access to the file creation time.
from crtime import get_crtimes_in_dir
for fname, date in get_crtimes_in_dir(".", raise_on_error=True, as_epoch=False):
    print(fname, date)
    # file_a.py Mon Mar 18 20:51:18 CET 2019
import os
import shutil
os.rename("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
shutil.move("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
os.replace("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
Note that you must include the file name (file.foo) in both the source and destination arguments. If it is changed, the file will be renamed as well as moved.
Note also that in the first two cases the directory in which the new file is being created must already exist. On Windows, a file with that name must not exist or an exception will be raised, but os.replace() will silently replace a file even in that occurrence.
As has been noted in comments on other answers, shutil.move simply calls os.rename in most cases. However, if the destination is on a different disk than the source, it will instead copy and then delete the source file.
Although os.rename() and shutil.move() will both rename files, the command that is closest to the Unix mv command is shutil.move(). The difference is that os.rename() doesn’t work if the source and destination are on different disks, while shutil.move() doesn’t care what disk the files are on.
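The following sketch exercises shutil.move() and os.replace() on temporary files so it can be run safely. The directory and file names are invented for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, "current")
    dst_dir = os.path.join(tmp, "new")
    os.makedirs(src_dir)
    os.makedirs(dst_dir)                 # the destination directory must already exist

    src = os.path.join(src_dir, "file.foo")
    with open(src, "w") as fh:
        fh.write("data")

    # shutil.move() returns the final destination path.
    moved_to = shutil.move(src, os.path.join(dst_dir, "file.foo"))
    moved_ok = os.path.isfile(moved_to) and not os.path.exists(src)

    # os.replace() overwrites an existing destination file without complaint.
    other = os.path.join(src_dir, "other.foo")
    with open(other, "w") as fh:
        fh.write("newer")
    os.replace(other, moved_to)
    with open(moved_to) as fh:
        final_content = fh.read()
```

Both paths here live on the same temporary file system, so the cross-disk copy-and-delete fallback of shutil.move() is not exercised.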
import os, shutil
path = "/volume1/Users/Transfer/"
moveto = "/volume1/Users/Drive_Transfer/"
files = os.listdir(path)
files.sort()
for f in files:
    src = path+f
    dst = moveto+f
    shutil.move(src,dst)
Now fully functional. Hope this helps you.
Edit:
I’ve turned this into a function, that accepts a source and destination directory, making the destination folder if it doesn’t exist, and moves the files. Also allows for filtering of the src files, for example if you only want to move images, then you use the pattern '*.jpg', by default, it moves everything in the directory
import os, shutil, pathlib, fnmatch
def move_dir(src: str, dst: str, pattern: str = '*'):
    if not os.path.isdir(dst):
        pathlib.Path(dst).mkdir(parents=True, exist_ok=True)
    for f in fnmatch.filter(os.listdir(src), pattern):
        shutil.move(os.path.join(src, f), os.path.join(dst, f))
The accepted answer is not the right one, because the question is not about renaming a file into a file, but moving many files into a directory. shutil.move will do the work, but for this purpose os.rename is useless (as stated on comments) because destination must have an explicit file name.
I am curious to know the pro’s and con’s of this method compared to shutil. Since in my case I am already using subprocess for other reasons and it seems to work I am inclined to stick with it.
Is it system dependent maybe?
Answer 7
This solution calls mv without enabling the shell.
import subprocess
source = 'pathToCurrent/file.foo'
destination = 'pathToNew/file.foo'
p = subprocess.Popen(['mv', source, destination], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
res = p.communicate()[0].decode('utf-8').strip()
if p.returncode:
    print('ERROR: ' + res)
I am getting an ‘access is denied’ error when I attempt to delete a folder that is not empty. I used the following command in my attempt: os.remove("/folder_name").
What is the most effective way of removing/deleting a folder/directory that is not empty?
By design, rmtree fails on folder trees containing read-only files. If you want the folder to be deleted regardless of whether it contains read-only files, then use
# Delete everything reachable from the directory named in 'top',
# assuming there are no symbolic links.
# CAUTION: This is dangerous! For example, if top == '/', it
# could delete all your disk files.
import os
for root, dirs, files in os.walk(top, topdown=False):
    for name in files:
        os.remove(os.path.join(root, name))
    for name in dirs:
        os.rmdir(os.path.join(root, name))
import pathlib
def delete_folder(pth):
    for sub in pth.iterdir():
        if sub.is_dir():
            delete_folder(sub)
        else:
            sub.unlink()
    pth.rmdir()  # if you just want to delete dir content, remove this line
where pth is a pathlib.Path instance. Nice, but may not be the fastest.
This example shows how to remove a directory tree on Windows where
some of the files have their read-only bit set. It uses the onerror
callback to clear the readonly bit and reattempt the remove. Any
subsequent failure will propagate.
import os, stat
import shutil
def remove_readonly(func, path, _):
    "Clear the readonly bit and reattempt the removal"
    os.chmod(path, stat.S_IWRITE)
    func(path)
shutil.rmtree(directory, onerror=remove_readonly)
Answer 5
import errno
import os
import stat
import shutil
def errorRemoveReadonly(func, path, exc):
    excvalue = exc[1]
    if func in (os.rmdir, os.remove) and excvalue.errno == errno.EACCES:
        # change the file to be readable, writable, executable: 0777
        os.chmod(path, stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO)
        # retry
        func(path)
    else:
        # anything else is re-raised unchanged
        raise
shutil.rmtree(path, ignore_errors=False, onerror=errorRemoveReadonly)
If ignore_errors is set, errors are ignored; otherwise, if onerror is set, it is called to handle the error with arguments (func, path, exc_info) where func is os.listdir, os.remove, or os.rmdir; path is the argument to that function that caused it to fail; and exc_info is a tuple returned by sys.exc_info(). If ignore_errors is false and onerror is None, an exception is raised.
Answer 6
Based on kkubasik’s answer: check whether the folder exists before removing it, which is more robust.
import os
import shutil
def remove_folder(path):
    # check if folder exists
    if os.path.exists(path):
        # remove if exists
        shutil.rmtree(path)
    else:
        # throw your exception to handle this special scenario
        # (XXError is a placeholder for your own exception class)
        raise XXError("your exception")
remove_folder("/folder_name")
If you are sure that you want to delete the entire directory tree and are no longer interested in its contents, crawling the whole tree is pointless; just call the native OS command from Python. It will be faster, more efficient and use less memory.
RMDIR c:\blah /s /q
or *nix
rm -rf /home/whatever
In python, the code will look like..
import subprocess
import sys
import os
mswindows = (sys.platform == "win32")
def getstatusoutput(cmd):
    """Return (status, output) of executing cmd in a shell."""
    if not mswindows:
        # Python 3 home of the old Python 2 commands.getstatusoutput
        return subprocess.getstatusoutput(cmd)
    pipe = os.popen(cmd + ' 2>&1', 'r')
    text = pipe.read()
    sts = pipe.close()
    if sts is None: sts = 0
    if text[-1:] == '\n': text = text[:-1]
    return sts, text
def deleteDir(path):
    """deletes the path entirely"""
    if mswindows:
        cmd = "RMDIR "+ path +" /s /q"
    else:
        cmd = "rm -rf "+path
    result = getstatusoutput(cmd)
    if(result[0]!=0):
        raise RuntimeError(result[1])
Answer 8
Just some python 3.5 options to complete the answers above. (I would have loved to find them here).
import os
import shutil
from send2trash import send2trash # (shutil delete permanently)
Delete the folder if it is empty
root = r"C:\Users\Me\Desktop\test"
for dir, subdirs, files in os.walk(root):
    if subdirs == [] and files == []:
        send2trash(dir)
        print(dir, ": folder removed")
Also delete the folder if it contains only this file
    elif subdirs == [] and len(files) == 1:  # no subfolder and only one file
        if files[0] == "desktop.ini":
            send2trash(dir)
            print(dir, ": folder removed")
        else:
            print(dir)
Delete the folder if it contains only .srt or .txt file(s)
    elif subdirs == []:  # the directory contains no subdirectory
        ext = (".srt", ".txt")
        contains_other_ext = False
        for file in files:
            if not file.endswith(ext):
                contains_other_ext = True
        if not contains_other_ext:
            send2trash(dir)
            print(dir, ": dir deleted")
Delete the folder if its size is less than 400 kB:
def get_tree_size(path):
    """Return total size of files in given path and subdirs."""
    total = 0
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            total += get_tree_size(entry.path)
        else:
            total += entry.stat(follow_symlinks=False).st_size
    return total
for dir, subdirs, files in os.walk(root):
    if get_tree_size(dir) < 400000:  # ≈ 400 kB
        send2trash(dir)
        print(dir, "dir deleted")
Answer 9
I’d like to add a “pure pathlib” approach:
from pathlib import Path
from typing import Union
def del_dir(target: Union[Path, str], only_if_empty: bool = False):
    target = Path(target).expanduser()
    assert target.is_dir()
    for p in sorted(target.glob('**/*'), reverse=True):
        if not p.exists():
            continue
        p.chmod(0o666)
        if p.is_dir():
            p.rmdir()
        else:
            if only_if_empty:
                raise RuntimeError(f'{p.parent} is not empty!')
            p.unlink()
    target.rmdir()
This relies on the fact that Path is orderable, and longer paths will always sort after shorter paths, just like str. Therefore, directories will come before files. If we reverse the sort, files will then come before their respective containers, so we can simply unlink/rmdir them one by one with one pass.
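The ordering argument can be checked without touching the file system. This sketch uses PurePosixPath and a few made-up paths to show that a reverse sort lists every entry before the directory that contains it.

```python
from pathlib import PurePosixPath

# Made-up paths: a directory "a" containing a subdirectory "b" and two files.
paths = [PurePosixPath(p) for p in ("a", "a/b", "a/b/c.txt", "a/d.txt")]

ordered = sorted(paths, reverse=True)
names = [str(p) for p in ordered]
print(names)  # ['a/d.txt', 'a/b/c.txt', 'a/b', 'a']
```

Because each entry comes before its parent, a single pass of unlink/rmdir never meets a non-empty directory.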
Benefits:
It’s NOT relying on external binaries: everything uses Python’s batteries-included modules (Python >= 3.6)
It’s fast and memory-efficient: No recursion stack, no need to start a subprocess
It’s cross-platform (at least, that’s what pathlib promises in Python 3.6; no operation above stated to not run on Windows)
If needed, one can do a very granular logging, e.g., log each deletion as it happens.
Answer 10
import os
def deleteDir(dirPath):
    deleteFiles = []
    deleteDirs = []
    for root, dirs, files in os.walk(dirPath):
        for f in files:
            deleteFiles.append(os.path.join(root, f))
        for d in dirs:
            deleteDirs.append(os.path.join(root, d))
    for f in deleteFiles:
        os.remove(f)
    # os.walk lists parents before children, so remove in reverse order:
    # os.rmdir only works on directories that are already empty
    for d in reversed(deleteDirs):
        os.rmdir(d)
    os.rmdir(dirPath)
Answer 11
If you don’t want to use the shutil module you can just use the os module.
from os import listdir, rmdir, remove
from os.path import join
for i in listdir(directoryToRemove):
    remove(join(directoryToRemove, i))
rmdir(directoryToRemove)  # Now the directory is empty of files
Essentially it’s using Python’s subprocess module to run the bash command $ rm -rf '/path/to/your/dir' as if you were using the terminal to accomplish the same task. It’s not fully Python, but it gets it done.
The reason I included the pathlib.Path example is because in my experience it’s very useful when dealing with many paths that change. The extra steps of importing the pathlib.Path module and converting the end results to strings is often a lower cost to me for development time. It would be convenient if Path.rmdir() came with an arg option to explicitly handle non-empty dirs.
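The approach described above might look roughly like the sketch below. It assumes a POSIX system with rm on the PATH, and the helper name rm_rf and the temporary paths are invented for the example.

```python
import subprocess
import tempfile
from pathlib import Path

def rm_rf(path):
    """Run `rm -rf` on path (hypothetical helper; POSIX systems only)."""
    # Passing the arguments as a list avoids shell quoting problems;
    # str() converts a pathlib.Path to the string rm expects.
    subprocess.run(["rm", "-rf", str(path)], check=True)

parent = Path(tempfile.mkdtemp())
target = parent / "dir"
(target / "nested").mkdir(parents=True)
(target / "nested" / "f.txt").write_text("x")

rm_rf(target)
gone = not target.exists()

parent.rmdir()   # clean up the now-empty temporary parent
```

With check=True, a non-zero exit status from rm raises subprocess.CalledProcessError instead of failing silently.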
To delete a folder even if it might not exist (avoiding the race condition in Charles Chow’s answer) but still have errors when other things go wrong (e.g. permission problems, disk read error, the file isn’t a directory)
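One way to get that behaviour, as a sketch: attempt the removal and catch only FileNotFoundError, so there is no gap between an existence check and the delete. The function name remove_tree_if_present is made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path

def remove_tree_if_present(path):
    """Delete path recursively; return False if it was already absent."""
    try:
        shutil.rmtree(path)
    except FileNotFoundError:
        return False   # nothing to delete; every other error still propagates
    return True

folder = Path(tempfile.mkdtemp())
first = remove_tree_if_present(folder)    # folder exists and is removed
second = remove_tree_if_present(folder)   # folder is already gone
```

Permission problems or a path that is not a directory still surface as exceptions, because only FileNotFoundError is swallowed.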
With os.walk I would propose the solution which consists of 3 one-liner Python calls:
python -c "import sys; import os; [os.chmod(os.path.join(rs,d), 0o777) for rs,ds,fs in os.walk(_path_) for d in ds]"
python -c "import sys; import os; [os.chmod(os.path.join(rs,f), 0o777) for rs,ds,fs in os.walk(_path_) for f in fs]"
python -c "import os; import shutil; shutil.rmtree(_path_, ignore_errors=False)"
The first script chmod’s all sub-directories, the second script chmod’s all files. Then the third script removes everything with no impediments.
I have tested this from the “Shell Script” in a Jenkins job (I did not want to store a new Python script into SCM, that’s why searched for a one-line solution) and it worked for Linux and Windows.
I am writing a Python script in Windows. I want to do something based on the file size. For example, if the size is greater than 0, I will send an email to somebody, otherwise continue to other things.
The other answers work for real files, but if you need something that works for “file-like objects”, try this:
# f is a file-like object.
f.seek(0, os.SEEK_END)
size = f.tell()
It works for real files and StringIO’s, in my limited testing. (Python 2.7.3.) The “file-like object” API isn’t really a rigorous interface, of course, but the API documentation suggests that file-like objects should support seek() and tell().
Edit
Another difference between this and os.stat() is that you can stat() a file even if you don’t have permission to read it. Obviously the seek/tell approach won’t work unless you have read permission.
Edit 2
At Jonathon’s suggestion, here’s a paranoid version. (The version above leaves the file pointer at the end of the file, so if you were to try to read from the file, you’d get zero bytes back!)
# f is a file-like object.
old_file_position = f.tell()
f.seek(0, os.SEEK_END)
size = f.tell()
f.seek(old_file_position, os.SEEK_SET)
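The paranoid version can be wrapped in a small function. The sketch below is my own packaging of the snippet above (the name `stream_size` is made up), and it assumes the object is seekable; for files opened in text mode, `tell()` returns an opaque number, so binary mode is the safer choice when you need a byte count.

```python
import io
import os


def stream_size(f):
    """Return the size of a seekable file-like object, preserving its position."""
    old_position = f.tell()
    try:
        f.seek(0, os.SEEK_END)
        return f.tell()
    finally:
        # Restore the position even if seek/tell raised.
        f.seek(old_position, os.SEEK_SET)


# Usage with an in-memory file-like object.
buffer = io.BytesIO(b"hello world")
buffer.read(5)                 # move the position away from the start
size = stream_size(buffer)     # the position is back at 5 afterwards
```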
Answer 3
import os
def convert_bytes(num):
    """
    this function will convert bytes to MB.... GB... etc
    """
    for x in ['bytes', 'KB', 'MB', 'GB', 'TB']:
        if num < 1024.0:
            return "%3.1f %s" % (num, x)
        num /= 1024.0
def file_size(file_path):
    """
    this function will return the file size
    """
    if os.path.isfile(file_path):
        file_info = os.stat(file_path)
        return convert_bytes(file_info.st_size)
# Lets check the file size of MS Paint exe
# or you can use any file path
file_path = r"C:\Windows\System32\mspaint.exe"
print(file_size(file_path))
There is a bit-shift trick I use when I want to convert from bytes to another unit. A right shift by 10 divides by 1024 (discarding the remainder), so each shift by 10 moves the value up one unit: bytes to KB, KB to MB, and so on.
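A minimal sketch of the shift trick, assuming whole units are enough (the shift drops the remainder). The helper name `to_unit` and the unit table are illustrative, not taken from the answer.

```python
def to_unit(num_bytes, unit):
    """Convert a byte count to whole KB/MB/GB (powers of 1024) by right-shifting."""
    shifts = {"KB": 10, "MB": 20, "GB": 30}
    return num_bytes >> shifts[unit]


size_in_bytes = 5 * 1024 * 1024 + 512   # 5 MB plus 512 bytes
kb = to_unit(size_in_bytes, "KB")       # 5120 (the extra 512 bytes are dropped)
mb = to_unit(size_in_bytes, "MB")       # 5
```

If fractional values are needed, divide by `1024.0 ** n` instead of shifting.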
Strictly sticking to the question, the Python code (+ pseudo-code) would be:
import os
file_path = r"<path to your file>"
if os.stat(file_path).st_size > 0:
    <send an email to somebody>
else:
    <continue to other things>
Answer 7
# Get the file size, print it, process it...
# os.stat provides the file size in the st_size attribute.
# The file size is given in bytes.
import os
fsize = os.stat('filepath')
print('size:' + str(fsize.st_size))
# check if the file size is less than 10 MB
if fsize.st_size < 10000000:
    process it ....
We have two options; both require importing the os module.
1)
import os
os.stat() returns an object with many fields, including the file’s creation and last-modified times; among them, the st_size attribute gives the exact size of the file in bytes.
os.stat("filename").st_size
2)
import os
In this, we have to provide the exact file path(absolute path), not a relative path.
with open('your_file.txt', 'w') as f:
    for item in my_list:
        f.write("%s\n" % item)
In Python 2, you can also use
with open('your_file.txt', 'w') as f:
    for item in my_list:
        print >> f, item
If you’re keen on a single function call, at least remove the square brackets [], so that the strings to be printed get made one at a time (a genexp rather than a listcomp) — no reason to take up all the memory required to materialize the whole list of strings.
I thought it would be interesting to explore the benefits of using a genexp, so here’s my take.
The example in the question uses square brackets to create a temporary list, and so is equivalent to:
file.writelines( list( "%s\n" % item for item in list ) )
This needlessly constructs a temporary list of all the lines that will be written out, which may consume significant amounts of memory depending on the size of your list and how verbose the output of str(item) is.
Dropping the square brackets (equivalent to removing the wrapping list() call above) will instead pass a temporary generator to file.writelines():
file.writelines( "%s\n" % item for item in list )
This generator will create newline-terminated representation of your item objects on-demand (i.e. as they are written out). This is nice for a couple of reasons:
Memory overheads are small, even for very large lists
If str(item) is slow there’s visible progress in the file as each item is processed
This avoids memory issues, such as:
In [1]: import os
In [2]: f = file(os.devnull, "w")
In [3]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
1 loops, best of 3: 385 ms per loop
In [4]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
ERROR: Internal Python error in the inspect module.
Below is the traceback from this internal error.
Traceback (most recent call last):
...
MemoryError
(I triggered this error by limiting Python’s max. virtual memory to ~100MB with ulimit -v 102400).
Putting memory usage to one side, this method isn’t actually any faster than the original:
In [4]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
1 loops, best of 3: 370 ms per loop
In [5]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
1 loops, best of 3: 360 ms per loop
(Python 2.6.2 on Linux)
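The sessions above are Python 2 (`file`, `xrange`). A rough Python 3 equivalent of the generator form might look like the sketch below; the temporary file path and the `items` range are placeholders for illustration.

```python
import os
import tempfile

items = range(1000)  # stand-in for the list being written

out_path = os.path.join(tempfile.mkdtemp(), "items.txt")
with open(out_path, "w") as f:
    # No square brackets: each line is produced just before it is written,
    # so the full list of strings never exists in memory.
    f.writelines("%s\n" % item for item in items)

# Read the file back lazily to count the lines that were written.
with open(out_path) as f:
    line_count = sum(1 for _ in f)
```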
Answer 6
Because I’m lazy…
import json
a = [1,2,3]
with open('test.txt', 'w') as f:
    f.write(json.dumps(a))
#Now read the file back into a Python list object
with open('test.txt', 'r') as f:
    a = json.loads(f.read())
Serialize a list into a text file as comma-separated values:
mylist = dir()
with open('filename.txt','w') as f:
    f.write( ','.join( mylist ) )
Answer 8
In general
The following is the syntax of the writelines() method:
fileObject.writelines( sequence )
Example
#!/usr/bin/python
# Open a file for appending ("rw+" is not a valid mode in Python 3)
fo = open("foo.txt", "a")
seq = ["This is 6th line\n", "This is 7th line"]
# Write sequence of lines at the end of the file.
fo.writelines( seq )
# Close the opened file
fo.close()
This logic will first convert the items in list to string(str). Sometimes the list contains a tuple like
alist = [(i12,tiger),
(113,lion)]
This logic writes each tuple to the file on its own line. Later, when reading the file, we can use eval on each line to load the tuple again. The code that writes the file:
outfile = open('outfile.txt', 'w') # open a file in write mode
for item in list_to_persistence: # iterate over the list items
    outfile.write(str(item) + '\n') # write to the file
outfile.close() # close the file
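The reading side is not shown in the answer. One way it could look is sketched below, with `ast.literal_eval` swapped in for `eval` (it only accepts Python literals, so a stray line in the file cannot run code). The sample tuples and the temporary path are assumptions made for the example.

```python
import ast
import os
import tempfile

list_to_persist = [(112, "tiger"), (113, "lion")]   # sample data, assumed
path = os.path.join(tempfile.mkdtemp(), "outfile.txt")

# Write one str(item) per line, as in the answer above.
with open(path, "w") as outfile:
    for item in list_to_persist:
        outfile.write(str(item) + "\n")

# Read each line back and parse it into a tuple again.
with open(path) as infile:
    restored = [ast.literal_eval(line.strip()) for line in infile]
```

This round-trips only items whose `str()` output is a valid Python literal (numbers, strings, tuples, lists, dicts and so on).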
Answer 14
Another way to iterate and add a newline:
for item in items:
    filewriter.write(f"{item}" + "\n")
In [29]: a = n.array((avg))
In [31]: a.tofile('avgpoints.dat', sep='\n', format='%f')
You can use %e or %s depending on your requirement.
Answer 18
poem = '''\
Programming is fun
When the work is done
if you wanna make your work also fun:
use Python!
'''
f = open('poem.txt', 'w') # open for 'w'riting
f.write(poem) # write text to file
f.close() # close the file
How It Works:
First, open a file by using the built-in open function and specifying the name of
the file and the mode in which we want to open the file. The mode can be a
read mode (’r’), write mode (’w’) or append mode (’a’). We can also specify
whether we are reading, writing, or appending in text mode (’t’) or binary
mode (’b’). There are actually many more modes available and help(open)
will give you more details about them. By default, open() considers the file to
be a ’t’ext file and opens it in ’r’ead mode.
In our example, we first open the file in write text mode and use the write
method of the file object to write to the file and then we finally close the file.
The above example is from the book “A Byte of Python” by Swaroop C H (swaroopch.com).
shutil has many methods you can use. One of which is:
from shutil import copyfile
copyfile(src, dst)
Copy the contents of the file named src to a file named dst.
The destination location must be writable; otherwise, an IOError exception will be raised.
If dst already exists, it will be replaced.
Special files such as character or block devices and pipes cannot be copied with this function.
With copy, src and dst are path names given as strings.
If you use os.path operations, use copy rather than copyfile. copyfile will only accept strings.
Answer 1
┌──────────────────┬────────┬───────────┬───────┬────────────────┐
│ Function │ Copies │ Copies │Can use│ Destination │
│ │metadata│permissions│buffer │may be directory│
├──────────────────┼────────┼───────────┼───────┼────────────────┤
│shutil.copy │ No │ Yes │ No │ Yes │
│shutil.copyfile │ No │ No │ No │ No │
│shutil.copy2 │ Yes │ Yes │ No │ Yes │
│shutil.copyfileobj│ No │ No │ Yes │ No │
└──────────────────┴────────┴───────────┴───────┴────────────────┘
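To make the table concrete, here is a small sketch that calls each of the four functions once. The file names and the temporary directory are invented for the example; the old timestamp is set only so that the difference between `copy` and `copy2` is visible.

```python
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, "source.txt")
with open(src, "w") as f:
    f.write("some data\n")
os.utime(src, (1000000000, 1000000000))  # give the source an old timestamp

dest_dir = os.path.join(work_dir, "dest")
os.mkdir(dest_dir)

# copy: destination may be a directory; permissions copied, timestamps not.
copied = shutil.copy(src, dest_dir)

# copy2: like copy, but also tries to preserve metadata such as mtime.
copied_with_meta = shutil.copy2(src, os.path.join(work_dir, "copy2.txt"))

# copyfile: destination must be a full file name; contents only.
shutil.copyfile(src, os.path.join(work_dir, "copyfile.txt"))

# copyfileobj: works on already-open file objects, with an optional buffer size.
with open(src, "rb") as fsrc, open(os.path.join(work_dir, "fileobj.txt"), "wb") as fdst:
    shutil.copyfileobj(fsrc, fdst, 64 * 1024)
```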
os.popen(cmd[, mode[, bufsize]])
# example
# In Unix/Linux
os.popen('cp source.txt destination.txt')
# In Windows
os.popen('copy source.txt destination.txt')
subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
# example (WARNING: setting `shell=True` might be a security-risk)
# In Linux/Unix
status = subprocess.call('cp source.txt destination.txt', shell=True)
# In Windows
status = subprocess.call('copy source.txt destination.txt', shell=True)
subprocess.check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)
# example (WARNING: setting `shell=True` might be a security-risk)
# In Linux/Unix
status = subprocess.check_output('cp source.txt destination.txt', shell=True)
# In Windows
status = subprocess.check_output('copy source.txt destination.txt', shell=True)
Copying a file is a relatively straightforward operation as shown by the examples below, but you should instead use the shutil stdlib module for that.
def copyfileobj_example(source, dest, buffer_size=1024*1024):
    """
    Copy a file from source to dest. source and dest
    must be file-like objects, i.e. any object with a read or
    write method, like for example StringIO.
    """
    while True:
        copy_buffer = source.read(buffer_size)
        if not copy_buffer:
            break
        dest.write(copy_buffer)
If you want to copy by filename you could do something like this:
def copyfile_example(source, dest):
    # Beware, this example does not handle any edge cases!
    with open(source, 'rb') as src, open(dest, 'wb') as dst:
        copyfileobj_example(src, dst)
Copy the contents of the file named src to a file named dst. The destination location must be writable; otherwise, an IOError exception will be raised. If dst already exists, it will be replaced. Special files such as character or block devices and pipes cannot be copied with this function. src and dst are path names given as strings.
Take a look at filesys for all the file and directory handling functions available in standard Python modules.
shutil.copyfileobj(fsrc, fdst[, length]) manipulate opened objects
In [3]: src = '~/Documents/Head+First+SQL.pdf'
In [4]: dst = '~/desktop'
In [5]: shutil.copyfileobj(src, dst)
AttributeError: 'str' object has no attribute 'read'
#copy the file object
In [7]: with open(src, 'rb') as f1, open(os.path.join(dst, 'test.pdf'), 'wb') as f2:
   ...:     shutil.copyfileobj(f1, f2)
In [8]: os.stat(os.path.join(dst,'test.pdf'))
Out[8]: os.stat_result(st_mode=33188, st_ino=8598319475, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067347, st_mtime=1516067335, st_ctime=1516067345)
shutil.copyfile(src, dst, *, follow_symlinks=True) Copy and rename
In [9]: shutil.copyfile(src, dst)
IsADirectoryError: [Errno 21] Is a directory: '~/desktop'
#so dst should be a filename instead of a directory name
For small files and using only python built-ins, you can use the following one-liner:
with open(source, 'rb') as src, open(dest, 'wb') as dst: dst.write(src.read())
As @maxschlepzig mentioned in the comments below, this is not an optimal approach when the file is very large or when memory is critical, so Swati’s answer should be preferred.
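For large files, what I did was read the file line by line, collecting the lines in a list. Once the list reached a certain size, I appended it to the new file.
buffered_lines = []
# "copy.txt" is a placeholder name for the output file
with open("file.txt", "r") as source, open("copy.txt", "a") as output:
    for line in source:
        buffered_lines.append(line)
        if len(buffered_lines) == 1000000:
            output.writelines(buffered_lines)
            del buffered_lines[:]
    output.writelines(buffered_lines)  # write whatever is left over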
Answer 12
from subprocess import call
call("cp -p <file> <file>", shell=True)
How do I read every line of a file in Python and store each line as an element in a list?
I want to read the file line by line and append each line to the end of the list.
Answer 0
with open(filename) as f:
    content = f.readlines()
# you may also want to remove whitespace characters like `\n` at the end of each line
content = [x.strip() for x in content]
This will yield an “array” of lines from the file.
lines = tuple(open(filename, 'r'))
open returns a file which can be iterated over. When you iterate over a file, you get the lines from that file. tuple can take an iterator and instantiate a tuple instance for you from the iterator that you give it. lines is a tuple created from the lines of the file.
Answer 4
If you want the \n included:
with open(fname) as f:
    content = f.readlines()
If you do not want the \n included:
with open(fname) as f:
    content = f.read().splitlines()
You could simply do the following, as has been suggested:
with open('/your/path/file') as f:
    my_lines = f.readlines()
Note that this approach has 2 downsides:
1) You store all the lines in memory. In the general case, this is a very bad idea. The file could be very large, and you could run out of memory. Even if it’s not large, it is simply a waste of memory.
2) This does not allow processing of each line as you read them. So if you process your lines after this, it is not efficient (requires two passes rather than one).
A better approach for the general case would be the following:
with open('/your/path/file') as f:
    for line in f:
        process(line)
Where you define your process function any way you want. For example:
def process(line):
    if 'save the world' in line.lower():
        superman.save_the_world()
(The implementation of the Superman class is left as an exercise for you).
This will work nicely for any file size and you go through your file in just 1 pass. This is typically how generic parsers will work.
Answer 7
Data into list
Assume that we have a text file with our data like in the following lines,
Text file content:
line 1
line 2
line 3
Open the cmd in the same directory (right-click the mouse and choose cmd or PowerShell)
Run python and in the interpreter write:
The Python script:
>>> with open("myfile.txt", encoding="utf-8") as file:
...     x = [l.strip() for l in file]
>>> x
['line 1','line 2','line 3']
Using append:
x = []
with open("myfile.txt") as file:
    for l in file:
        x.append(l.strip())
Or:
>>> x = open("myfile.txt").read().splitlines()
>>> x
['line 1', 'line 2', 'line 3']
Or:
>>> x = open("myfile.txt").readlines()
>>> x
['line 1\n', 'line 2\n', 'line 3\n']
Or:
>>> y = [x.rstrip() for x in open("myfile.txt")]
>>> y
['line 1','line 2','line 3']
with open('testodiprova.txt', 'r', encoding='utf-8') as file:
    lines = file.read().splitlines()
    print(lines)
with open('testodiprova.txt', 'r', encoding='utf-8') as file:
    lines = file.readlines()
    print(lines)
To read a file into a list you need to do three things:
Open the file
Read the file
Store the contents as list
Fortunately Python makes it very easy to do these things so the shortest way to read a file into a list is:
lst = list(open(filename))
However I’ll add some more explanation.
Opening the file
I assume that you want to open a specific file and you don’t deal directly with a file-handle (or a file-like-handle). The most commonly used function to open a file in Python is open, it takes one mandatory argument and two optional ones in Python 2.7:
Filename
Mode
Buffering (I’ll ignore this argument in this answer)
The filename should be a string that represents the path to the file. For example:
open('afile') # opens the file named afile in the current working directory
open('adir/afile') # relative path (relative to the current working directory)
open('C:/users/aname/afile') # absolute path (windows)
open('/usr/local/afile') # absolute path (linux)
Note that the file extension needs to be specified. This is especially important for Windows users because file extensions like .txt or .doc, etc. are hidden by default when viewed in the explorer.
The second argument is the mode, it’s r by default which means “read-only”. That’s exactly what you need in your case.
For reading a file you can omit the mode or pass it in explicitly:
open(filename)
open(filename, 'r')
Both will open the file in read-only mode. In case you want to read in a binary file on Windows you need to use the mode rb:
open(filename, 'rb')
On other platforms, Python 2 simply ignores the 'b' (binary mode). In Python 3, 'rb' makes reads return bytes on every platform.
Now that I’ve shown how to open the file, let’s talk about the fact that you always need to close it again. Otherwise it will keep an open file-handle to the file until the process exits (or Python garbage-collects the file-handle).
While you could use:
f = open(filename)
# ... do stuff with f
f.close()
That will fail to close the file when something between open and close throws an exception. You could avoid that by using a try and finally:
f = open(filename)
# nothing in between!
try:
    # do stuff with f
finally:
    f.close()
However Python provides context managers that have a prettier syntax (but for open it’s almost identical to the try and finally above):
with open(filename) as f:
    # do stuff with f
# The file is always closed after the with-scope ends.
The last approach is the recommended approach to open a file in Python!
Reading the file
Okay, you’ve opened the file, now how to read it?
The open function returns a file object that supports Python’s iteration protocol. Each iteration will give you a line:
with open(filename) as f:
    for line in f:
        print(line)
This will print each line of the file. Note however that each line will contain a newline character \n at the end (you might want to check if your Python is built with universal newlines support – otherwise you could also have \r\n on Windows or \r on Mac as newlines). If you don’t want that you could simply remove the last character (or the last two characters on Windows):
with open(filename) as f:
    for line in f:
        print(line[:-1])
But the last line doesn’t necessarily have a trailing newline, so one shouldn’t use that. One could check if it ends with a trailing newline and if so remove it:
with open(filename) as f:
    for line in f:
        if line.endswith('\n'):
            line = line[:-1]
        print(line)
But you could simply remove all whitespace (including the \n character) from the end of the string; this also removes any other trailing whitespace, so be careful if that is significant:
with open(filename) as f:
    for line in f:
        print(line.rstrip())
However if the lines end with \r\n (Windows “newlines”) that .rstrip() will also take care of the \r!
Store the contents as list
Now that you know how to open the file and read it, it’s time to store the contents in a list. The simplest option would be to use the list function:
with open(filename) as f:
    lst = list(f)
In case you want to strip the trailing newlines you could use a list comprehension instead:
with open(filename) as f:
    lst = [line.rstrip() for line in f]
Or even simpler: The .readlines() method of the file object by default returns a list of the lines:
with open(filename) as f:
    lst = f.readlines()
This will also include the trailing newline characters, if you don’t want them I would recommend the [line.rstrip() for line in f] approach because it avoids keeping two lists containing all the lines in memory.
There’s an additional option to get the desired output, however it’s rather “suboptimal”: read the complete file in a string and then split on newlines:
with open(filename) as f:
    lst = f.read().split('\n')
or:
with open(filename) as f:
    lst = f.read().splitlines()
These take care of the trailing newlines automatically because the split character isn’t included. However they are not ideal because you keep the file as string and as a list of lines in memory!
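The two variants are not quite interchangeable. The sketch below uses a made-up string in place of file contents to show where they differ: a final newline and a stray \r.

```python
# Sample text standing in for file contents; note the \r\n in the middle
# and the newline at the very end.
text = "line 1\nline 2\r\nline 3\n"

by_split = text.split("\n")
# ['line 1', 'line 2\r', 'line 3', '']  -> keeps the \r, adds an empty last item

by_splitlines = text.splitlines()
# ['line 1', 'line 2', 'line 3']        -> handles \r\n, no empty last item
```

When the text comes from a file opened in text mode with universal newlines, the \r\n has usually been translated to \n already, so only the trailing empty string remains as a difference.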
Summary
Use with open(...) as f when opening files because you don’t need to take care of closing the file yourself and it closes the file even if some exception happens.
file objects support the iteration protocol so reading a file line-by-line is as simple as for line in the_file_object:.
Always browse the documentation for the available functions/classes. Most of the time there’s a perfect match for the task or at least one or two good ones. The obvious choice in this case would be readlines() but if you want to process the lines before storing them in the list I would recommend a simple list-comprehension.
Clean and Pythonic Way of Reading the Lines of a File Into a List
First and foremost, you should focus on opening your file and reading its contents in an efficient and pythonic way. Here is an example of the way I personally DO NOT prefer:
infile = open('my_file.txt', 'r') # Open the file for reading.
data = infile.read() # Read the contents of the file.
infile.close() # Close the file since we're done using it.
Instead, I prefer the below method of opening files for both reading and writing as it
is very clean, and does not require an extra step of closing the file
once you are done using it. In the statement below, we’re opening the file
for reading, and assigning it to the variable ‘infile.’ Once the code within
this statement has finished running, the file will be automatically closed.
# Open the file for reading.
with open('my_file.txt', 'r') as infile:
    data = infile.read() # Read the contents of the file into memory.
Now we need to focus on bringing this data into a Python List because they are iterable, efficient, and flexible. In your case, the desired goal is to bring each line of the text file into a separate element. To accomplish this, we will use the splitlines() method as follows:
# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()
The Final Product:
# Open the file for reading.
with open('my_file.txt', 'r') as infile:
    data = infile.read() # Read the contents of the file into memory.
# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()
Testing Our Code:
Contents of the text file:
A fost odatã ca-n povesti,
A fost ca niciodatã,
Din rude mãri împãrãtesti,
O prea frumoasã fatã.
Print statements for testing purposes:
print my_list # Print the list.
# Print each line in the list.
for line in my_list:
    print line
# Print the fourth element in this list.
print my_list[3]
Output (different-looking because of unicode characters):
['A fost odat\xc3\xa3 ca-n povesti,', 'A fost ca niciodat\xc3\xa3,',
'Din rude m\xc3\xa3ri \xc3\xaemp\xc3\xa3r\xc3\xa3testi,', 'O prea
frumoas\xc3\xa3 fat\xc3\xa3.']
A fost odatã ca-n povesti, A fost ca niciodatã, Din rude mãri
împãrãtesti, O prea frumoasã fatã.
O prea frumoasã fatã.
Introduced in Python 3.4, pathlib has a really convenient method for reading in text from files, as follows:
from pathlib import Path
p = Path('my_text_file')
lines = p.read_text().splitlines()
(The splitlines call is what turns it from a string containing the whole contents of the file to a list of lines in the file).
pathlib has a lot of handy conveniences in it. read_text is nice and concise, and you don’t have to worry about opening and closing the file. If all you need to do with the file is read it all in one go, it’s a good choice.
Answer 11
Here is another option, using a list comprehension over the file.
lines = [line.rstrip() for line in open('file.txt')]
Read and write text files with Python 2 and Python 3; it works with Unicode
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Define data
lines = [' A first string ',
'A Unicode sample: €',
'German: äöüß']
# Write text file
with open('file.txt', 'w') as fp:
    fp.write('\n'.join(lines))
# Read text file
with open('file.txt', 'r') as fp:
    read_lines = fp.readlines()
    read_lines = [line.rstrip('\n') for line in read_lines]
print(lines == read_lines)
Things to notice:
with is a so-called context manager. It makes sure that the opened file is closed again.
All solutions here which simply make .strip() or .rstrip() will fail to reproduce the lines as they also strip the white space.
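The second point is easy to check on a single line. A small sketch, using a sample string with leading and trailing spaces similar to the first entry of `lines` above:

```python
line = "  A first string  \n"      # a line as readlines() would return it

stripped = line.strip()            # drops whitespace on both sides
right_stripped = line.rstrip()     # drops trailing spaces and the newline
newline_only = line.rstrip("\n")   # drops only the newline
```

Only the last form gives back the string exactly as it was before the newline was added.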
However, this is a quite inefficient way, as it stores 2 versions of the content in memory (probably not a big issue for small files, but still). [Thanks Mark Amery].
There are 2 easier ways:
Using the file as an iterator
lines = list(open('C:/path/file.txt'))
# ... or if you want to have a list without EOL characters
lines = [l.rstrip() for l in open('C:/path/file.txt')]
If you are using Python 3.4 or above, better use pathlib to create a path for your file that you could use for other operations in your program:
from pathlib import Path
file_path = Path("C:/path/file.txt")
lines = file_path.read_text().splitlines()
# ... or ...
lines = [l.rstrip() for l in file_path.open()]
Answer 17
Just use the splitlines() function. Here is an example.
inp = "file.txt"
data = open(inp)
dat = data.read()
lst = dat.splitlines()
print(lst)  # also works in Python 2
If you are faced with a very large file and want to read it faster (imagine you are in a Topcoder/Hackerrank coding competition), you might read a considerably bigger chunk of lines into a memory buffer at one time, rather than just iterating line by line at file level.
buffersize = 2**16
with open(path) as f:
    while True:
        lines_buffer = f.readlines(buffersize)
        if not lines_buffer:
            break
        for line in lines_buffer:
            process(line)
The easiest ways to do that with some additional benefits are:
lines = list(open('filename'))
or
lines = tuple(open('filename'))
or
lines = set(open('filename'))
In the case of set, remember that the line order is not preserved and duplicate lines are dropped.
Below I added an important supplement from @MarkAmery:
Since you’re not calling .close on the file object nor using a with statement, in some Python implementations the file may not get closed after reading and your process will leak an open file handle.
In CPython (the normal Python implementation that most people use), this isn’t a problem since the file object will get immediately garbage-collected and this will close the file, but it’s nonetheless generally considered best practice to do something like:
with open('filename') as f: lines = list(f)
to ensure that the file gets closed regardless of what Python implementation you’re using.
Answer 20
Use this:
import pandas as pd
data = pd.read_csv(filename)  # You can also add parameters such as header, sep, etc.
array = data.values
With a filename, handling the file from a Path(filename) object, or directly with open(filename) as f, do one of the following:
list(fileinput.input(filename))
using with path.open() as f, call f.readlines()
list(f)
path.read_text().splitlines()
path.read_text().splitlines(keepends=True)
iterate over fileinput.input or f and list.append each line one at a time
pass f to a bound list.extend method
use f in a list comprehension
I explain the use-case for each below.
In Python, how do I read a file line-by-line?
This is an excellent question. First, let’s create some sample data:
from pathlib import Path
Path('filename').write_text('foo\nbar\nbaz')
File objects are lazy iterators, so just iterate over them.
filename = 'filename'
with open(filename) as f:
    for line in f:
        line # do something with the line
Alternatively, if you have multiple files, use fileinput.input, another lazy iterator. With just one file:
import fileinput
for line in fileinput.input(filename):
    line # process the line
or for multiple files, pass it a list of filenames:
for line in fileinput.input([filename]*2):
    line # process the line
Again, f and fileinput.input above both are/return lazy iterators.
You can only use an iterator one time, so to provide functional code while avoiding verbosity I’ll use the slightly more terse fileinput.input(filename) where apropos from here.
In Python, how do I read a file line-by-line into a list?
Ah but you want it in a list for some reason? I’d avoid that if possible. But if you insist… just pass the result of fileinput.input(filename) to list:
list(fileinput.input(filename))
Another direct answer is to call f.readlines, which returns the contents of the file (up to an optional hint number of characters, so you could break this up into multiple lists that way).
You can get to this file object two ways. One way is to pass the filename to the open builtin:
filename = 'filename'
with open(filename) as f:
    f.readlines()
or using the new Path object from the pathlib module (which I have become quite fond of, and will use from here on):
from pathlib import Path
path = Path(filename)
with path.open() as f:
    f.readlines()
list will also consume the file iterator and return a list – a quite direct method as well:
with path.open() as f:
    list(f)
If you don’t mind reading the entire text into memory as a single string before splitting it, you can do this as a one-liner with the Path object and the splitlines() string method. By default, splitlines removes the newlines:
path.read_text().splitlines()
If you want to keep the newlines, pass keepends=True:
path.read_text().splitlines(keepends=True)
I want to read the file line by line and append each line to the end of the list.
Now this is a bit silly to ask for, given that we’ve demonstrated the end result easily with several methods. But you might need to filter or operate on the lines as you make your list, so let’s humor this request.
Using list.append would allow you to filter or operate on each line before you append it:
line_list = []
for line in fileinput.input(filename):
    line_list.append(line)
line_list
Using list.extend would be a bit more direct, and perhaps useful if you have a preexisting list:
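A minimal sketch of the list.extend approach, assuming a pre-existing list called line_list (the name follows the append example) with illustrative starting content. The sample file is recreated in a temporary directory so the snippet runs on its own:

```python
import fileinput
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Recreate the sample data in a scratch location (path is illustrative).
    sample = Path(tmp) / 'filename'
    sample.write_text('foo\nbar\nbaz')

    line_list = ['header\n']  # a pre-existing list (hypothetical content)
    # extend consumes the lazy iterator and appends every line in place
    with fileinput.input(str(sample)) as lines:
        line_list.extend(lines)
```

Unlike list(...), extend keeps whatever was already in the list and adds the file's lines after it.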
Or more idiomatically, we could instead use a list comprehension, and map and filter inside it if desirable:
[line for line in fileinput.input(filename)]
Or even more directly, to close the circle, just pass it to list to create a new list directly without operating on the lines:
list(fileinput.input(filename))
Conclusion
You’ve seen many ways to get lines from a file into a list, but I’d recommend you avoid materializing large quantities of data into a list and instead use Python’s lazy iteration to process the data if possible.
That is, prefer fileinput.input or with path.open() as f.
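Answer 22
If the document also contains empty lines, I like to read in the content and pass it through filter to prevent empty string elements:
with open(myFile, "r") as f:
    excludeFileContent = list(filter(None, f.read().splitlines()))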
I would try one of the below mentioned methods. The example file that I use has the name dummy.txt. You can find the file here. I presume, that the file is in the same directory as the code (you can change fpath to include the proper file name and folder path.)
In both the below mentioned examples, the list that you want is given by lst.
1.> First method:
fpath = 'dummy.txt'
with open(fpath, "r") as f: lst = [line.rstrip('\n \t') for line in f]
print(lst)
>>>['THIS IS LINE1.', 'THIS IS LINE2.', 'THIS IS LINE3.', 'THIS IS LINE4.']
2.> In the second method, one can use the csv.reader function from the Python Standard Library:
import csv
fpath = 'dummy.txt'
with open(fpath) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=' ')
    # each row is a list of space-separated fields, so join them back into the full line
    lst = [' '.join(row) for row in csv_reader]
print(lst)
>>>['THIS IS LINE1.', 'THIS IS LINE2.', 'THIS IS LINE3.', 'THIS IS LINE4.']
You can use either of the two methods. Time taken for the creation of lst is almost equal in the two methods.
Answer 26
Here is a Python(3) helper library class that I use to simplify file I/O:
import os
# handle files using a callback method, prevents repetition
# (named _FileIO__file_handler so the name-mangled __file_handler calls inside the class resolve to it)
def _FileIO__file_handler(file_path, mode, callback = lambda f: None):
    f = open(file_path, mode)
    try:
        return callback(f)
    except Exception as e:
        raise IOError("Failed to %s file" % ["write to", "read from"][mode.lower() in "r rb r+".split(" ")])
    finally:
        f.close()
class FileIO:
    # return the contents of a file
    def read(file_path, mode = "r"):
        return __file_handler(file_path, mode, lambda rf: rf.read())
    # get the lines of a file
    def lines(file_path, mode = "r", filter_fn = lambda line: len(line) > 0):
        return [line for line in FileIO.read(file_path, mode).strip().split("\n") if filter_fn(line)]
    # create or update a file (NOTE: can also be used to replace a file's original content)
    def write(file_path, new_content, mode = "w"):
        return __file_handler(file_path, mode, lambda wf: wf.write(new_content))
    # delete a file (if it exists)
    def delete(file_path):
        return os.remove(file_path) if os.path.isfile(file_path) else None
You would then use the FileIO.lines function, like this:
file_ext_lines = FileIO.lines("./path/to/file.ext")
for i, line in enumerate(file_ext_lines):
    print("Line {}: {}".format(i + 1, line))
Remember that the mode ("r" by default) and filter_fn (checks for empty lines by default) parameters are optional.
You could even remove the read, write and delete methods and just leave the FileIO.lines, or even turn it into a separate method called read_lines.
You need to open the file in append mode, by setting “a” or “ab” as the mode. See open().
When you open with “a” mode, the write position will always be at the end of the file (an append). You can open with “a+” to allow reading, seek backwards and read (but all writes will still be at the end of the file!).
Example:
>>> with open('test1', 'wb') as f:
...     f.write(b'test')
...
4
>>> with open('test1', 'ab') as f:
...     f.write(b'koko')
...
4
>>> with open('test1', 'rb') as f:
...     f.read()
...
b'testkoko'
Note: Using ‘a’ is not the same as opening with ‘w’ and seeking to the end of the file – consider what might happen if another program opened the file and started writing between the seek and the write. On some operating systems, opening the file with ‘a’ guarantees that all your following writes will be appended atomically to the end of the file (even as the file grows by other writes).
A few more details about how the “a” mode operates (tested on Linux only). Even if you seek back, every write will append to the end of the file:
>>> f = open('test','a+') # Not using 'with' just to simplify the example REPL session
>>> f.write('hi')
>>> f.seek(0)
>>> f.read()
'hi'
>>> f.seek(0)
>>> f.write('bye') # Will still append despite the seek(0)!
>>> f.seek(0)
>>> f.read()
'hibye'
Opening a file in append mode (a as the first character of mode)
causes all subsequent write operations to this stream to occur at
end-of-file, as if preceded by the call:
fseek(stream, 0, SEEK_END);
Old simplified answer (not using with):
Example: (in a real program use with to close the file – see the documentation)
with open("test.txt", "a") as myfile:
    myfile.write("append me")
We declared the variable myfile to open a file named test.txt. open takes two arguments: the file that we want to open and a string that represents the kind of permission or operation we want on the file.
Here are the file mode options:
Mode  Description
'r'   This is the default mode. It opens the file for reading.
'w'   Opens the file for writing.
      If the file does not exist, it creates a new file.
      If the file exists, it truncates the file.
'x'   Creates a new file. If the file already exists, the operation fails.
'a'   Opens the file in append mode.
      If the file does not exist, it creates a new file.
't'   This is the default mode. It opens in text mode.
'b'   Opens in binary mode.
'+'   Opens a file for reading and writing (updating).
The 'a' parameter signifies append mode. If you don’t want to use with open each time, you can easily write a function to do it for you:
def append(file, txt='\nFunction Successfully Executed'):
    # parameters with defaults must come after those without, so file goes first
    with open(file, 'a') as f:
        f.write(txt)
If you want to write somewhere other than the end, you can use 'r+' and seek to the position you need (here, the end of the file):
import os
with open(file, 'r+') as f:
    f.seek(0, os.SEEK_END)
    f.write("text to add")
Finally, the 'w+' parameter grants even more freedom. Specifically, it allows you to create the file if it doesn’t exist, as well as empty the contents of a file that currently exists.
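A short sketch of the 'w+' behaviour described here, using a hypothetical file name (demo.txt) inside a temporary directory so that nothing outside it is touched:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')  # hypothetical file name

    # 'w+' creates the file because it does not exist yet
    with open(path, 'w+') as f:
        f.write('first version')

    # Re-opening with 'w+' empties the existing file as soon as it is opened
    with open(path, 'w+') as f:
        after_truncate = f.read()  # nothing left to read
        f.write('second')
        f.seek(0)                  # rewind to read back what was just written
        contents = f.read()
```

The truncation happens at open time, so 'w+' is not suitable when the existing contents need to be kept.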
The simplest way to append more text to the end of a file would be to use:
with open('/path/to/file', 'a+') as file:
    file.write("Additions to file")
The a+ in the open(...) statement instructs Python to open the file in append mode and allows read and write access.
The with statement closes the file automatically when the block ends; if you open a file without with, it is good practice to call file.close() once you are done using it.
Answer 11
Here’s my script, which basically counts the number of lines, then appends, then counts them again so you have evidence it worked.
short_path = "../file_to_be_appended"
short = open(short_path, 'r')
## this counts how many lines are originally in the file:
long_path = "../file_to_be_appended_to"
long = open(long_path, 'r')
i = 0 ## stays 0 if the file is empty
for i, l in enumerate(long, 1): ## start at 1 so i ends up as the line count
    pass
print("%s has %i lines initially" % (long_path, i))
long.close()
long = open(long_path, 'a') ## now open long file to append
c = 0 ## count the number of lines you write
for l in short: ## the loop ends by itself when short runs out of lines
    c += 1
    long.write(l)
short.close()
long.close()
print("Done!, wrote %s lines" % c)
## finally, count how many lines there are now.
long = open(long_path, 'r')
for i, l in enumerate(long, 1):
    pass
print("%s has %i lines after appending new lines" % (long_path, i))
long.close()
If the reason you’re checking is so you can do something like if file_exists: open_it(), it’s safer to use a try around the attempt to open it. Checking and then opening risks the file being deleted or moved or something between when you check and when you try to open it.
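A minimal sketch of that advice, assuming a hypothetical helper name (read_if_present) and a caller-supplied fallback value:

```python
def read_if_present(path, default=None):
    """Return the file's text, or `default` if it cannot be opened."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        # Covers a missing file, a permission problem, a directory path, etc.
        return default


# Usage: the path below is illustrative and is expected not to exist.
text = read_if_present('/no/such/dir/file.txt', default='fallback')
```

Because the open itself is the check, there is no window between checking and opening in which the file can disappear.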
If you’re not planning to open the file immediately, you can use os.path.isfile
Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.
import os.path
os.path.isfile(fname)
if you need to be sure it’s a file.
Starting with Python 3.4, the pathlib module offers an object-oriented approach (backported to pathlib2 in Python 2.7):
from pathlib import Path
my_file = Path("/path/to/file")
if my_file.is_file():
    pass  # file exists
To check a directory, do:
if my_file.is_dir():
    pass  # directory exists
To check whether a Path object exists independently of whether it is a file or directory, use exists():
if my_file.exists():
    pass  # path exists
You can also use resolve(strict=True) in a try block:
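A minimal sketch of that pattern, assuming Python 3.6 or later (where resolve accepts strict) and a hypothetical helper name, resolved_or_none:

```python
from pathlib import Path


def resolved_or_none(p):
    """Return the absolute, resolved path, or None if the path does not exist."""
    try:
        return Path(p).resolve(strict=True)
    except FileNotFoundError:
        # strict=True makes resolve() raise instead of returning a non-existent path
        return None


here = resolved_or_none('.')                           # the current directory exists
missing = resolved_or_none('definitely/not/here.txt')  # illustrative missing path
```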
Unlike isfile(), exists() will return True for directories. So depending on if you want only plain files or also directories, you’ll use isfile() or exists(). Here is some simple REPL output:
import os
PATH = './file.txt'
if os.path.isfile(PATH) and os.access(PATH, os.R_OK):
    print("File exists and is readable")
else:
    print("Either the file is missing or not readable")
Answer 5
import os
os.path.exists(path) # Returns whether the path (directory or file) exists or not
os.path.isfile(path) # Returns whether the file exists or not
Although almost every possible way has been listed in (at least one of) the existing answers (e.g. Python 3.4 specific stuff was added), I’ll try to group everything together.
Note: every piece of Python standard library code that I’m going to post, belongs to version 3.5.3.
Problem statement:
Check file (arguably also folder (“special” file)?) existence
Don’t use try / except / else / finally blocks
Possible solutions:
[Python 3]: os.path.exists(path) (also check other function family members like os.path.isfile, os.path.isdir, os.path.lexists for slightly different behaviors)
os.path.exists(path)
Return True if path refers to an existing path or an open file descriptor. Returns False for broken symbolic links. On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.
All good, but if following the import tree:
os.path – posixpath.py (ntpath.py)
genericpath.py, line ~#20+
def exists(path):
    """Test whether a path exists. Returns False for broken symbolic links"""
    try:
        st = os.stat(path)
    except os.error:
        return False
    return True
it’s just a try / except block around [Python 3]: os.stat(path, *, dir_fd=None, follow_symlinks=True). So, your code is try / except free, but lower in the framestack there’s (at least) one such block. This also applies to other funcs (including os.path.isfile).
Path.is_file() from pathlib is a fancier (and more Pythonic) way of handling paths, but under the hood it does exactly the same thing (pathlib.py, line ~#1330):
def is_file(self):
    """
    Whether this path is a regular file (also True for symlinks pointing
    to regular files).
    """
    try:
        return S_ISREG(self.stat().st_mode)
    except OSError as e:
        if e.errno not in (ENOENT, ENOTDIR):
            raise
        # Path doesn't exist or is a broken symlink
        # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
        return False
class Swallow: # Dummy example
    swallowed_exceptions = (FileNotFoundError,)
    def __enter__(self):
        print("Entering...")
    def __exit__(self, exc_type, exc_value, exc_traceback):
        print("Exiting:", exc_type, exc_value, exc_traceback)
        return exc_type in Swallow.swallowed_exceptions # only swallow FileNotFoundError (not e.g. TypeError - if the user passes a wrong argument like None or float or ...)
And its usage – I’ll replicate the os.path.isfile behavior (note that this is just for demonstrating purposes, do not attempt to write such code for production):
import os
import stat
def isfile_seaman(path): # Dummy func
    result = False
    with Swallow():
        result = stat.S_ISREG(os.stat(path).st_mode)
    return result
Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.
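As a rough illustration of the point in that documentation excerpt, here is a sketch that lists the regular files in a folder with os.scandir. The helper name list_files and the sample names are assumptions made for the example, and the with form of scandir assumes Python 3.6 or later:

```python
import os
import tempfile


def list_files(folder):
    """Return the sorted names of regular files directly inside `folder`."""
    with os.scandir(folder) as entries:
        # DirEntry.is_file() can often answer from data gathered during the scan
        return sorted(entry.name for entry in entries if entry.is_file())


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'a.txt'), 'w').close()  # one regular file
    os.mkdir(os.path.join(tmp, 'sub'))             # one sub-folder
    found = list_files(tmp)
```

This is still a directory traversal, so it suits the case where several entries are inspected rather than a single existence check.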
Doesn’t seem a traversing function per se (at least in some cases), but it still uses os.listdir
Since these iterate over folders, (in most of the cases) they are inefficient for our problem (there are exceptions, like non wildcarded globbing – as @ShadowRanger pointed out), so I’m not going to insist on them. Not to mention that in some cases, filename processing might be required.
user permissions might restrict the file “visibility” as the doc states:
…test if the invoking user has the specified access to path. mode should be F_OK to test the existence of path…
os.access("/tmp", os.F_OK)
Since I also work in C, I use this method as well because under the hood, it calls native APIs (again, via “${PYTHON_SRC_DIR}/Modules/posixmodule.c”), but it also opens a gate for possible user errors, and it’s not as Pythonic as other variants. So, as @AaronHall rightly pointed out, don’t use it unless you know what you’re doing:
Nix: [man7]: ACCESS(2) (!!! pay attention to the note about the security hole its usage might introduce !!!)
(Win specific): Since vcruntime* (msvcr*) .dll exports a [MS.Docs]: _access, _waccess function family as well, here’s an example:
Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, ctypes
>>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe", os.F_OK)
0
>>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe.notexist", os.F_OK)
-1
Notes:
Although it’s not a good practice, I’m using os.F_OK in the call, but that’s just for clarity (its value is 0)
I’m using _waccess so that the same code works on Python3 and Python2 (in spite of unicode related differences between them)
Although this targets a very specific area, it was not mentioned in any of the previous answers
The Linux (Ubuntu 16 x64) counterpart as well:
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, ctypes
>>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp", os.F_OK)
0
>>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp.notexist", os.F_OK)
-1
Notes:
Instead of hardcoding libc's path (“/lib/x86_64-linux-gnu/libc.so.6”), which may (and most likely will) vary across systems, None (or the empty string) can be passed to the CDLL constructor (ctypes.CDLL(None).access(b"/tmp", os.F_OK)). According to [man7]: DLOPEN(3):
If filename is NULL, then the returned handle is for the main
program. When given to dlsym(), this handle causes a search for a
symbol in the main program, followed by all shared objects loaded at
program startup, and then all shared objects loaded by dlopen() with
the flag RTLD_GLOBAL.
Main (current) program (python) is linked against libc, so its symbols (including access) will be loaded
This has to be handled with care, since functions like main, Py_Main and (all the) others are available; calling them could have disastrous effects (on the current program)
This also doesn’t apply to Win (but that’s not such a big deal, since msvcrt.dll is located in “%SystemRoot%\System32”, which is in %PATH% by default). I wanted to take things further and replicate this behavior on Win (and submit a patch), but as it turns out, [MS.Docs]: GetProcAddress function only “sees” exported symbols, so unless someone declares the functions in the main executable as __declspec(dllexport) (why on Earth would a regular person do that?), the main program is loadable but pretty much unusable.
Install some third-party module with filesystem capabilities
Do use try / except / else / finally blocks, because they can keep you from running into a series of nasty problems. One counter-example that I can think of is performance: such blocks are costly, so try not to place them in code that is supposed to run hundreds of thousands of times per second (but since, in most cases, the check involves disk access, that won’t be the case).
Final note(s):
I will try to keep it up to date, any suggestions are welcome, I will incorporate anything useful that will come up into the answer
This is the simplest way to check if a file exists. Just because the file existed when you checked doesn’t guarantee that it will be there when you need to open it.
import os
fname = "foo.txt"
if os.path.isfile(fname):
    print("file does exist at this time")
else:
    print("no such file exists at this time")
Python 3.4+ has an object-oriented path module: pathlib. Using this new module, you can check whether a file exists like this:
import pathlib
p = pathlib.Path('path/to/file')
if p.is_file(): # or p.is_dir() to see if it is a directory
    pass # do stuff
You can (and usually should) still use a try/except block when opening files:
try:
    with p.open() as f:
        pass # do awesome stuff
except OSError:
    print('Well darn.')
The pathlib module has lots of cool stuff in it: convenient globbing, checking file’s owner, easier path joining, etc. It’s worth checking out. If you’re on an older Python (version 2.6 or later), you can still install pathlib with pip:
# installs pathlib2 on older Python versions
# the original third-party module, pathlib, is no longer maintained.
pip install pathlib2
Then import it as follows:
# Older Python versions
import pathlib2 as pathlib
How do I check whether a file exists, using Python, without using a try statement?
Now available since Python 3.4, import and instantiate a Path object with the file name, and check the is_file method (note that this returns True for symlinks pointing to regular files as well):
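A minimal sketch of that check, run against a temporary directory so the result does not depend on any real path. The file names are illustrative:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    file_path = folder / 'example.txt'  # hypothetical file name
    file_path.write_text('hello')

    file_is_file = file_path.is_file()                     # a regular file
    folder_is_file = folder.is_file()                      # a directory is not a file
    missing_is_file = (folder / 'missing.txt').is_file()   # nothing there at all
```

is_file answers False both for directories and for paths that do not exist, without raising.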
Now the above is probably the best pragmatic direct answer here, but there’s the possibility of a race condition (depending on what you’re trying to accomplish), and the fact that the underlying implementation uses a try, but Python uses try everywhere in its implementation.
Because Python uses try everywhere, there’s really no reason to avoid an implementation that uses it.
But the rest of this answer attempts to consider these caveats.
Longer, much more pedantic answer
Available since Python 3.4, use the new Path object in pathlib. Note that .exists is not quite right, because directories are not files (except in the unix sense that everything is a file).
By default, NamedTemporaryFile deletes the file when closed (and will automatically close when no more references exist to it).
>>> del file
>>> filepathobj.exists()
False
>>> filepathobj.is_file()
False
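For reference, a self-contained sketch of the same experiment as a script rather than a REPL session. It reuses the names file and filepathobj from the session above and closes the file explicitly instead of relying on del:

```python
import tempfile
from pathlib import Path

file = tempfile.NamedTemporaryFile()  # created on disk immediately
filepathobj = Path(file.name)

existed_while_open = filepathobj.is_file()

file.close()  # with the default delete=True, closing removes the file

exists_after_close = filepathobj.exists()
is_file_after_close = filepathobj.is_file()
```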
If you dig into the implementation, though, you’ll see that is_file uses try:
def is_file(self):
    """
    Whether this path is a regular file (also True for symlinks pointing
    to regular files).
    """
    try:
        return S_ISREG(self.stat().st_mode)
    except OSError as e:
        if e.errno not in (ENOENT, ENOTDIR):
            raise
        # Path doesn't exist or is a broken symlink
        # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
        return False
Race Conditions: Why we like try
We like try because it avoids race conditions. With try, you simply attempt to read your file, expecting it to be there, and if not, you catch the exception and perform whatever fallback behavior makes sense.
If you check that a file exists before you attempt to read it, and the file might be deleted in the meantime (by your own code, by multiple threads or processes, or by another program that knows about the file), you risk a race condition, because you are then racing to open it before its condition (its existence) changes.
Race conditions are very hard to debug because there’s a very small window in which they can cause your program to fail.
But if this is your motivation, you can get the value of a try statement by using the suppress context manager.
Avoiding race conditions without a try statement: suppress
Python 3.4 gives us the suppress context manager (previously the ignore context manager), which does semantically exactly the same thing in fewer lines, while also (at least superficially) meeting the original ask to avoid a try statement:
from contextlib import suppress
from pathlib import Path
Usage:
>>> with suppress(OSError), Path('doesnotexist').open() as f:
...     for line in f:
...         print(line)
...
>>>
>>> with suppress(OSError):
...     Path('doesnotexist').unlink()
...
>>>
For earlier Pythons, you could roll your own suppress, but without a try will be more verbose than with. I do believe this actually is the only answer that doesn’t use try at any level in the Python that can be applied to prior to Python 3.4 because it uses a context manager instead:
class suppress(object):
    def __init__(self, *exceptions):
        self.exceptions = exceptions
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            return issubclass(exc_type, self.exceptions)
Return True if path is an existing regular file. This follows symbolic
links, so both islink() and isfile() can be true for the same path.
But if you examine the source of this function, you’ll see it actually does use a try statement:
# This follows symbolic links, so both islink() and isdir() can be true
# for the same path on systems that support symlinks
def isfile(path):
    """Test whether a path is a regular file"""
    try:
        st = os.stat(path)
    except os.error:
        return False
    return stat.S_ISREG(st.st_mode)
>>> OSError is os.error
True
All it’s doing is using the given path to see if it can get stats on it, catching OSError and then checking if it’s a file if it didn’t raise the exception.
If you intend to do something with the file, I would suggest directly attempting it with a try-except to avoid a race condition:
try:
    with open(path) as f:
        f.read()
except OSError:
    pass
os.access
Available for Unix and Windows is os.access, but to use you must pass flags, and it does not differentiate between files and directories. This is more used to test if the real invoking user has access in an elevated privilege environment:
import os
os.access(path, os.F_OK)
It also suffers from the same race condition problems as isfile. From the docs:
Note:
Using access() to check if a user is authorized to e.g. open a file
before actually doing so using open() creates a security hole, because
the user might exploit the short time interval between checking and
opening the file to manipulate it. It’s preferable to use EAFP
techniques. For example:
if os.access("myfile", os.R_OK):
    with open("myfile") as fp:
        return fp.read()
return "some default data"
is better written as:
try:
    fp = open("myfile")
except IOError as e:
    if e.errno == errno.EACCES:
        return "some default data"
    # Not a permission error.
    raise
else:
    with fp:
        return fp.read()
Avoid using os.access. It is a low level function that has more opportunities for user error than the higher level objects and functions discussed above.
Criticism of another answer:
Another answer says this about os.access:
Personally, I prefer this one because under the hood, it calls native APIs (via “${PYTHON_SRC_DIR}/Modules/posixmodule.c”), but it also opens a gate for possible user errors, and it’s not as Pythonic as other variants:
This answer says it prefers a non-Pythonic, error-prone method, with no justification. It seems to encourage users to use low-level APIs without understanding them.
It also creates a context manager which, by unconditionally returning True, allows all Exceptions (including KeyboardInterrupt and SystemExit!) to pass silently, which is a good way to hide bugs.
This seems to encourage users to adopt poor practices.
Answer 11
import os
#Your path here e.g. "C:\Program Files\text.txt"
#For access purposes: "C:\\Program Files\\text.txt"
if os.path.exists("C:\..."):
    print("File found!")
else:
    print("File not found!")
Importing os makes it easier to navigate and perform standard actions with your operating system.
Testing for files and folders with os.path.isfile(), os.path.isdir() and os.path.exists()
Assuming that the “path” is a valid path, this table shows what is returned by each function for files and folders:
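The results for an existing file and an existing folder can be reproduced with a small sketch like the one below. The temporary directory and the file name example.txt are assumptions made for the demonstration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, 'example.txt')  # hypothetical file name
    with open(file_path, 'w') as f:
        f.write('x')

    checks = (os.path.isfile, os.path.isdir, os.path.exists)
    # Each tuple holds the results in the order (isfile, isdir, exists)
    table = {
        'file': tuple(check(file_path) for check in checks),
        'folder': tuple(check(folder) for check in checks),
    }
```

Here table['file'] is (True, False, True) and table['folder'] is (False, True, True): exists() is true for both, while isfile() and isdir() tell them apart.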
You can also test if a file is a certain type of file using os.path.splitext() to get the extension (if you don’t already know it)
>>> import os
>>> path = "path to a word document"
>>> os.path.isfile(path)
True
>>> os.path.splitext(path)[1] == ".docx" # test if the extension is .docx
True
It doesn’t seem like there’s a meaningful functional difference between try/except and isfile(), so you should use whichever one makes sense.
If you want to read a file, if it exists, do
try:
    f = open(filepath)
except IOError:
    print('Oh dear.')
But if you just wanted to rename a file if it exists, and therefore don’t need to open it, do
if os.path.isfile(filepath):
    os.rename(filepath, filepath + '.old')
If you want to write to a file, if it doesn’t exist, do
# python 2
if not os.path.isfile(filepath):
    f = open(filepath, 'w')
# python 3, x opens for exclusive creation, failing if the file already exists
try:
    f = open(filepath, 'x')
except IOError:
    print('file already exists')
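The exclusive-creation idea can be sketched as a small function. This assumes Python 3.3 or later, where the 'x' mode and `FileExistsError` are available; the name `create_if_absent` and the file name are hypothetical.

```python
import os
import tempfile


def create_if_absent(filepath, text):
    """Write text to filepath only if it does not exist yet.

    Returns True when the file was created, False when it already existed.
    """
    try:
        # 'x' fails with FileExistsError instead of truncating an existing file.
        with open(filepath, "x") as handle:
            handle.write(text)
    except FileExistsError:
        return False
    return True


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "notes.txt")  # hypothetical file name
    first_attempt = create_if_absent(target, "first\n")
    second_attempt = create_if_absent(target, "second\n")
    with open(target) as handle:
        contents = handle.read()
```

Because the existence check and the creation happen in a single call, there is no gap between them in which another process could create the file.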
If you need file locking, that’s a different matter.
Answer 15
You can try this (safer):
try:
    # http://effbot.org/zone/python-with-statement.htm
    # 'with' is safer to open a file
    with open('whatever.txt') as fh:
        # Do something with 'fh'
        pass
except IOError as e:
    print("({})".format(e))
The output would be:
([Errno 2] No such file or directory: 'whatever.txt')
Then, depending on the result, your program can just keep running from there or you can code to stop it if you want.
Although I always recommend using try and except statements, here are a few possibilities for you (my personal favourite is using os.access):
Try opening the file:
Opening the file will always verify the existence of the file. You can make a function just like so:
def File_Existence(filepath):
    f = open(filepath)
    return True
If the file cannot be opened, the function never returns False: it stops execution with an unhandled IOError (OSError in later versions of Python). To catch the exception, you have to use a try/except clause, like so (thanks to hsandt for making me think):
def File_Existence(filepath):
    try:
        f = open(filepath)
    except (IOError, OSError):  # Note OSError is for later versions of Python
        return False
    f.close()
    return True
Use os.path.exists(path):
This will check the existence of what you specify. However, it returns True for both files and directories, so be careful how you use it.
Use os.access(path, mode):
This will check whether you have access to the file, that is, it checks permissions. According to the os module documentation, passing os.F_OK tests the existence of the path. However, using this can create a security hole, because someone can attack your file in the time between checking the permissions and opening the file. You should go directly to opening the file instead of checking its permissions first (EAFP vs. LBYL). If you’re not going to open the file afterwards and are only checking its existence, then you can use this.
Anyway, here:
>>> import os
>>> os.access("/is/a/file.txt", os.F_OK)
True
I should also mention that there are two reasons you may be unable to verify the existence of a file: either permission denied or no such file or directory. If you catch an IOError, bind it with except IOError as e (like my first option), and then call print(e.args) so that you can hopefully determine your issue. I hope it helps! :)
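The two failure cases can also be told apart directly. The following is a sketch assuming Python 3.3 or later, where `FileNotFoundError` and `PermissionError` exist as subclasses of `OSError`; the function name `why_unreadable` and the file names are hypothetical.

```python
import os
import tempfile


def why_unreadable(filepath):
    """Return None if the file opens for reading, else a short reason string."""
    try:
        with open(filepath):
            return None
    except FileNotFoundError:
        return "no such file or directory"
    except PermissionError:
        return "permission denied"
    except OSError as e:
        # Any other failure, e.g. the path is a directory on POSIX systems.
        return "other error: {}".format(e.strerror)


with tempfile.TemporaryDirectory() as folder:
    present = os.path.join(folder, "present.txt")  # hypothetical file names
    with open(present, "w") as handle:
        handle.write("data\n")
    present_reason = why_unreadable(present)
    missing_reason = why_unreadable(os.path.join(folder, "missing.txt"))
```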
Answer 17
Date: 2017-12-04
Every possible solution has been listed in other answers.
An intuitive, if arguable, way to check if a file exists is the following:
import os
# os.path functions do not expand '~', so expand it first
os.path.isfile(os.path.expanduser('~/file.md'))  # Returns True if exists, else False
# additionally check a dir
os.path.isdir(os.path.expanduser('~/folder'))  # Returns True if the folder exists, else False
# check either a dir or a file
os.path.exists(os.path.expanduser('~/file'))
I made an exhaustive cheatsheet for your reference:
# os.path methods in exhaustive cheatsheet
{'definition': ['dirname', 'basename', 'abspath', 'relpath', 'commonpath', 'normpath', 'realpath'],
 'operation': ['split', 'splitdrive', 'splitext', 'join', 'normcase'],
 'compare': ['samefile', 'sameopenfile', 'samestat'],
 'condition': ['isdir', 'isfile', 'exists', 'lexists', 'islink', 'isabs', 'ismount'],
 'expand': ['expanduser', 'expandvars'],
 'stat': ['getatime', 'getctime', 'getmtime', 'getsize']}
If the file is for opening you could use one of the following techniques:
import os
# Option 1: exclusive creation with the x-flag, Python 3.3 and above
with open('somefile', 'xt') as f:
    f.write('Hello\n')
# Option 2: check first, then write
if not os.path.exists('somefile'):
    with open('somefile', 'wt') as f:
        f.write("Hello\n")
else:
    print('File already exists!')
UPDATE
Just to avoid confusion, and based on the answers I got: the current answer finds either a file or a directory with the given name.
Answer 19
Alternatively, os.access():
if os.access("myfile", os.R_OK):
    with open("myfile") as fp:
        return fp.read()
if os.path.isfile(path_to_file):
    try:
        with open(path_to_file):
            pass
    except IOError as e:
        print("Unable to open file")
Raising exceptions is considered to be an acceptable, and Pythonic,
approach for flow control in your program. Consider handling missing
files with IOErrors. In this situation, an IOError exception will be
raised if the file exists but the user does not have read permissions.
I’m the author of a package that’s been around for about 10 years, and it has a function that addresses this question directly. Basically, if you are on a non-Windows system, it uses Popen to access find. However, if you are on Windows, it replicates find with an efficient filesystem walker.
The code itself does not use a try block… except in determining the operating system and thus steering you to the “Unix”-style find or the hand-built find. Timing tests showed that the try was faster in determining the OS, so I did use one there (but nowhere else).
>>> print pox.find.__doc__
find(patterns[,root,recurse,type]); Get path to a file or directory
patterns: name or partial name string of items to search for
root: path string of top-level directory to search
recurse: if True, recurse down from root directory
type: item filter; one of {None, file, dir, link, socket, block, char}
verbose: if True, be a little verbose about the search
On some OS, recursion can be specified by recursion depth (an integer).
patterns can be specified with basic pattern matching. Additionally,
multiple patterns can be specified by splitting patterns with a ';'
For example:
>>> find('pox*', root='..')
['/Users/foo/pox/pox', '/Users/foo/pox/scripts/pox_launcher.py']
>>> find('*shutils*;*init*')
['/Users/foo/pox/pox/shutils.py', '/Users/foo/pox/pox/__init__.py']
>>>
How do I check whether a file exists, without using the try statement?
In 2016, this is still arguably the easiest way to check if both a file exists and if it is a file:
import os
os.path.isfile('./file.txt') # Returns True if exists, else False
isfile is actually just a helper method that uses os.stat and stat.S_ISREG(mode) underneath. os.stat is a lower-level method that will provide you with detailed information about files, directories, sockets, buffers, and more.
Note: However, this approach will not lock the file in any way and therefore your code can become vulnerable to “time of check to time of use” (TOCTTOU) bugs.
So raising exceptions is considered to be an acceptable, and Pythonic, approach for flow control in your program. One should consider handling missing files by catching IOError rather than with if statements (just a piece of advice).
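To illustrate the earlier point that isfile is a thin wrapper over os.stat and stat.S_ISREG, here is a rough approximation of it. This is a sketch, not CPython's actual source; the function name `is_regular_file` and the file names are hypothetical.

```python
import os
import stat
import tempfile


def is_regular_file(path):
    """Approximate os.path.isfile with os.stat and stat.S_ISREG."""
    try:
        mode = os.stat(path).st_mode  # follows symbolic links, like isfile
    except (OSError, ValueError):
        return False  # missing path, permission problem, or invalid path
    return stat.S_ISREG(mode)


with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, "file.txt")  # hypothetical file name
    with open(file_path, "w") as handle:
        handle.write("x")
    file_check = is_regular_file(file_path)
    folder_check = is_regular_file(folder)
    missing_check = is_regular_file(os.path.join(folder, "nope.txt"))
```

Note that the function itself relies on try/except internally, so the check only moves the exception handling out of sight; the time-of-check to time-of-use gap remains.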
Answer 29
import os.path

def isReadableFile(file_path, file_name):
    full_path = os.path.join(file_path, file_name)
    try:
        if not os.path.exists(file_path):
            print("File path is invalid.")
            return False
        elif not os.path.isfile(full_path):
            print("File does not exist.")
            return False
        elif not os.access(full_path, os.R_OK):
            print("File cannot be read.")
            return False
        else:
            print("File can be read.")
            return True
    except IOError as ex:
        print("I/O error({0}): {1}".format(ex.errno, ex.strerror))
    except Exception as ex:  # the original caught an undefined name, Error
        print("Error: {0}".format(ex))
    return False

#------------------------------------------------------
path = "/usr/khaled/documents/puzzles"
fileName = "puzzle_1.txt"
isReadableFile(path, fileName)