Question: How to get the latest file in a folder using Python
I need to get the latest file in a folder using Python. When I run:
max(files, key=os.path.getctime)
I get the following error:
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'a'
Answer 0
Whatever is assigned to the files variable is incorrect. Use the following code instead.
import glob
import os
# '*' matches everything; use e.g. '*.csv' to restrict the search to one format
list_of_files = glob.glob('/path/to/folder/*')
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)
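One thing the snippet above does not handle: glob also returns sub-directories, and max() raises ValueError when nothing matches. A minimal sketch that covers both cases, assuming Python 3.4+ for the default argument; the function name latest_regular_file is made up for illustration:

```python
import glob
import os


def latest_regular_file(folder, pattern="*"):
    """Return the newest regular file in folder (by ctime), or None."""
    candidates = [
        p for p in glob.glob(os.path.join(folder, pattern))
        if os.path.isfile(p)  # skip sub-directories, which glob also returns
    ]
    # default=None avoids a ValueError when nothing matches
    return max(candidates, key=os.path.getctime, default=None)
```

Call it as latest_regular_file('/path/to/folder', '*.csv') and check for None before using the result.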
Answer 1
max(files, key=os.path.getctime) is quite incomplete code. What is files? It is probably a list of file names returned by os.listdir().
But that list contains only the file name parts (a.k.a. "basenames"), because the directory is common to all of them. To use it correctly, you have to join each name with the path leading to it (the same path you passed to os.listdir()).
Such as (untested):
import os

def newest(path):
    files = os.listdir(path)
    paths = [os.path.join(path, basename) for basename in files]
    return max(paths, key=os.path.getctime)
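To see the difference for yourself, here is a small sketch that tries both lookups on the same folder. The helper name compare_lookups is hypothetical, and it assumes the folder is not empty and is not the current working directory:

```python
import os


def compare_lookups(folder):
    """Return (result using bare names, result using joined paths)."""
    names = os.listdir(folder)  # basenames only, e.g. ['a']
    try:
        # getctime resolves each bare name against the current working
        # directory, not against folder, so this usually fails.
        bare = max(names, key=os.path.getctime)
    except FileNotFoundError:
        bare = None
    joined = max((os.path.join(folder, n) for n in names),
                 key=os.path.getctime)
    return bare, joined
```

The first element comes back as None in the situation from the question, while the second is the full path of the newest entry.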
Answer 2
I would suggest using glob.iglob() instead of glob.glob(), as it is more memory-efficient.
glob.iglob() returns an iterator which yields the same values as glob() without actually storing them all simultaneously, so max() can consume the matches one at a time instead of building the whole list first.
I mostly use the code below to find the latest file matching my pattern:
LatestFile = max(glob.iglob(fileNamePattern), key=os.path.getctime)
NOTE: max has two call forms. To find the latest file we use this one:
max(iterable, *[, key, default])
which needs an iterable as its first argument.
To find the maximum of several separate values, such as numbers, we can use the other form:
max(arg1, arg2, *args[, key])
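The default argument of the iterable form is handy here, because an iglob iterator cannot be tested for emptiness up front. A minimal sketch, assuming Python 3.4+ (where default was added); latest_match is just an illustrative name:

```python
import glob
import os


def latest_match(pattern):
    """Return the newest path matching pattern, or None if nothing matches."""
    # iglob yields paths lazily; max() consumes them one at a time.
    # Without default=..., max() raises ValueError on an empty iterator.
    return max(glob.iglob(pattern), key=os.path.getctime, default=None)


# The two call forms of max give the same answer for plain values:
largest_from_iterable = max([3, 7, 5])
largest_from_arguments = max(3, 7, 5)
```

Note that default is only accepted by the iterable form.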
Answer 3
Try sorting the items by creation time. The example below sorts the files in a folder, newest first, and takes the first element, which is the latest.
import glob
import os
# folder is the directory you want to search
files_path = os.path.join(folder, '*')
files = sorted(glob.iglob(files_path), key=os.path.getctime, reverse=True)
print(files[0])
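If you only need the few newest files rather than the whole sorted list, heapq.nlargest from the standard library can do the selection without a full sort. This is a swapped-in technique, not part of the answer above, and newest_files and its parameters are names I made up:

```python
import glob
import heapq
import os


def newest_files(folder, count=3, key=os.path.getctime):
    """Return up to count paths from folder, newest first."""
    pattern = os.path.join(folder, "*")
    # nlargest keeps only the count best candidates while iterating
    return heapq.nlargest(count, glob.iglob(pattern), key=key)
```

newest_files(folder, 1)[0] then corresponds to files[0] above, provided the folder is not empty.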
Answer 4
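I don't have enough reputation to comment, but ctime from Marlon Abeykoon's answer did not give the correct result for me. Using mtime does the trick (key=os.path.getmtime):
import glob
import os
# '*' matches everything; use e.g. '*.csv' to restrict the search to one format
list_of_files = glob.glob('/path/to/folder/*')
latest_file = max(list_of_files, key=os.path.getmtime)
print(latest_file)
I found two answers that address this problem:
python os.path.getctime max does not return latest
Difference between python - getmtime() and getctime() in unix system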
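The reason the two can disagree: on Unix, ctime is the time of the last metadata change (not creation), while on Windows it is the creation time; mtime is the last content modification on both. A small sketch for inspecting the values yourself; file_times and latest_by_mtime are hypothetical helper names:

```python
import os


def file_times(path):
    """Return the modification and 'c' timestamps of path."""
    st = os.stat(path)
    # st_ctime: metadata-change time on Unix, creation time on Windows
    return {"mtime": st.st_mtime, "ctime": st.st_ctime}


def latest_by_mtime(paths):
    """Return the path whose content was modified most recently."""
    return max(paths, key=os.path.getmtime)
```

Printing file_times() for a couple of files in your folder should show quickly which timestamp matches what you mean by "latest".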
Answer 5
(Edited to improve answer)
First define a function get_latest_file:
def get_latest_file(path, *paths):
    fullpath = os.path.join(path, *paths)
    ...

get_latest_file('example', 'files', 'randomtext011.*.txt')
You may also use a docstring!
def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
You can also use iglob instead of glob (it exists in Python 2.5+ as well as in Python 3), but note that the emptiness check in the complete code below relies on glob returning a list.
Complete code to return the name of the latest file:
import glob
import os

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
    files = glob.glob(fullpath)  # a list, so the check below works
    if not files:  # nothing matched: return early
        return None
    latest_file = max(files, key=os.path.getctime)
    _, filename = os.path.split(latest_file)
    return filename
Answer 6
I tried the suggestions above and my program crashed. I then figured out that the file I was trying to identify was in use, and the call to os.path.getctime crashed on it.
What finally worked for me was:
files_before = glob.glob(os.path.join(my_path, '*'))
# ... code where the new file is created ...
new_file = set(files_before).symmetric_difference(set(glob.glob(os.path.join(my_path, '*'))))
This code gets the items that are not common to the two file lists (the result is a set).
It's not the most elegant solution, and if multiple files are created at the same time it probably won't be stable.
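A possible refinement, as a sketch only: symmetric_difference also reports files that were deleted in the meantime, whereas a plain set difference keeps only the ones that appeared. The names snapshot and new_files are made up for illustration:

```python
import glob
import os


def snapshot(folder):
    """Return the set of paths currently in folder."""
    return set(glob.glob(os.path.join(folder, "*")))


def new_files(before, folder):
    """Return paths present in folder now that were not in before."""
    return snapshot(folder) - before
```

Take before = snapshot(my_path) ahead of the code that creates the file, then call new_files(before, my_path) afterwards. It still returns a set, so several files created at once all show up and you have to pick among them.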
Answer 7
A much faster method on Windows (0.05s) is to call a bat script that does this:
get_latest.bat
@echo off
for /f %%i in ('dir \\directory\in\question /b/a-d/od/t:c') do set LAST=%%i
echo %LAST%
where \\directory\in\question is the directory you want to investigate.
get_latest.py
from subprocess import Popen, PIPE
p = Popen("get_latest.bat", shell=True, stdout=PIPE,)
stdout, stderr = p.communicate()
print(stdout, stderr)
If it finds a file, stdout holds its name as bytes and stderr is None (stderr is not piped).
Use stdout.decode("utf-8").rstrip() to get a usable string representation of the file name.
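If you would rather stay in pure Python, os.scandir may be worth trying before shelling out: according to the os documentation, on Windows its entries carry the stat information from the directory listing, so no extra system call per file is needed. I have not benchmarked it against the bat script; this is a sketch, assuming Python 3.6+, and latest_entry is a made-up name:

```python
import os


def latest_entry(folder):
    """Return the path of the newest regular file in folder, or None."""
    with os.scandir(folder) as entries:
        files = [e for e in entries if e.is_file()]  # like dir /a-d
        # st_ctime mirrors /t:c (creation time) on Windows
        newest = max(files, key=lambda e: e.stat().st_ctime, default=None)
    return newest.path if newest else None
```

Unlike the bat version, this returns a str with the full path, so no decoding is needed.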
Answer 8
I've been using this in Python 3, including pattern matching on the filename.
from pathlib import Path

def latest_file(path: Path, pattern: str = "*"):
    files = path.glob(pattern)
    return max(files, key=lambda x: x.stat().st_ctime)
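The same idea extends to a whole directory tree with Path.rglob. A minimal sketch, with a few assumptions of mine: it uses st_mtime rather than st_ctime, skips directories, and returns None when nothing matches; latest_file_in_tree is a hypothetical name:

```python
from pathlib import Path


def latest_file_in_tree(root: Path, pattern: str = "*"):
    """Return the most recently modified file under root, or None."""
    # rglob walks sub-directories too; keep regular files only
    files = (p for p in root.rglob(pattern) if p.is_file())
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

For example, latest_file_in_tree(Path('/path/to/folder'), '*.csv') searches every sub-folder for the newest CSV file.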