问题:Python列表目录,子目录和文件
我试图制作一个脚本来列出给定目录中的所有目录,子目录和文件。
我尝试了这个:
import sys,os
root = "/home/patate/directory/"
path = os.path.join(root, "targetdirectory")
for r,d,f in os.walk(path):
for file in f:
print os.path.join(root,file)
不幸的是,它无法正常工作。
我得到所有文件,但没有完整的路径。
例如,如果dir结构为:
/home/patate/directory/targetdirectory/123/456/789/file.txt
它会打印:
/home/patate/directory/targetdirectory/file.txt
我需要的是第一个结果。任何帮助将不胜感激!谢谢。
I’m trying to make a script to list all directory, subdirectory, and files in a given directory.
I tried this:
import sys,os
root = "/home/patate/directory/"
path = os.path.join(root, "targetdirectory")
for r,d,f in os.walk(path):
for file in f:
print os.path.join(root,file)
Unfortunatly it doesn’t work properly.
I get all the files, but not their complete paths.
For example if the dir struct would be:
/home/patate/directory/targetdirectory/123/456/789/file.txt
It would print:
/home/patate/directory/targetdirectory/file.txt
What I need is the first result. Any help would be greatly appreciated! Thanks.
回答 0
使用os.path.join
来连接的目录和文件名:
for path, subdirs, files in os.walk(root):
for name in files:
print os.path.join(path, name)
请注意,path
并没有root
在串联中使用,因为使用root
会不正确。
在Python 3.4中,添加了pathlib模块以简化路径操作。因此,等效于os.path.join
:
pathlib.PurePath(path, name)
好处pathlib
是您可以在路径上使用各种有用的方法。如果使用具体的Path
变体,您还可以通过它们进行实际的OS调用,例如,进入目录,删除路径,打开其指向的文件等等。
Use os.path.join
to concatenate the directory and file name:
for path, subdirs, files in os.walk(root):
for name in files:
print os.path.join(path, name)
Note the usage of path
and not root
in the concatenation, since using root
would be incorrect.
In Python 3.4, the pathlib module was added for easier path manipulations. So the equivalent to os.path.join
would be:
pathlib.PurePath(path, name)
The advantage of pathlib
is that you can use a variety of useful methods on paths. If you use the concrete Path
variant you can also do actual OS calls through them, like changing into a directory, deleting the path, opening the file it points to and much more.
回答 1
以防万一…获取目录和子目录中与某个模式匹配的所有文件(例如* .py):
import os
from fnmatch import fnmatch
root = '/some/directory'
pattern = "*.py"
for path, subdirs, files in os.walk(root):
for name in files:
if fnmatch(name, pattern):
print os.path.join(path, name)
Just in case… Getting all files in the directory and subdirectories matching some pattern (*.py for example):
import os
from fnmatch import fnmatch
root = '/some/directory'
pattern = "*.py"
for path, subdirs, files in os.walk(root):
for name in files:
if fnmatch(name, pattern):
print os.path.join(path, name)
回答 2
这里是单线:
import os
[val for sublist in [[os.path.join(i[0], j) for j in i[2]] for i in os.walk('./')] for val in sublist]
# Meta comment to ease selecting text
最外面的val for sublist in ...
循环将列表平整为一维。该j
循环收集每个文件名前缀的列表,并将其加入到当前的路径。最后,i
循环遍历所有目录和子目录。
本示例./
在os.walk(...)
调用中使用了硬编码的路径,您可以补充所需的任何路径字符串。
注意:os.path.expanduser
和/或os.path.expandvars
可用于路径字符串,例如~/
扩展此示例:
它易于添加文件基本名测试和目录名测试。
例如,测试*.jpg
文件:
... for j in i[2] if j.endswith('.jpg')] ...
此外,排除.git
目录:
... for i in os.walk('./') if '.git' not in i[0].split('/')]
Here is a one-liner:
import os
[val for sublist in [[os.path.join(i[0], j) for j in i[2]] for i in os.walk('./')] for val in sublist]
# Meta comment to ease selecting text
The outer most val for sublist in ...
loop flattens the list to be one dimensional. The j
loop collects a list of every file basename and joins it to the current path. Finally, the i
loop iterates over all directories and sub directories.
This example uses the hard-coded path ./
in the os.walk(...)
call, you can supplement any path string you like.
Note: os.path.expanduser
and/or os.path.expandvars
can be used for paths strings like ~/
Extending this example:
Its easy to add in file basename tests and directoryname tests.
For Example, testing for *.jpg
files:
... for j in i[2] if j.endswith('.jpg')] ...
Additionally, excluding the .git
directory:
... for i in os.walk('./') if '.git' not in i[0].split('/')]
回答 3
无法发表评论,请在此处写答案。这是我所看到的最清晰的一行:
import os
[os.path.join(path, name) for path, subdirs, files in os.walk(root) for name in files]
Couldn’t comment so writing answer here. This is the clearest one-line I have seen:
import os
[os.path.join(path, name) for path, subdirs, files in os.walk(root) for name in files]
回答 4
您可以看一下我制作的这个样本。它使用了os.path.walk函数,该函数已被提倡,使用列表存储所有文件路径
root = "Your root directory"
ex = ".txt"
where_to = "Wherever you wanna write your file to"
def fileWalker(ext,dirname,names):
'''
checks files in names'''
pat = "*" + ext[0]
for f in names:
if fnmatch.fnmatch(f,pat):
ext[1].append(os.path.join(dirname,f))
def writeTo(fList):
with open(where_to,"w") as f:
for di_r in fList:
f.write(di_r + "\n")
if __name__ == '__main__':
li = []
os.path.walk(root,fileWalker,[ex,li])
writeTo(li)
You can take a look at this sample I made. It uses the os.path.walk function which is deprecated beware.Uses a list to store all the filepaths
root = "Your root directory"
ex = ".txt"
where_to = "Wherever you wanna write your file to"
def fileWalker(ext,dirname,names):
'''
checks files in names'''
pat = "*" + ext[0]
for f in names:
if fnmatch.fnmatch(f,pat):
ext[1].append(os.path.join(dirname,f))
def writeTo(fList):
with open(where_to,"w") as f:
for di_r in fList:
f.write(di_r + "\n")
if __name__ == '__main__':
li = []
os.path.walk(root,fileWalker,[ex,li])
writeTo(li)
回答 5
一线简单一点:
import os
from itertools import product, chain
chain.from_iterable([[os.sep.join(w) for w in product([i[0]], i[2])] for i in os.walk(dir)])
A bit simpler one-liner:
import os
from itertools import product, chain
chain.from_iterable([[os.sep.join(w) for w in product([i[0]], i[2])] for i in os.walk(dir)])
回答 6
由于这里的每个示例都仅使用walk
(with join
),因此我想展示一个不错的示例并与进行比较listdir
:
import os, time
def listFiles1(root): # listdir
allFiles = []; walk = [root]
while walk:
folder = walk.pop(0)+"/"; items = os.listdir(folder) # items = folders + files
for i in items: i=folder+i; (walk if os.path.isdir(i) else allFiles).append(i)
return allFiles
def listFiles2(root): # listdir/join (takes ~1.4x as long) (and uses '\\' instead)
allFiles = []; walk = [root]
while walk:
folder = walk.pop(0); items = os.listdir(folder) # items = folders + files
for i in items: i=os.path.join(folder,i); (walk if os.path.isdir(i) else allFiles).append(i)
return allFiles
def listFiles3(root): # walk (takes ~1.5x as long)
allFiles = []
for folder, folders, files in os.walk(root):
for file in files: allFiles+=[folder.replace("\\","/")+"/"+file] # folder+"\\"+file still ~1.5x
return allFiles
def listFiles4(root): # walk/join (takes ~1.6x as long) (and uses '\\' instead)
allFiles = []
for folder, folders, files in os.walk(root):
for file in files: allFiles+=[os.path.join(folder,file)]
return allFiles
for i in range(100): files = listFiles1("src") # warm up
start = time.time()
for i in range(100): files = listFiles1("src") # listdir
print("Time taken: %.2fs"%(time.time()-start)) # 0.28s
start = time.time()
for i in range(100): files = listFiles2("src") # listdir and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.38s
start = time.time()
for i in range(100): files = listFiles3("src") # walk
print("Time taken: %.2fs"%(time.time()-start)) # 0.42s
start = time.time()
for i in range(100): files = listFiles4("src") # walk and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.47s
因此,如您所见,该listdir
版本效率更高。(那join
很慢)
Since every example here is just using walk
(with join
), i’d like to show a nice example and comparison with listdir
:
import os, time
def listFiles1(root): # listdir
allFiles = []; walk = [root]
while walk:
folder = walk.pop(0)+"/"; items = os.listdir(folder) # items = folders + files
for i in items: i=folder+i; (walk if os.path.isdir(i) else allFiles).append(i)
return allFiles
def listFiles2(root): # listdir/join (takes ~1.4x as long) (and uses '\\' instead)
allFiles = []; walk = [root]
while walk:
folder = walk.pop(0); items = os.listdir(folder) # items = folders + files
for i in items: i=os.path.join(folder,i); (walk if os.path.isdir(i) else allFiles).append(i)
return allFiles
def listFiles3(root): # walk (takes ~1.5x as long)
allFiles = []
for folder, folders, files in os.walk(root):
for file in files: allFiles+=[folder.replace("\\","/")+"/"+file] # folder+"\\"+file still ~1.5x
return allFiles
def listFiles4(root): # walk/join (takes ~1.6x as long) (and uses '\\' instead)
allFiles = []
for folder, folders, files in os.walk(root):
for file in files: allFiles+=[os.path.join(folder,file)]
return allFiles
for i in range(100): files = listFiles1("src") # warm up
start = time.time()
for i in range(100): files = listFiles1("src") # listdir
print("Time taken: %.2fs"%(time.time()-start)) # 0.28s
start = time.time()
for i in range(100): files = listFiles2("src") # listdir and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.38s
start = time.time()
for i in range(100): files = listFiles3("src") # walk
print("Time taken: %.2fs"%(time.time()-start)) # 0.42s
start = time.time()
for i in range(100): files = listFiles4("src") # walk and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.47s
So as you can see for yourself, the listdir
version is much more efficient. (and that join
is slow)