问题:Python列表目录,子目录和文件

我试图制作一个脚本来列出给定目录中的所有目录,子目录和文件。
我尝试了这个:

import sys,os

root = "/home/patate/directory/"
path = os.path.join(root, "targetdirectory")

for r,d,f in os.walk(path):
    for file in f:
        print os.path.join(root,file)

不幸的是,它无法正常工作。
我得到所有文件,但没有完整的路径。

例如,如果dir结构为:

/home/patate/directory/targetdirectory/123/456/789/file.txt

它会打印:

/home/patate/directory/targetdirectory/file.txt

我需要的是第一个结果。任何帮助将不胜感激!谢谢。

I’m trying to make a script to list all directory, subdirectory, and files in a given directory.
I tried this:

import sys,os

root = "/home/patate/directory/"
path = os.path.join(root, "targetdirectory")

for r,d,f in os.walk(path):
    for file in f:
        print os.path.join(root,file)

Unfortunatly it doesn’t work properly.
I get all the files, but not their complete paths.

For example if the dir struct would be:

/home/patate/directory/targetdirectory/123/456/789/file.txt

It would print:

/home/patate/directory/targetdirectory/file.txt

What I need is the first result. Any help would be greatly appreciated! Thanks.


回答 0

使用os.path.join来连接的目录和文件

for path, subdirs, files in os.walk(root):
    for name in files:
        print os.path.join(path, name)

请注意,path并没有root在串联中使用,因为使用root会不正确。


在Python 3.4中,添加了pathlib模块以简化路径操作。因此,等效于os.path.join

pathlib.PurePath(path, name)

好处pathlib是您可以在路径上使用各种有用的方法。如果使用具体的Path变体,您还可以通过它们进行实际的OS调用,例如,进入目录,删除路径,打开其指向的文件等等。

Use os.path.join to concatenate the directory and file name:

for path, subdirs, files in os.walk(root):
    for name in files:
        print os.path.join(path, name)

Note the usage of path and not root in the concatenation, since using root would be incorrect.


In Python 3.4, the pathlib module was added for easier path manipulations. So the equivalent to os.path.join would be:

pathlib.PurePath(path, name)

The advantage of pathlib is that you can use a variety of useful methods on paths. If you use the concrete Path variant you can also do actual OS calls through them, like changing into a directory, deleting the path, opening the file it points to and much more.


回答 1

以防万一…获取目录和子目录中与某个模式匹配的所有文件(例如* .py):

import os
from fnmatch import fnmatch

root = '/some/directory'
pattern = "*.py"

for path, subdirs, files in os.walk(root):
    for name in files:
        if fnmatch(name, pattern):
            print os.path.join(path, name)

Just in case… Getting all files in the directory and subdirectories matching some pattern (*.py for example):

import os
from fnmatch import fnmatch

root = '/some/directory'
pattern = "*.py"

for path, subdirs, files in os.walk(root):
    for name in files:
        if fnmatch(name, pattern):
            print os.path.join(path, name)

回答 2

这里是单线:

import os

[val for sublist in [[os.path.join(i[0], j) for j in i[2]] for i in os.walk('./')] for val in sublist]
# Meta comment to ease selecting text

最外面的val for sublist in ...循环将列表平整为一维。该j循环收集每个文件名前缀的列表,并将其加入到当前的路径。最后,i循环遍历所有目录和子目录。

本示例./os.walk(...)调用中使用了硬编码的路径,您可以补充所需的任何路径字符串。

注意:和/或可用于路径字符串,例如~/

扩展此示例:

它易于添加文件基本名测试和目录名测试。

例如,测试*.jpg文件:

... for j in i[2] if j.endswith('.jpg')] ...

此外,排除.git目录:

... for i in os.walk('./') if '.git' not in i[0].split('/')]

Here is a one-liner:

import os

[val for sublist in [[os.path.join(i[0], j) for j in i[2]] for i in os.walk('./')] for val in sublist]
# Meta comment to ease selecting text

The outer most val for sublist in ... loop flattens the list to be one dimensional. The j loop collects a list of every file basename and joins it to the current path. Finally, the i loop iterates over all directories and sub directories.

This example uses the hard-coded path ./ in the os.walk(...) call, you can supplement any path string you like.

Note: and/or can be used for paths strings like ~/

Extending this example:

Its easy to add in file basename tests and directoryname tests.

For Example, testing for *.jpg files:

... for j in i[2] if j.endswith('.jpg')] ...

Additionally, excluding the .git directory:

... for i in os.walk('./') if '.git' not in i[0].split('/')]

回答 3

无法发表评论,请在此处写答案。这是我所看到的最清晰的一行:

import os
[os.path.join(path, name) for path, subdirs, files in os.walk(root) for name in files]

Couldn’t comment so writing answer here. This is the clearest one-line I have seen:

import os
[os.path.join(path, name) for path, subdirs, files in os.walk(root) for name in files]

回答 4

您可以看一下我制作的这个样本。它使用了os.path.walk函数,该函数已被提倡,使用列表存储所有文件路径

root = "Your root directory"
ex = ".txt"
where_to = "Wherever you wanna write your file to"
def fileWalker(ext,dirname,names):
    '''
    checks files in names'''
    pat = "*" + ext[0]
    for f in names:
        if fnmatch.fnmatch(f,pat):
            ext[1].append(os.path.join(dirname,f))


def writeTo(fList):

    with open(where_to,"w") as f:
        for di_r in fList:
            f.write(di_r + "\n")






if __name__ == '__main__':
    li = []
    os.path.walk(root,fileWalker,[ex,li])

    writeTo(li)

You can take a look at this sample I made. It uses the os.path.walk function which is deprecated beware.Uses a list to store all the filepaths

root = "Your root directory"
ex = ".txt"
where_to = "Wherever you wanna write your file to"
def fileWalker(ext,dirname,names):
    '''
    checks files in names'''
    pat = "*" + ext[0]
    for f in names:
        if fnmatch.fnmatch(f,pat):
            ext[1].append(os.path.join(dirname,f))


def writeTo(fList):

    with open(where_to,"w") as f:
        for di_r in fList:
            f.write(di_r + "\n")






if __name__ == '__main__':
    li = []
    os.path.walk(root,fileWalker,[ex,li])

    writeTo(li)

回答 5

一线简单一点:

import os
from itertools import product, chain

chain.from_iterable([[os.sep.join(w) for w in product([i[0]], i[2])] for i in os.walk(dir)])

A bit simpler one-liner:

import os
from itertools import product, chain

chain.from_iterable([[os.sep.join(w) for w in product([i[0]], i[2])] for i in os.walk(dir)])

回答 6

由于这里的每个示例都仅使用walk(with join),因此我想展示一个不错的示例并与进行比较listdir

import os, time

def listFiles1(root): # listdir
    allFiles = []; walk = [root]
    while walk:
        folder = walk.pop(0)+"/"; items = os.listdir(folder) # items = folders + files
        for i in items: i=folder+i; (walk if os.path.isdir(i) else allFiles).append(i)
    return allFiles

def listFiles2(root): # listdir/join (takes ~1.4x as long) (and uses '\\' instead)
    allFiles = []; walk = [root]
    while walk:
        folder = walk.pop(0); items = os.listdir(folder) # items = folders + files
        for i in items: i=os.path.join(folder,i); (walk if os.path.isdir(i) else allFiles).append(i)
    return allFiles

def listFiles3(root): # walk (takes ~1.5x as long)
    allFiles = []
    for folder, folders, files in os.walk(root):
        for file in files: allFiles+=[folder.replace("\\","/")+"/"+file] # folder+"\\"+file still ~1.5x
    return allFiles

def listFiles4(root): # walk/join (takes ~1.6x as long) (and uses '\\' instead)
    allFiles = []
    for folder, folders, files in os.walk(root):
        for file in files: allFiles+=[os.path.join(folder,file)]
    return allFiles


for i in range(100): files = listFiles1("src") # warm up

start = time.time()
for i in range(100): files = listFiles1("src") # listdir
print("Time taken: %.2fs"%(time.time()-start)) # 0.28s

start = time.time()
for i in range(100): files = listFiles2("src") # listdir and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.38s

start = time.time()
for i in range(100): files = listFiles3("src") # walk
print("Time taken: %.2fs"%(time.time()-start)) # 0.42s

start = time.time()
for i in range(100): files = listFiles4("src") # walk and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.47s

因此,如您所见,该listdir版本效率更高。(那join很慢)

Since every example here is just using walk (with join), i’d like to show a nice example and comparison with listdir:

import os, time

def listFiles1(root): # listdir
    allFiles = []; walk = [root]
    while walk:
        folder = walk.pop(0)+"/"; items = os.listdir(folder) # items = folders + files
        for i in items: i=folder+i; (walk if os.path.isdir(i) else allFiles).append(i)
    return allFiles

def listFiles2(root): # listdir/join (takes ~1.4x as long) (and uses '\\' instead)
    allFiles = []; walk = [root]
    while walk:
        folder = walk.pop(0); items = os.listdir(folder) # items = folders + files
        for i in items: i=os.path.join(folder,i); (walk if os.path.isdir(i) else allFiles).append(i)
    return allFiles

def listFiles3(root): # walk (takes ~1.5x as long)
    allFiles = []
    for folder, folders, files in os.walk(root):
        for file in files: allFiles+=[folder.replace("\\","/")+"/"+file] # folder+"\\"+file still ~1.5x
    return allFiles

def listFiles4(root): # walk/join (takes ~1.6x as long) (and uses '\\' instead)
    allFiles = []
    for folder, folders, files in os.walk(root):
        for file in files: allFiles+=[os.path.join(folder,file)]
    return allFiles


for i in range(100): files = listFiles1("src") # warm up

start = time.time()
for i in range(100): files = listFiles1("src") # listdir
print("Time taken: %.2fs"%(time.time()-start)) # 0.28s

start = time.time()
for i in range(100): files = listFiles2("src") # listdir and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.38s

start = time.time()
for i in range(100): files = listFiles3("src") # walk
print("Time taken: %.2fs"%(time.time()-start)) # 0.42s

start = time.time()
for i in range(100): files = listFiles4("src") # walk and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.47s

So as you can see for yourself, the listdir version is much more efficient. (and that join is slow)


声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。