如何使用glob()递归查找文件?

问题:如何使用glob()递归查找文件?

这就是我所拥有的:

glob(os.path.join('src','*.c'))

但我想搜索src的子文件夹。这样的事情会起作用:

glob(os.path.join('src','*.c'))
glob(os.path.join('src','*','*.c'))
glob(os.path.join('src','*','*','*.c'))
glob(os.path.join('src','*','*','*','*.c'))

但这显然是有限且笨拙的。

This is what I have:

glob(os.path.join('src','*.c'))

but I want to search the subfolders of src. Something like this would work:

glob(os.path.join('src','*.c'))
glob(os.path.join('src','*','*.c'))
glob(os.path.join('src','*','*','*.c'))
glob(os.path.join('src','*','*','*','*.c'))

But this is obviously limited and clunky.


回答 0

Python 3.5+

由于您使用的是新的python,因此应pathlib.Path.rglobpathlib模块中使用。

from pathlib import Path

for path in Path('src').rglob('*.c'):
    print(path.name)

如果您不想使用pathlib,只需使用glob.glob,但不要忘记传递recursive关键字参数。

对于匹配文件以点(。)开头的情况;例如当前目录中的文件或基于Unix的系统上的隐藏文件,请使用以下os.walk解决方案。

较旧的Python版本

对于较旧的Python版本,可os.walk用于递归遍历目录并fnmatch.filter与简单表达式匹配:

import fnmatch
import os

matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))

Python 3.5+

Since you’re on a new python, you should use pathlib.Path.rglob from the the pathlib module.

from pathlib import Path

for path in Path('src').rglob('*.c'):
    print(path.name)

If you don’t want to use pathlib, just use glob.glob, but don’t forget to pass in the recursive keyword parameter.

For cases where matching files beginning with a dot (.); like files in the current directory or hidden files on Unix based system, use the os.walk solution below.

Older Python versions

For older Python versions, use os.walk to recursively walk a directory and fnmatch.filter to match against a simple expression:

import fnmatch
import os

matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))

回答 1

与其他解决方案类似,但是使用fnmatch.fnmatch而不是glob,因为os.walk已经列出了文件名:

import os, fnmatch


def find_files(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename


for filename in find_files('src', '*.c'):
    print 'Found C source:', filename

另外,使用生成器可以使您处理找到的每个文件,而不是查找所有文件然后进行处理。

Similar to other solutions, but using fnmatch.fnmatch instead of glob, since os.walk already listed the filenames:

import os, fnmatch


def find_files(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename


for filename in find_files('src', '*.c'):
    print 'Found C source:', filename

Also, using a generator alows you to process each file as it is found, instead of finding all the files and then processing them.


回答 2

我修改了glob模块,以支持**用于递归glob,例如:

>>> import glob2
>>> all_header_files = glob2.glob('src/**/*.c')

https://github.com/miracle2k/python-glob2/

当您想为用户提供使用**语法的能力时很有用,因此仅os.walk()不够好。

I’ve modified the glob module to support ** for recursive globbing, e.g:

>>> import glob2
>>> all_header_files = glob2.glob('src/**/*.c')

https://github.com/miracle2k/python-glob2/

Useful when you want to provide your users with the ability to use the ** syntax, and thus os.walk() alone is not good enough.


回答 3

从Python 3.4开始,可以使用新pathlib模块中支持通配符glob()Path类之一的方法。例如:**

from pathlib import Path

for file_path in Path('src').glob('**/*.c'):
    print(file_path) # do whatever you need with these files

更新: 从Python 3.5开始,glob.glob()

Starting with Python 3.4, one can use the glob() method of one of the Path classes in the new pathlib module, which supports ** wildcards. For example:

from pathlib import Path

for file_path in Path('src').glob('**/*.c'):
    print(file_path) # do whatever you need with these files

Update: Starting with Python 3.5, the same syntax is also supported by glob.glob().


回答 4

import os
import fnmatch


def recursive_glob(treeroot, pattern):
    results = []
    for base, dirs, files in os.walk(treeroot):
        goodfiles = fnmatch.filter(files, pattern)
        results.extend(os.path.join(base, f) for f in goodfiles)
    return results

fnmatch为您提供与完全相同的模式glob,因此对于glob.glob非常紧密的语义而言,这确实是一个很好的替代。迭代的版本(例如生成器),用IOW代替glob.iglob,是微不足道的改编(只是yield中间结果,而不是extend最后返回单个结果列表)。

import os
import fnmatch


def recursive_glob(treeroot, pattern):
    results = []
    for base, dirs, files in os.walk(treeroot):
        goodfiles = fnmatch.filter(files, pattern)
        results.extend(os.path.join(base, f) for f in goodfiles)
    return results

fnmatch gives you exactly the same patterns as glob, so this is really an excellent replacement for glob.glob with very close semantics. An iterative version (e.g. a generator), IOW a replacement for glob.iglob, is a trivial adaptation (just yield the intermediate results as you go, instead of extending a single results list to return at the end).


回答 5

对于python> = 3.5,可以使用**recursive=True

import glob
for x in glob.glob('path/**/*.c', recursive=True):
    print(x)

演示版


如果是递归的True,则模式** 将匹配任何文件以及零个或多个directoriessubdirectories。如果模式后跟一个os.sep,则仅目录和subdirectories匹配项。

For python >= 3.5 you can use **, recursive=True :

import glob
for x in glob.glob('path/**/*.c', recursive=True):
    print(x)

Demo


If recursive is True, the pattern ** will match any files and zero or more directories and subdirectories. If the pattern is followed by an os.sep, only directories and subdirectories match.


回答 6

您将要用来os.walk收集符合条件的文件名。例如:

import os
cfiles = []
for root, dirs, files in os.walk('src'):
  for file in files:
    if file.endswith('.c'):
      cfiles.append(os.path.join(root, file))

You’ll want to use os.walk to collect filenames that match your criteria. For example:

import os
cfiles = []
for root, dirs, files in os.walk('src'):
  for file in files:
    if file.endswith('.c'):
      cfiles.append(os.path.join(root, file))

回答 7

这是一个具有嵌套列表推导的解决方案,os.walk而不是简单的后缀匹配glob

import os
cfiles = [os.path.join(root, filename)
          for root, dirnames, filenames in os.walk('src')
          for filename in filenames if filename.endswith('.c')]

可以将其压缩为单线:

import os;cfiles=[os.path.join(r,f) for r,d,fs in os.walk('src') for f in fs if f.endswith('.c')]

或概括为一个函数:

import os

def recursive_glob(rootdir='.', suffix=''):
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames if filename.endswith(suffix)]

cfiles = recursive_glob('src', '.c')

如果您确实需要完整的glob样式模式,则可以遵循Alex和Bruno的示例并使用fnmatch

import fnmatch
import os

def recursive_glob(rootdir='.', pattern='*'):
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames
            if fnmatch.fnmatch(filename, pattern)]

cfiles = recursive_glob('src', '*.c')

Here’s a solution with nested list comprehensions, os.walk and simple suffix matching instead of glob:

import os
cfiles = [os.path.join(root, filename)
          for root, dirnames, filenames in os.walk('src')
          for filename in filenames if filename.endswith('.c')]

It can be compressed to a one-liner:

import os;cfiles=[os.path.join(r,f) for r,d,fs in os.walk('src') for f in fs if f.endswith('.c')]

or generalized as a function:

import os

def recursive_glob(rootdir='.', suffix=''):
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames if filename.endswith(suffix)]

cfiles = recursive_glob('src', '.c')

If you do need full glob style patterns, you can follow Alex’s and Bruno’s example and use fnmatch:

import fnmatch
import os

def recursive_glob(rootdir='.', pattern='*'):
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames
            if fnmatch.fnmatch(filename, pattern)]

cfiles = recursive_glob('src', '*.c')

回答 8

最近,我不得不恢复扩展名为.jpg的图片。我运行了photorec并恢复了4579个目录,其中220万个文件具有多种扩展名。使用以下脚本,我能够在几分钟内选择50133个具有.jpg扩展名的文件:

#!/usr/binenv python2.7

import glob
import shutil
import os

src_dir = "/home/mustafa/Masaüstü/yedek"
dst_dir = "/home/mustafa/Genel/media"
for mediafile in glob.iglob(os.path.join(src_dir, "*", "*.jpg")): #"*" is for subdirectory
    shutil.copy(mediafile, dst_dir)

Recently I had to recover my pictures with the extension .jpg. I ran photorec and recovered 4579 directories 2.2 million files within, having tremendous variety of extensions.With the script below I was able to select 50133 files havin .jpg extension within minutes:

#!/usr/binenv python2.7

import glob
import shutil
import os

src_dir = "/home/mustafa/Masaüstü/yedek"
dst_dir = "/home/mustafa/Genel/media"
for mediafile in glob.iglob(os.path.join(src_dir, "*", "*.jpg")): #"*" is for subdirectory
    shutil.copy(mediafile, dst_dir)

回答 9

考虑一下pathlib.rglob()

这就好比调用Path.glob()"**/"在给定的相对图案前面加:

import pathlib


for p in pathlib.Path("src").rglob("*.c"):
    print(p)

另请参阅@taleinat的相关文章和类似的文章其他地方。

Consider pathlib.rglob().

This is like calling Path.glob() with "**/" added in front of the given relative pattern:

import pathlib


for p in pathlib.Path("src").rglob("*.c"):
    print(p)

See also @taleinat’s related post here and a similar post elsewhere.


回答 10

Johan和Bruno针对上述最低要求提供了出色的解决方案。我刚刚发布了实现了Ant FileSet和Globs的Formic,它可以处理这种情况以及更复杂的情况。您的要求的实现是:

import formic
fileset = formic.FileSet(include="/src/**/*.c")
for file_name in fileset.qualified_files():
    print file_name

Johan and Bruno provide excellent solutions on the minimal requirement as stated. I have just released Formic which implements Ant FileSet and Globs which can handle this and more complicated scenarios. An implementation of your requirement is:

import formic
fileset = formic.FileSet(include="/src/**/*.c")
for file_name in fileset.qualified_files():
    print file_name

回答 11

基于其他答案,这是我当前的工作实现,它在根目录中检索嵌套的xml文件:

files = []
for root, dirnames, filenames in os.walk(myDir):
    files.extend(glob.glob(root + "/*.xml"))

我真的很喜欢python :)

based on other answers this is my current working implementation, which retrieves nested xml files in a root directory:

files = []
for root, dirnames, filenames in os.walk(myDir):
    files.extend(glob.glob(root + "/*.xml"))

I’m really having fun with python :)


回答 12

仅使用glob模块执行此操作的另一种方法。只需在rglob方法中添加一个起始基本目录和一个匹配模式即可,它将返回匹配文件名的列表。

import glob
import os

def _getDirs(base):
    return [x for x in glob.iglob(os.path.join( base, '*')) if os.path.isdir(x) ]

def rglob(base, pattern):
    list = []
    list.extend(glob.glob(os.path.join(base,pattern)))
    dirs = _getDirs(base)
    if len(dirs):
        for d in dirs:
            list.extend(rglob(os.path.join(base,d), pattern))
    return list

Another way to do it using just the glob module. Just seed the rglob method with a starting base directory and a pattern to match and it will return a list of matching file names.

import glob
import os

def _getDirs(base):
    return [x for x in glob.iglob(os.path.join( base, '*')) if os.path.isdir(x) ]

def rglob(base, pattern):
    list = []
    list.extend(glob.glob(os.path.join(base,pattern)))
    dirs = _getDirs(base)
    if len(dirs):
        for d in dirs:
            list.extend(rglob(os.path.join(base,d), pattern))
    return list

回答 13

或具有列表理解:

 >>> base = r"c:\User\xtofl"
 >>> binfiles = [ os.path.join(base,f) 
            for base, _, files in os.walk(root) 
            for f in files if f.endswith(".jpg") ] 

Or with a list comprehension:

 >>> base = r"c:\User\xtofl"
 >>> binfiles = [ os.path.join(base,f) 
            for base, _, files in os.walk(root) 
            for f in files if f.endswith(".jpg") ] 

回答 14

刚做这个..它将以分层方式打印文件和目录

但是我没有用过fnmatch或walk

#!/usr/bin/python

import os,glob,sys

def dirlist(path, c = 1):

        for i in glob.glob(os.path.join(path, "*")):
                if os.path.isfile(i):
                        filepath, filename = os.path.split(i)
                        print '----' *c + filename

                elif os.path.isdir(i):
                        dirname = os.path.basename(i)
                        print '----' *c + dirname
                        c+=1
                        dirlist(i,c)
                        c-=1


path = os.path.normpath(sys.argv[1])
print(os.path.basename(path))
dirlist(path)

Just made this.. it will print files and directory in hierarchical way

But I didn’t used fnmatch or walk

#!/usr/bin/python

import os,glob,sys

def dirlist(path, c = 1):

        for i in glob.glob(os.path.join(path, "*")):
                if os.path.isfile(i):
                        filepath, filename = os.path.split(i)
                        print '----' *c + filename

                elif os.path.isdir(i):
                        dirname = os.path.basename(i)
                        print '----' *c + dirname
                        c+=1
                        dirlist(i,c)
                        c-=1


path = os.path.normpath(sys.argv[1])
print(os.path.basename(path))
dirlist(path)

回答 15

那使用fnmatch或正则表达式:

import fnmatch, os

def filepaths(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            try:
                matched = pattern.match(basename)
            except AttributeError:
                matched = fnmatch.fnmatch(basename, pattern)
            if matched:
                yield os.path.join(root, basename)

# usage
if __name__ == '__main__':
    from pprint import pprint as pp
    import re
    path = r'/Users/hipertracker/app/myapp'
    pp([x for x in filepaths(path, re.compile(r'.*\.py$'))])
    pp([x for x in filepaths(path, '*.py')])

That one uses fnmatch or regular expression:

import fnmatch, os

def filepaths(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            try:
                matched = pattern.match(basename)
            except AttributeError:
                matched = fnmatch.fnmatch(basename, pattern)
            if matched:
                yield os.path.join(root, basename)

# usage
if __name__ == '__main__':
    from pprint import pprint as pp
    import re
    path = r'/Users/hipertracker/app/myapp'
    pp([x for x in filepaths(path, re.compile(r'.*\.py$'))])
    pp([x for x in filepaths(path, '*.py')])

回答 16

除了建议的答案,您还可以通过一些懒惰的生成和列表理解魔术来做到这一点:

import os, glob, itertools

results = itertools.chain.from_iterable(glob.iglob(os.path.join(root,'*.c'))
                                               for root, dirs, files in os.walk('src'))

for f in results: print(f)

除了适合一行并且避免在内存中使用不必要的列表之外,这还具有很好的副作用,即您可以以类似于**运算符的方式使用它,例如,可以使用os.path.join(root, 'some/path/*.c')它来获取所有.c文件。具有此结构的src子目录。

In addition to the suggested answers, you can do this with some lazy generation and list comprehension magic:

import os, glob, itertools

results = itertools.chain.from_iterable(glob.iglob(os.path.join(root,'*.c'))
                                               for root, dirs, files in os.walk('src'))

for f in results: print(f)

Besides fitting in one line and avoiding unnecessary lists in memory, this also has the nice side effect, that you can use it in a way similar to the ** operator, e.g., you could use os.path.join(root, 'some/path/*.c') in order to get all .c files in all sub directories of src that have this structure.


回答 17

对于python 3.5及更高版本

import glob

#file_names_array = glob.glob('path/*.c', recursive=True)
#above works for files directly at path/ as guided by NeStack

#updated version
file_names_array = glob.glob('path/**/*.c', recursive=True)

您可能还需要

for full_path_in_src in  file_names_array:
    print (full_path_in_src ) # be like 'abc/xyz.c'
    #Full system path of this would be like => 'path till src/abc/xyz.c'

For python 3.5 and later

import glob

#file_names_array = glob.glob('path/*.c', recursive=True)
#above works for files directly at path/ as guided by NeStack

#updated version
file_names_array = glob.glob('path/**/*.c', recursive=True)

further you might need

for full_path_in_src in  file_names_array:
    print (full_path_in_src ) # be like 'abc/xyz.c'
    #Full system path of this would be like => 'path till src/abc/xyz.c'

回答 18

这是Python 2.7上的有效代码。作为我的devops工作的一部分,我需要编写一个脚本,该脚本会将标有live-appName.properties的配置文件移动到appName.properties。可能还有其他扩展文件,例如live-appName.xml。

以下是用于此目的的工作代码,该代码在给定目录(嵌套级别)中查找文件,然后将其重命名(移动)为所需的文件名

def flipProperties(searchDir):
   print "Flipping properties to point to live DB"
   for root, dirnames, filenames in os.walk(searchDir):
      for filename in fnmatch.filter(filenames, 'live-*.*'):
        targetFileName = os.path.join(root, filename.split("live-")[1])
        print "File "+ os.path.join(root, filename) + "will be moved to " + targetFileName
        shutil.move(os.path.join(root, filename), targetFileName)

从主脚本调用此函数

flipProperties(searchDir)

希望这可以帮助遇到类似问题的人。

This is a working code on Python 2.7. As part of my devops work, I was required to write a script which would move the config files marked with live-appName.properties to appName.properties. There could be other extension files as well like live-appName.xml.

Below is a working code for this, which finds the files in the given directories (nested level) and then renames (moves) it to the required filename

def flipProperties(searchDir):
   print "Flipping properties to point to live DB"
   for root, dirnames, filenames in os.walk(searchDir):
      for filename in fnmatch.filter(filenames, 'live-*.*'):
        targetFileName = os.path.join(root, filename.split("live-")[1])
        print "File "+ os.path.join(root, filename) + "will be moved to " + targetFileName
        shutil.move(os.path.join(root, filename), targetFileName)

This function is called from a main script

flipProperties(searchDir)

Hope this helps someone struggling with similar issues.


回答 19

Johan Dahlin答案的简化版本,不带fnmatch

import os

matches = []
for root, dirnames, filenames in os.walk('src'):
  matches += [os.path.join(root, f) for f in filenames if f[-2:] == '.c']

Simplified version of Johan Dahlin’s answer, without fnmatch.

import os

matches = []
for root, dirnames, filenames in os.walk('src'):
  matches += [os.path.join(root, f) for f in filenames if f[-2:] == '.c']

回答 20

这是我的使用列表推导的解决方案在目录和所有子目录中递归搜索多个文件扩展名的解决方案:

import os, glob

def _globrec(path, *exts):
""" Glob recursively a directory and all subdirectories for multiple file extensions 
    Note: Glob is case-insensitive, i. e. for '\*.jpg' you will get files ending
    with .jpg and .JPG

    Parameters
    ----------
    path : str
        A directory name
    exts : tuple
        File extensions to glob for

    Returns
    -------
    files : list
        list of files matching extensions in exts in path and subfolders

    """
    dirs = [a[0] for a in os.walk(path)]
    f_filter = [d+e for d in dirs for e in exts]    
    return [f for files in [glob.iglob(files) for files in f_filter] for f in files]

my_pictures = _globrec(r'C:\Temp', '\*.jpg','\*.bmp','\*.png','\*.gif')
for f in my_pictures:
    print f

Here is my solution using list comprehension to search for multiple file extensions recursively in a directory and all subdirectories:

import os, glob

def _globrec(path, *exts):
""" Glob recursively a directory and all subdirectories for multiple file extensions 
    Note: Glob is case-insensitive, i. e. for '\*.jpg' you will get files ending
    with .jpg and .JPG

    Parameters
    ----------
    path : str
        A directory name
    exts : tuple
        File extensions to glob for

    Returns
    -------
    files : list
        list of files matching extensions in exts in path and subfolders

    """
    dirs = [a[0] for a in os.walk(path)]
    f_filter = [d+e for d in dirs for e in exts]    
    return [f for files in [glob.iglob(files) for files in f_filter] for f in files]

my_pictures = _globrec(r'C:\Temp', '\*.jpg','\*.bmp','\*.png','\*.gif')
for f in my_pictures:
    print f

回答 21

import sys, os, glob

dir_list = ["c:\\books\\heap"]

while len(dir_list) > 0:
    cur_dir = dir_list[0]
    del dir_list[0]
    list_of_files = glob.glob(cur_dir+'\\*')
    for book in list_of_files:
        if os.path.isfile(book):
            print(book)
        else:
            dir_list.append(book)
import sys, os, glob

dir_list = ["c:\\books\\heap"]

while len(dir_list) > 0:
    cur_dir = dir_list[0]
    del dir_list[0]
    list_of_files = glob.glob(cur_dir+'\\*')
    for book in list_of_files:
        if os.path.isfile(book):
            print(book)
        else:
            dir_list.append(book)

回答 22

我修改了此发布中的最佳答案..并最近创建了此脚本,该脚本将遍历给定目录(searchdir)中的所有文件及其下的子目录…并打印文件名,rootdir,修改/创建日期和尺寸。

希望这对某人有帮助…他们可以遍历目录并获取fileinfo。

import time
import fnmatch
import os

def fileinfo(file):
    filename = os.path.basename(file)
    rootdir = os.path.dirname(file)
    lastmod = time.ctime(os.path.getmtime(file))
    creation = time.ctime(os.path.getctime(file))
    filesize = os.path.getsize(file)

    print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize)

searchdir = r'D:\Your\Directory\Root'
matches = []

for root, dirnames, filenames in os.walk(searchdir):
    ##  for filename in fnmatch.filter(filenames, '*.c'):
    for filename in filenames:
        ##      matches.append(os.path.join(root, filename))
        ##print matches
        fileinfo(os.path.join(root, filename))

I modified the top answer in this posting.. and recently created this script which will loop through all files in a given directory (searchdir) and the sub-directories under it… and prints filename, rootdir, modified/creation date, and size.

Hope this helps someone… and they can walk the directory and get fileinfo.

import time
import fnmatch
import os

def fileinfo(file):
    filename = os.path.basename(file)
    rootdir = os.path.dirname(file)
    lastmod = time.ctime(os.path.getmtime(file))
    creation = time.ctime(os.path.getctime(file))
    filesize = os.path.getsize(file)

    print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize)

searchdir = r'D:\Your\Directory\Root'
matches = []

for root, dirnames, filenames in os.walk(searchdir):
    ##  for filename in fnmatch.filter(filenames, '*.c'):
    for filename in filenames:
        ##      matches.append(os.path.join(root, filename))
        ##print matches
        fileinfo(os.path.join(root, filename))

回答 23

这是一个将模式与完整路径而不只是基本文件名匹配的解决方案。

它用于fnmatch.translate将glob样式的模式转换为正则表达式,然后将其与在遍历目录时发现的每个文件的完整路径进行匹配。

re.IGNORECASE是可选的,但在Windows上是理想的,因为文件系统本身不区分大小写。(我没有费心编译正则表达式,因为文档表明它应该在内部缓存。)

import fnmatch
import os
import re

def findfiles(dir, pattern):
    patternregex = fnmatch.translate(pattern)
    for root, dirs, files in os.walk(dir):
        for basename in files:
            filename = os.path.join(root, basename)
            if re.search(patternregex, filename, re.IGNORECASE):
                yield filename

Here is a solution that will match the pattern against the full path and not just the base filename.

It uses fnmatch.translate to convert a glob-style pattern into a regular expression, which is then matched against the full path of each file found while walking the directory.

re.IGNORECASE is optional, but desirable on Windows since the file system itself is not case-sensitive. (I didn’t bother compiling the regex because docs indicate it should be cached internally.)

import fnmatch
import os
import re

def findfiles(dir, pattern):
    patternregex = fnmatch.translate(pattern)
    for root, dirs, files in os.walk(dir):
        for basename in files:
            filename = os.path.join(root, basename)
            if re.search(patternregex, filename, re.IGNORECASE):
                yield filename

回答 24

我需要一个解决方案的Python 2.x中,工程上大的目录。
我结束了这一点:

import subprocess
foundfiles= subprocess.check_output("ls src/*.c src/**/*.c", shell=True)
for foundfile in foundfiles.splitlines():
    print foundfile

请注意,如果ls找不到任何匹配文件,您可能需要一些异常处理。

I needed a solution for python 2.x that works fast on large directories.
I endet up with this:

import subprocess
foundfiles= subprocess.check_output("ls src/*.c src/**/*.c", shell=True)
for foundfile in foundfiles.splitlines():
    print foundfile

Note that you might need some exception handling in case ls doesn’t find any matching file.