标签归档:directory-structure

在python中列出目录树结构?

问题:在python中列出目录树结构?

我知道我们可以os.walk()用来列出目录中的所有子目录或所有文件。但是,我想列出完整的目录树内容:

- Subdirectory 1:
   - file11
   - file12
   - Sub-sub-directory 11:
         - file111
         - file112
- Subdirectory 2:
    - file21
    - sub-sub-directory 21
    - sub-sub-directory 22    
        - sub-sub-sub-directory 221
            - file 2211

如何在Python中最好地实现这一目标?

I know that we can use os.walk() to list all sub-directories or all files in a directory. However, I would like to list the full directory tree content:

- Subdirectory 1:
   - file11
   - file12
   - Sub-sub-directory 11:
         - file111
         - file112
- Subdirectory 2:
    - file21
    - sub-sub-directory 21
    - sub-sub-directory 22    
        - sub-sub-sub-directory 221
            - file 2211

How to best achieve this in Python?


回答 0

这是一个用于格式化的函数:

import os

def list_files(startpath):
    for root, dirs, files in os.walk(startpath):
        level = root.replace(startpath, '').count(os.sep)
        indent = ' ' * 4 * (level)
        print('{}{}/'.format(indent, os.path.basename(root)))
        subindent = ' ' * 4 * (level + 1)
        for f in files:
            print('{}{}'.format(subindent, f))

Here’s a function to do that with formatting:

import os

def list_files(startpath):
    for root, dirs, files in os.walk(startpath):
        level = root.replace(startpath, '').count(os.sep)
        indent = ' ' * 4 * (level)
        print('{}{}/'.format(indent, os.path.basename(root)))
        subindent = ' ' * 4 * (level + 1)
        for f in files:
            print('{}{}'.format(subindent, f))

回答 1

与上述答案类似,但对于python3来说,可以说是可读性和可扩展性的:

from pathlib import Path

class DisplayablePath(object):
    display_filename_prefix_middle = '├──'
    display_filename_prefix_last = '└──'
    display_parent_prefix_middle = '    '
    display_parent_prefix_last = '│   '

    def __init__(self, path, parent_path, is_last):
        self.path = Path(str(path))
        self.parent = parent_path
        self.is_last = is_last
        if self.parent:
            self.depth = self.parent.depth + 1
        else:
            self.depth = 0

    @property
    def displayname(self):
        if self.path.is_dir():
            return self.path.name + '/'
        return self.path.name

    @classmethod
    def make_tree(cls, root, parent=None, is_last=False, criteria=None):
        root = Path(str(root))
        criteria = criteria or cls._default_criteria

        displayable_root = cls(root, parent, is_last)
        yield displayable_root

        children = sorted(list(path
                               for path in root.iterdir()
                               if criteria(path)),
                          key=lambda s: str(s).lower())
        count = 1
        for path in children:
            is_last = count == len(children)
            if path.is_dir():
                yield from cls.make_tree(path,
                                         parent=displayable_root,
                                         is_last=is_last,
                                         criteria=criteria)
            else:
                yield cls(path, displayable_root, is_last)
            count += 1

    @classmethod
    def _default_criteria(cls, path):
        return True

    @property
    def displayname(self):
        if self.path.is_dir():
            return self.path.name + '/'
        return self.path.name

    def displayable(self):
        if self.parent is None:
            return self.displayname

        _filename_prefix = (self.display_filename_prefix_last
                            if self.is_last
                            else self.display_filename_prefix_middle)

        parts = ['{!s} {!s}'.format(_filename_prefix,
                                    self.displayname)]

        parent = self.parent
        while parent and parent.parent is not None:
            parts.append(self.display_parent_prefix_middle
                         if parent.is_last
                         else self.display_parent_prefix_last)
            parent = parent.parent

        return ''.join(reversed(parts))

用法示例:

paths = DisplayablePath.make_tree(Path('doc'))
for path in paths:
    print(path.displayable())

输出示例:

doc/
├── _static/
   ├── embedded/
      ├── deep_file
      └── very/
          └── deep/
              └── folder/
                  └── very_deep_file
   └── less_deep_file
├── about.rst
├── conf.py
└── index.rst

笔记

  • 这使用递归。它将在非常深的文件夹树上引发RecursionError
  • 懒惰地评估树。它在非常宽的文件夹树上应该表现良好。但是,给定文件夹的直系子代不会被延迟计算。

编辑:

  • 增加了奖励!用于过滤路径的条件回调。

Similar to answers above, but for python3, arguably readable and arguably extensible:

from pathlib import Path

class DisplayablePath(object):
    display_filename_prefix_middle = '├──'
    display_filename_prefix_last = '└──'
    display_parent_prefix_middle = '    '
    display_parent_prefix_last = '│   '

    def __init__(self, path, parent_path, is_last):
        self.path = Path(str(path))
        self.parent = parent_path
        self.is_last = is_last
        if self.parent:
            self.depth = self.parent.depth + 1
        else:
            self.depth = 0

    @property
    def displayname(self):
        if self.path.is_dir():
            return self.path.name + '/'
        return self.path.name

    @classmethod
    def make_tree(cls, root, parent=None, is_last=False, criteria=None):
        root = Path(str(root))
        criteria = criteria or cls._default_criteria

        displayable_root = cls(root, parent, is_last)
        yield displayable_root

        children = sorted(list(path
                               for path in root.iterdir()
                               if criteria(path)),
                          key=lambda s: str(s).lower())
        count = 1
        for path in children:
            is_last = count == len(children)
            if path.is_dir():
                yield from cls.make_tree(path,
                                         parent=displayable_root,
                                         is_last=is_last,
                                         criteria=criteria)
            else:
                yield cls(path, displayable_root, is_last)
            count += 1

    @classmethod
    def _default_criteria(cls, path):
        return True

    @property
    def displayname(self):
        if self.path.is_dir():
            return self.path.name + '/'
        return self.path.name

    def displayable(self):
        if self.parent is None:
            return self.displayname

        _filename_prefix = (self.display_filename_prefix_last
                            if self.is_last
                            else self.display_filename_prefix_middle)

        parts = ['{!s} {!s}'.format(_filename_prefix,
                                    self.displayname)]

        parent = self.parent
        while parent and parent.parent is not None:
            parts.append(self.display_parent_prefix_middle
                         if parent.is_last
                         else self.display_parent_prefix_last)
            parent = parent.parent

        return ''.join(reversed(parts))

Example usage:

paths = DisplayablePath.make_tree(Path('doc'))
for path in paths:
    print(path.displayable())

Example output:

doc/
├── _static/
│   ├── embedded/
│   │   ├── deep_file
│   │   └── very/
│   │       └── deep/
│   │           └── folder/
│   │               └── very_deep_file
│   └── less_deep_file
├── about.rst
├── conf.py
└── index.rst

Notes

  • This uses recursion. It will raise a RecursionError on really deep folder trees
  • The tree is lazily evaluated. It should behave well on really wide folder trees. Immediate children of a given folder are not lazily evaluated, though.

Edit:

  • Added bonus! criteria callback for filtering paths.

回答 2

没有缩进的解决方案:

for path, dirs, files in os.walk(given_path):
  print path
  for f in files:
    print f

os.walk已经完成了您要寻找的自上而下,深度优先的步行。

忽略Dirs列表可避免您提到的重叠。

A solution without your indentation:

for path, dirs, files in os.walk(given_path):
  print path
  for f in files:
    print f

os.walk already does the top-down, depth-first walk you are looking for.

Ignoring the dirs list prevents the overlapping you mention.


回答 3

列出Python中的目录树结构?

我们通常更喜欢只使用GNU树,但是我们并不总是tree在每个系统上都有,有时Python 3可用。一个很好的答案很容易复制粘贴,而不是tree要求GNU 。

tree的输出如下所示:

$ tree
.
├── package
   ├── __init__.py
   ├── __main__.py
   ├── subpackage
      ├── __init__.py
      ├── __main__.py
      └── module.py
   └── subpackage2
       ├── __init__.py
       ├── __main__.py
       └── module2.py
└── package2
    └── __init__.py

4 directories, 9 files

我在我的主目录下创建了上述目录结构pyscratch

我在这里还看到了其他类似的答案,可以实现这种输出,但是我认为我们可以通过使用更简单,更现代的代码并懒惰地评估这些方法来做得更好。

Python中的树

首先,让我们使用一个示例

  • 使用Python 3 Path对象
  • 使用yieldand yield from表达式(创建一个生成器函数)
  • 使用递归实现优雅的简洁性
  • 使用注释和一些类型注释以更加清晰
from pathlib import Path

# prefix components:
space =  '    '
branch = '│   '
# pointers:
tee =    '├── '
last =   '└── '


def tree(dir_path: Path, prefix: str=''):
    """A recursive generator, given a directory Path object
    will yield a visual tree structure line by line
    with each line prefixed by the same characters
    """    
    contents = list(dir_path.iterdir())
    # contents each get pointers that are ├── with a final └── :
    pointers = [tee] * (len(contents) - 1) + [last]
    for pointer, path in zip(pointers, contents):
        yield prefix + pointer + path.name
        if path.is_dir(): # extend the prefix and recurse:
            extension = branch if pointer == tee else space 
            # i.e. space because last, └── , above so no more |
            yield from tree(path, prefix=prefix+extension)

现在:

for line in tree(Path.home() / 'pyscratch'):
    print(line)

印刷品:

├── package
   ├── __init__.py
   ├── __main__.py
   ├── subpackage
      ├── __init__.py
      ├── __main__.py
      └── module.py
   └── subpackage2
       ├── __init__.py
       ├── __main__.py
       └── module2.py
└── package2
    └── __init__.py

我们确实需要将每个目录具体化为一个列表,因为我们需要知道目录的长度,但是之后我们将列表丢弃。对于深度和广泛的递归,这应该足够懒惰。

上面的代码,加上注释,应该足以完全理解我们在这里所做的事情,但是如果需要,可以随时使用调试器逐步调试它,以便更好地了解它。

更多功能

现在,GNU tree为我们提供了一些我想使用此功能的有用功能:

  • 首先打印主题目录名称(自动打印,我们不打印)
  • 打印计数 n directories, m files
  • 限制递归的选项, -L level
  • 仅限于目录的选项, -d

此外,当有一棵大树时,限制迭代次数(例如使用islice)是有用的,以避免用文本锁定您的解释器,因为有时输出变得太冗长而无用。我们可以默认将其设置为任意高-例如1000

因此,让我们删除先前的注释并填写此功能:

from pathlib import Path
from itertools import islice

space =  '    '
branch = '│   '
tee =    '├── '
last =   '└── '
def tree(dir_path: Path, level: int=-1, limit_to_directories: bool=False,
         length_limit: int=1000):
    """Given a directory Path object print a visual tree structure"""
    dir_path = Path(dir_path) # accept string coerceable to Path
    files = 0
    directories = 0
    def inner(dir_path: Path, prefix: str='', level=-1):
        nonlocal files, directories
        if not level: 
            return # 0, stop iterating
        if limit_to_directories:
            contents = [d for d in dir_path.iterdir() if d.is_dir()]
        else: 
            contents = list(dir_path.iterdir())
        pointers = [tee] * (len(contents) - 1) + [last]
        for pointer, path in zip(pointers, contents):
            if path.is_dir():
                yield prefix + pointer + path.name
                directories += 1
                extension = branch if pointer == tee else space 
                yield from inner(path, prefix=prefix+extension, level=level-1)
            elif not limit_to_directories:
                yield prefix + pointer + path.name
                files += 1
    print(dir_path.name)
    iterator = inner(dir_path, level=level)
    for line in islice(iterator, length_limit):
        print(line)
    if next(iterator, None):
        print(f'... length_limit, {length_limit}, reached, counted:')
    print(f'\n{directories} directories' + (f', {files} files' if files else ''))

现在我们可以得到与以下相同的输出tree

tree(Path.home() / 'pyscratch')

印刷品:

pyscratch
├── package
   ├── __init__.py
   ├── __main__.py
   ├── subpackage
      ├── __init__.py
      ├── __main__.py
      └── module.py
   └── subpackage2
       ├── __init__.py
       ├── __main__.py
       └── module2.py
└── package2
    └── __init__.py

4 directories, 9 files

我们可以限制级别:

tree(Path.home() / 'pyscratch', level=2)

印刷品:

pyscratch
├── package
   ├── __init__.py
   ├── __main__.py
   ├── subpackage
   └── subpackage2
└── package2
    └── __init__.py

4 directories, 3 files

我们可以将输出限制为目录:

tree(Path.home() / 'pyscratch', level=2, limit_to_directories=True)

印刷品:

pyscratch
├── package
   ├── subpackage
   └── subpackage2
└── package2

4 directories

回顾性

回想起来,我们本可以用于path.glob匹配。我们也许还可以path.rglob用于递归遍历,但这将需要重写。我们也可以使用itertools.tee而不是具体化目录内容列表,但这可能会有负面的折衷,并且可能会使代码变得更加复杂。

欢迎发表评论!

List directory tree structure in Python?

We usually prefer to just use GNU tree, but we don’t always have tree on every system, and sometimes Python 3 is available. A good answer here could be easily copy-pasted and not make GNU tree a requirement.

tree‘s output looks like this:

$ tree
.
├── package
│   ├── __init__.py
│   ├── __main__.py
│   ├── subpackage
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   └── module.py
│   └── subpackage2
│       ├── __init__.py
│       ├── __main__.py
│       └── module2.py
└── package2
    └── __init__.py

4 directories, 9 files

I created the above directory structure in my home directory under a directory I call pyscratch.

I also see other answers here that approach that sort of output, but I think we can do better, with simpler, more modern code and lazily evaluating approaches.

Tree in Python

To begin with, let’s use an example that

  • uses the Python 3 Path object
  • uses the yield and yield from expressions (that create a generator function)
  • uses recursion for elegant simplicity
  • uses comments and some type annotations for extra clarity
from pathlib import Path

# prefix components:
space =  '    '
branch = '│   '
# pointers:
tee =    '├── '
last =   '└── '


def tree(dir_path: Path, prefix: str=''):
    """A recursive generator, given a directory Path object
    will yield a visual tree structure line by line
    with each line prefixed by the same characters
    """    
    contents = list(dir_path.iterdir())
    # contents each get pointers that are ├── with a final └── :
    pointers = [tee] * (len(contents) - 1) + [last]
    for pointer, path in zip(pointers, contents):
        yield prefix + pointer + path.name
        if path.is_dir(): # extend the prefix and recurse:
            extension = branch if pointer == tee else space 
            # i.e. space because last, └── , above so no more |
            yield from tree(path, prefix=prefix+extension)

and now:

for line in tree(Path.home() / 'pyscratch'):
    print(line)

prints:

├── package
│   ├── __init__.py
│   ├── __main__.py
│   ├── subpackage
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   └── module.py
│   └── subpackage2
│       ├── __init__.py
│       ├── __main__.py
│       └── module2.py
└── package2
    └── __init__.py

We do need to materialize each directory into a list because we need to know how long it is, but afterwards we throw the list away. For deep and broad recursion this should be lazy enough.

The above code, with the comments, should be sufficient to fully understand what we’re doing here, but feel free to step through it with a debugger to better grock it if you need to.

More features

Now GNU tree gives us a couple of useful features that I’d like to have with this function:

  • prints the subject directory name first (does so automatically, ours does not)
  • prints the count of n directories, m files
  • option to limit recursion, -L level
  • option to limit to just directories, -d

Also, when there is a huge tree, it is useful to limit the iteration (e.g. with islice) to avoid locking up your interpreter with text, as at some point the output becomes too verbose to be useful. We can make this arbitrarily high by default – say 1000.

So let’s remove the previous comments and fill out this functionality:

from pathlib import Path
from itertools import islice

space =  '    '
branch = '│   '
tee =    '├── '
last =   '└── '
def tree(dir_path: Path, level: int=-1, limit_to_directories: bool=False,
         length_limit: int=1000):
    """Given a directory Path object print a visual tree structure"""
    dir_path = Path(dir_path) # accept string coerceable to Path
    files = 0
    directories = 0
    def inner(dir_path: Path, prefix: str='', level=-1):
        nonlocal files, directories
        if not level: 
            return # 0, stop iterating
        if limit_to_directories:
            contents = [d for d in dir_path.iterdir() if d.is_dir()]
        else: 
            contents = list(dir_path.iterdir())
        pointers = [tee] * (len(contents) - 1) + [last]
        for pointer, path in zip(pointers, contents):
            if path.is_dir():
                yield prefix + pointer + path.name
                directories += 1
                extension = branch if pointer == tee else space 
                yield from inner(path, prefix=prefix+extension, level=level-1)
            elif not limit_to_directories:
                yield prefix + pointer + path.name
                files += 1
    print(dir_path.name)
    iterator = inner(dir_path, level=level)
    for line in islice(iterator, length_limit):
        print(line)
    if next(iterator, None):
        print(f'... length_limit, {length_limit}, reached, counted:')
    print(f'\n{directories} directories' + (f', {files} files' if files else ''))

And now we can get the same sort of output as tree:

tree(Path.home() / 'pyscratch')

prints:

pyscratch
├── package
│   ├── __init__.py
│   ├── __main__.py
│   ├── subpackage
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   └── module.py
│   └── subpackage2
│       ├── __init__.py
│       ├── __main__.py
│       └── module2.py
└── package2
    └── __init__.py

4 directories, 9 files

And we can restrict to levels:

tree(Path.home() / 'pyscratch', level=2)

prints:

pyscratch
├── package
│   ├── __init__.py
│   ├── __main__.py
│   ├── subpackage
│   └── subpackage2
└── package2
    └── __init__.py

4 directories, 3 files

And we can limit the output to directories:

tree(Path.home() / 'pyscratch', level=2, limit_to_directories=True)

prints:

pyscratch
├── package
│   ├── subpackage
│   └── subpackage2
└── package2

4 directories

Retrospective

In retrospect, we could have used path.glob for matching. We could also perhaps use path.rglob for recursive globbing, but that would require a rewrite. We could also use itertools.tee instead of materializing a list of directory contents, but that could have negative tradeoffs and would probably make the code even more complex.

Comments are welcome!


回答 4

我来这里寻找相同的东西,并用dhobbs回答我。作为感谢社区的一种方式,按照akshay的要求,我添加了一些参数来写入文件,并使显示文件成为可选文件,因此它并不是输出。还使缩进成为可选参数,以便您可以更改它,例如有些人喜欢将其设为2,而另一些人喜欢采用4。

使用了不同的循环,因此不显示文件的循环不会在每次迭代中检查是否必须这样做。

希望它能对其他人有所帮助,因为dhobbs的回答对我有所帮助。非常感谢。

def showFolderTree(path,show_files=False,indentation=2,file_output=False):
"""
Shows the content of a folder in a tree structure.
path -(string)- path of the root folder we want to show.
show_files -(boolean)-  Whether or not we want to see files listed.
                        Defaults to False.
indentation -(int)- Indentation we want to use, defaults to 2.   
file_output -(string)-  Path (including the name) of the file where we want
                        to save the tree.
"""


tree = []

if not show_files:
    for root, dirs, files in os.walk(path):
        level = root.replace(path, '').count(os.sep)
        indent = ' '*indentation*(level)
        tree.append('{}{}/'.format(indent,os.path.basename(root)))

if show_files:
    for root, dirs, files in os.walk(path):
        level = root.replace(path, '').count(os.sep)
        indent = ' '*indentation*(level)
        tree.append('{}{}/'.format(indent,os.path.basename(root)))    
        for f in files:
            subindent=' ' * indentation * (level+1)
            tree.append('{}{}'.format(subindent,f))

if file_output:
    output_file = open(file_output,'w')
    for line in tree:
        output_file.write(line)
        output_file.write('\n')
else:
    # Default behaviour: print on screen.
    for line in tree:
        print line

I came here looking for the same thing and used dhobbs answer for me. As a way of thanking the community, I added some arguments to write to a file, as akshay asked, and made showing files optional so it is not so bit an output. Also made the indentation an optional argument so you can change it, as some like it to be 2 and others prefer 4.

Used different loops so the one not showing files doesn’t check if it has to on each iteration.

Hope it helps someone else as dhobbs answer helped me. Thanks a lot.

def showFolderTree(path,show_files=False,indentation=2,file_output=False):
"""
Shows the content of a folder in a tree structure.
path -(string)- path of the root folder we want to show.
show_files -(boolean)-  Whether or not we want to see files listed.
                        Defaults to False.
indentation -(int)- Indentation we want to use, defaults to 2.   
file_output -(string)-  Path (including the name) of the file where we want
                        to save the tree.
"""


tree = []

if not show_files:
    for root, dirs, files in os.walk(path):
        level = root.replace(path, '').count(os.sep)
        indent = ' '*indentation*(level)
        tree.append('{}{}/'.format(indent,os.path.basename(root)))

if show_files:
    for root, dirs, files in os.walk(path):
        level = root.replace(path, '').count(os.sep)
        indent = ' '*indentation*(level)
        tree.append('{}{}/'.format(indent,os.path.basename(root)))    
        for f in files:
            subindent=' ' * indentation * (level+1)
            tree.append('{}{}'.format(subindent,f))

if file_output:
    output_file = open(file_output,'w')
    for line in tree:
        output_file.write(line)
        output_file.write('\n')
else:
    # Default behaviour: print on screen.
    for line in tree:
        print line

回答 5

基于这个很棒的帖子

http://code.activestate.com/recipes/217212-treepy-graphically-displays-the-directory-structur/

这是一种行为,完全像

http://linux.die.net/man/1/tree

#!/ usr / bin / env python2 #-*-编码:utf-8-*-


#tree.py #作者:Doug DAHMS #打印的树结构的命令行上指定的路径





OS 进口listdir同时九月
 操作系统路径导入abspath basename 来自sys import argv的isdir


def dir padding print_files = False isLast = False isFirst = False ):如果isFirst 打印填充解码'utf8' )[:- 1 ]。编码'utf8' + 目录
     其他如果isLast 打印填充解码'utf8' )[:- 1 ]。
    
         
        
            编码'utf8' + '└──' + 基本名称abspath dir ))else 打印填充解码'utf8' )[:- 1 ]。编码'utf8' + '├──' + 基本名称abspath dir ))
    文件= [],如果print_files 
        文件= listdir dir 否则   
        
                
    
    
        文件= [ X X listdir同时DIR 如果ISDIR DIR + + X )] 如果isFirst 
        填充= 填充+ '' 
    文件= 排序文件= 拉姆达小号小号降低())
    计数= 0 
    最后= len   
       文件- 1 文件枚举文件):
        计数+ = 1 
        路径= DIR + + 文件  
     
        isLast = i == 最后一次,
         如果isdir path ):if count == len files ):if isFirst 
                    tree path padding print_files isLast False else 
                    tree path padding + '' print_files isLast False 
            
                 
                  
            else:
                tree(path, padding + '│', print_files, isLast, False)
        else:
            if isLast:
                print padding + '└── ' + file
            else:
                print padding + '├── ' + file

def usage():
    return '''Usage: %s [-f] 
Print tree structure of path specified.
Options:
-f      Print files as well as directories
PATH    Path to process''' % basename(argv[0])

def main():
    if len(argv) == 1:
        print usage()
    elif len(argv) == 2:
        # print just directories
        path = argv[1]
        if isdir(path):
            tree(path, '', False, False, True)
        else:
            print 'ERROR: \'' + path + '\' is not a directory'
    elif len(argv) == 3 and argv[1] == '-f':
        # print directories and files
        path = argv[2]
        if isdir(path):
            tree(path, '', True, False, True)
        else:
            print 'ERROR: \'' + path + '\'不是目录' else 打印用法()
    
        

如果__name__ == ' __main__ ' 
    main () 

Based on this fantastic post

http://code.activestate.com/recipes/217212-treepy-graphically-displays-the-directory-structur/

Here es a refinement to behave exactly like

http://linux.die.net/man/1/tree

#!/usr/bin/env python2
# -*- coding: utf-8 -*-

# tree.py
#
# Written by Doug Dahms
#
# Prints the tree structure for the path specified on the command line

from os import listdir, sep
from os.path import abspath, basename, isdir
from sys import argv

def tree(dir, padding, print_files=False, isLast=False, isFirst=False):
    if isFirst:
        print padding.decode('utf8')[:-1].encode('utf8') + dir
    else:
        if isLast:
            print padding.decode('utf8')[:-1].encode('utf8') + '└── ' + basename(abspath(dir))
        else:
            print padding.decode('utf8')[:-1].encode('utf8') + '├── ' + basename(abspath(dir))
    files = []
    if print_files:
        files = listdir(dir)
    else:
        files = [x for x in listdir(dir) if isdir(dir + sep + x)]
    if not isFirst:
        padding = padding + '   '
    files = sorted(files, key=lambda s: s.lower())
    count = 0
    last = len(files) - 1
    for i, file in enumerate(files):
        count += 1
        path = dir + sep + file
        isLast = i == last
        if isdir(path):
            if count == len(files):
                if isFirst:
                    tree(path, padding, print_files, isLast, False)
                else:
                    tree(path, padding + ' ', print_files, isLast, False)
            else:
                tree(path, padding + '│', print_files, isLast, False)
        else:
            if isLast:
                print padding + '└── ' + file
            else:
                print padding + '├── ' + file

def usage():
    return '''Usage: %s [-f] 
Print tree structure of path specified.
Options:
-f      Print files as well as directories
PATH    Path to process''' % basename(argv[0])

def main():
    if len(argv) == 1:
        print usage()
    elif len(argv) == 2:
        # print just directories
        path = argv[1]
        if isdir(path):
            tree(path, '', False, False, True)
        else:
            print 'ERROR: \'' + path + '\' is not a directory'
    elif len(argv) == 3 and argv[1] == '-f':
        # print directories and files
        path = argv[2]
        if isdir(path):
            tree(path, '', True, False, True)
        else:
            print 'ERROR: \'' + path + '\' is not a directory'
    else:
        print usage()

if __name__ == '__main__':
    main()



回答 6

import os

def fs_tree_to_dict(path_):
    file_token = ''
    for root, dirs, files in os.walk(path_):
        tree = {d: fs_tree_to_dict(os.path.join(root, d)) for d in dirs}
        tree.update({f: file_token for f in files})
        return tree  # note we discontinue iteration trough os.walk

如果有人感兴趣,则该递归函数将返回字典的嵌套结构。键是file system(目录和文件的)名称,值是:

  • 目录子词典
  • 文件字符串(请参阅file_token

在此示例中,指定文件的字符串为空。例如,还可以为它们提供文件内容,其所有者信息或特权或与dict不同的任何对象。除非它是字典,否则在后续操作中可以很容易地将其与“目录类型”区分开。

在文件系统中具有这样的树:

# bash:
$ tree /tmp/ex
/tmp/ex
├── d_a
   ├── d_a_a
   ├── d_a_b
      └── f1.txt
   ├── d_a_c
   └── fa.txt
├── d_b
   ├── fb1.txt
   └── fb2.txt
└── d_c

结果将是:

# python 2 or 3:
>>> fs_tree_to_dict("/tmp/ex")
{
    'd_a': {
        'd_a_a': {},
        'd_a_b': {
            'f1.txt': ''
        },
        'd_a_c': {},
        'fa.txt': ''
    },
    'd_b': {
        'fb1.txt': '',
        'fb2.txt': ''
    },
    'd_c': {}
}

如果您愿意,我已经用这个东西(和一个很好的pyfakefs助手)创建了一个包(python 2和3 ):https : //pypi.org/project/fsforge/

import os

def fs_tree_to_dict(path_):
    file_token = ''
    for root, dirs, files in os.walk(path_):
        tree = {d: fs_tree_to_dict(os.path.join(root, d)) for d in dirs}
        tree.update({f: file_token for f in files})
        return tree  # note we discontinue iteration trough os.walk

If anybody is interested – that recursive function returns nested structure of dictionaries. Keys are file system names (of directories and files), values are either:

  • sub dictionaries for directories
  • strings for files (see file_token)

The strings designating files are empty in this example. They can also be e.g. given file contents or its owner info or privileges or whatever object different than a dict. Unless it’s a dictionary it can be easily distinguished from a “directory type” in further operations.

Having such a tree in a filesystem:

# bash:
$ tree /tmp/ex
/tmp/ex
├── d_a
│   ├── d_a_a
│   ├── d_a_b
│   │   └── f1.txt
│   ├── d_a_c
│   └── fa.txt
├── d_b
│   ├── fb1.txt
│   └── fb2.txt
└── d_c

The result will be:

# python 2 or 3:
>>> fs_tree_to_dict("/tmp/ex")
{
    'd_a': {
        'd_a_a': {},
        'd_a_b': {
            'f1.txt': ''
        },
        'd_a_c': {},
        'fa.txt': ''
    },
    'd_b': {
        'fb1.txt': '',
        'fb2.txt': ''
    },
    'd_c': {}
}

If you like that, I’ve already created a package (python 2 & 3) with this stuff (and a nice pyfakefs helper): https://pypi.org/project/fsforge/


回答 7

除了上面的dhobbs答案(https://stackoverflow.com/a/9728478/624597)之外,这里还有将结果存储到文件的额外功能(我个人使用它来复制并粘贴到FreeMind上,以对结构,因此我使用制表符而不是空格进行缩进):

import os

def list_files(startpath):

    with open("folder_structure.txt", "w") as f_output:
        for root, dirs, files in os.walk(startpath):
            level = root.replace(startpath, '').count(os.sep)
            indent = '\t' * 1 * (level)
            output_string = '{}{}/'.format(indent, os.path.basename(root))
            print(output_string)
            f_output.write(output_string + '\n')
            subindent = '\t' * 1 * (level + 1)
            for f in files:
                output_string = '{}{}'.format(subindent, f)
                print(output_string)
                f_output.write(output_string + '\n')

list_files(".")

On top of dhobbs answer above (https://stackoverflow.com/a/9728478/624597), here is an extra functionality of storing results to a file (I personally use it to copy and paste to FreeMind to have a nice overview of the structure, therefore I used tabs instead of spaces for indentation):

import os

def list_files(startpath):

    with open("folder_structure.txt", "w") as f_output:
        for root, dirs, files in os.walk(startpath):
            level = root.replace(startpath, '').count(os.sep)
            indent = '\t' * 1 * (level)
            output_string = '{}{}/'.format(indent, os.path.basename(root))
            print(output_string)
            f_output.write(output_string + '\n')
            subindent = '\t' * 1 * (level + 1)
            for f in files:
                output_string = '{}{}'.format(subindent, f)
                print(output_string)
                f_output.write(output_string + '\n')

list_files(".")

回答 8

您可以执行Linux Shell的“ tree”命令。

安装:

   ~$sudo apt install tree

在python中使用

    >>> import os
    >>> os.system('tree <desired path>')

例:

    >>> os.system('tree ~/Desktop/myproject')

这样可以使您的结构更整洁,并且在外观上更全面,更易于键入。

You can execute ‘tree’ command of Linux shell.

Installation:

   ~$sudo apt install tree

Using in python

    >>> import os
    >>> os.system('tree <desired path>')

Example:

    >>> os.system('tree ~/Desktop/myproject')

This gives you a cleaner structure and is visually more comprehensive and easy to type.


回答 9

仅当您已tree在系统上安装此解决方案。但是,我将其保留在此处,以防万一它可以帮助其他人。

您可以告诉tree将树结构输出为XML(tree -X)或JSON(tree -J)。JSON当然可以直接用python解析,而XML可以轻松地使用读取lxml

以以下目录结构为例:

[sri@localhost Projects]$ tree --charset=ascii bands
bands
|-- DreamTroll
|   |-- MattBaldwinson
|   |-- members.txt
|   |-- PaulCarter
|   |-- SimonBlakelock
|   `-- Rob Stringer
|-- KingsX
|   |-- DougPinnick
|   |-- JerryGaskill
|   |-- members.txt
|   `-- TyTabor
|-- Megadeth
|   |-- DaveMustaine
|   |-- DavidEllefson
|   |-- DirkVerbeuren
|   |-- KikoLoureiro
|   `-- members.txt
|-- Nightwish
|   |-- EmppuVuorinen
|   |-- FloorJansen
|   |-- JukkaNevalainen
|   |-- MarcoHietala
|   |-- members.txt
|   |-- TroyDonockley
|   `-- TuomasHolopainen
`-- Rush
    |-- AlexLifeson
    |-- GeddyLee
    `-- NeilPeart

5 directories, 25 files

XML格式

<?xml version="1.0" encoding="UTF-8"?>
<tree>
  <directory name="bands">
    <directory name="DreamTroll">
      <file name="MattBaldwinson"></file>
      <file name="members.txt"></file>
      <file name="PaulCarter"></file>
      <file name="RobStringer"></file>
      <file name="SimonBlakelock"></file>
    </directory>
    <directory name="KingsX">
      <file name="DougPinnick"></file>
      <file name="JerryGaskill"></file>
      <file name="members.txt"></file>
      <file name="TyTabor"></file>
    </directory>
    <directory name="Megadeth">
      <file name="DaveMustaine"></file>
      <file name="DavidEllefson"></file>
      <file name="DirkVerbeuren"></file>
      <file name="KikoLoureiro"></file>
      <file name="members.txt"></file>
    </directory>
    <directory name="Nightwish">
      <file name="EmppuVuorinen"></file>
      <file name="FloorJansen"></file>
      <file name="JukkaNevalainen"></file>
      <file name="MarcoHietala"></file>
      <file name="members.txt"></file>
      <file name="TroyDonockley"></file>
      <file name="TuomasHolopainen"></file>
    </directory>
    <directory name="Rush">
      <file name="AlexLifeson"></file>
      <file name="GeddyLee"></file>
      <file name="NeilPeart"></file>
    </directory>
  </directory>
  <report>
    <directories>5</directories>
    <files>25</files>
  </report>
</tree>

JSON格式

[sri@localhost Projects]$ tree -J bands
[
  {"type":"directory","name":"bands","contents":[
    {"type":"directory","name":"DreamTroll","contents":[
      {"type":"file","name":"MattBaldwinson"},
      {"type":"file","name":"members.txt"},
      {"type":"file","name":"PaulCarter"},
      {"type":"file","name":"RobStringer"},
      {"type":"file","name":"SimonBlakelock"}
    ]},
    {"type":"directory","name":"KingsX","contents":[
      {"type":"file","name":"DougPinnick"},
      {"type":"file","name":"JerryGaskill"},
      {"type":"file","name":"members.txt"},
      {"type":"file","name":"TyTabor"}
    ]},
    {"type":"directory","name":"Megadeth","contents":[
      {"type":"file","name":"DaveMustaine"},
      {"type":"file","name":"DavidEllefson"},
      {"type":"file","name":"DirkVerbeuren"},
      {"type":"file","name":"KikoLoureiro"},
      {"type":"file","name":"members.txt"}
    ]},
    {"type":"directory","name":"Nightwish","contents":[
      {"type":"file","name":"EmppuVuorinen"},
      {"type":"file","name":"FloorJansen"},
      {"type":"file","name":"JukkaNevalainen"},
      {"type":"file","name":"MarcoHietala"},
      {"type":"file","name":"members.txt"},
      {"type":"file","name":"TroyDonockley"},
      {"type":"file","name":"TuomasHolopainen"}
    ]},
    {"type":"directory","name":"Rush","contents":[
      {"type":"file","name":"AlexLifeson"},
      {"type":"file","name":"GeddyLee"},
      {"type":"file","name":"NeilPeart"}
    ]}
  ]},
  {"type":"report","directories":5,"files":25}
]

This solution will only work if you have tree installed on your system. However I’m leaving this solution here just in case it helps someone else out.

You can tell tree to output the tree structure as XML (tree -X) or JSON (tree -J). JSON of course can be parsed directly with python and XML can easily be read with lxml.

With the following directory structure as an example:

[sri@localhost Projects]$ tree --charset=ascii bands
bands
|-- DreamTroll
|   |-- MattBaldwinson
|   |-- members.txt
|   |-- PaulCarter
|   |-- SimonBlakelock
|   `-- Rob Stringer
|-- KingsX
|   |-- DougPinnick
|   |-- JerryGaskill
|   |-- members.txt
|   `-- TyTabor
|-- Megadeth
|   |-- DaveMustaine
|   |-- DavidEllefson
|   |-- DirkVerbeuren
|   |-- KikoLoureiro
|   `-- members.txt
|-- Nightwish
|   |-- EmppuVuorinen
|   |-- FloorJansen
|   |-- JukkaNevalainen
|   |-- MarcoHietala
|   |-- members.txt
|   |-- TroyDonockley
|   `-- TuomasHolopainen
`-- Rush
    |-- AlexLifeson
    |-- GeddyLee
    `-- NeilPeart

5 directories, 25 files

XML

<?xml version="1.0" encoding="UTF-8"?>
<tree>
  <directory name="bands">
    <directory name="DreamTroll">
      <file name="MattBaldwinson"></file>
      <file name="members.txt"></file>
      <file name="PaulCarter"></file>
      <file name="RobStringer"></file>
      <file name="SimonBlakelock"></file>
    </directory>
    <directory name="KingsX">
      <file name="DougPinnick"></file>
      <file name="JerryGaskill"></file>
      <file name="members.txt"></file>
      <file name="TyTabor"></file>
    </directory>
    <directory name="Megadeth">
      <file name="DaveMustaine"></file>
      <file name="DavidEllefson"></file>
      <file name="DirkVerbeuren"></file>
      <file name="KikoLoureiro"></file>
      <file name="members.txt"></file>
    </directory>
    <directory name="Nightwish">
      <file name="EmppuVuorinen"></file>
      <file name="FloorJansen"></file>
      <file name="JukkaNevalainen"></file>
      <file name="MarcoHietala"></file>
      <file name="members.txt"></file>
      <file name="TroyDonockley"></file>
      <file name="TuomasHolopainen"></file>
    </directory>
    <directory name="Rush">
      <file name="AlexLifeson"></file>
      <file name="GeddyLee"></file>
      <file name="NeilPeart"></file>
    </directory>
  </directory>
  <report>
    <directories>5</directories>
    <files>25</files>
  </report>
</tree>

JSON

[sri@localhost Projects]$ tree -J bands
[
  {"type":"directory","name":"bands","contents":[
    {"type":"directory","name":"DreamTroll","contents":[
      {"type":"file","name":"MattBaldwinson"},
      {"type":"file","name":"members.txt"},
      {"type":"file","name":"PaulCarter"},
      {"type":"file","name":"RobStringer"},
      {"type":"file","name":"SimonBlakelock"}
    ]},
    {"type":"directory","name":"KingsX","contents":[
      {"type":"file","name":"DougPinnick"},
      {"type":"file","name":"JerryGaskill"},
      {"type":"file","name":"members.txt"},
      {"type":"file","name":"TyTabor"}
    ]},
    {"type":"directory","name":"Megadeth","contents":[
      {"type":"file","name":"DaveMustaine"},
      {"type":"file","name":"DavidEllefson"},
      {"type":"file","name":"DirkVerbeuren"},
      {"type":"file","name":"KikoLoureiro"},
      {"type":"file","name":"members.txt"}
    ]},
    {"type":"directory","name":"Nightwish","contents":[
      {"type":"file","name":"EmppuVuorinen"},
      {"type":"file","name":"FloorJansen"},
      {"type":"file","name":"JukkaNevalainen"},
      {"type":"file","name":"MarcoHietala"},
      {"type":"file","name":"members.txt"},
      {"type":"file","name":"TroyDonockley"},
      {"type":"file","name":"TuomasHolopainen"}
    ]},
    {"type":"directory","name":"Rush","contents":[
      {"type":"file","name":"AlexLifeson"},
      {"type":"file","name":"GeddyLee"},
      {"type":"file","name":"NeilPeart"}
    ]}
  ]},
  {"type":"report","directories":5,"files":25}
]

回答 10

也许比@ellockie(也许)快

导入操作系统
def file_writer(文字):
    使用open(“ folder_structure.txt”,“ a”)作为f_output:
        f_output.write(文本)
def list_files(起始路径):


    用于os.walk(startpath)中的根目录,dirs文件:
        级别= root.replace(startpath,'').count(os.sep)
        缩进='\ t'* 1 *(级别)
        output_string ='{} {} / \ n'.format(indent,os.path.basename(root))
        file_writer(输出字符串)
        subindent ='\ t'* 1 *(级别+1)
        output_string ='%s%s \ n'%(subindent,[f用于文件中的f])
        file_writer(''。join(output_string))


list_files(“ /”)

测试结果在以下屏幕截图中:

Maybe faster than @ellockie ( Maybe )

import os
def file_writer(text):
    with open("folder_structure.txt","a") as f_output:
        f_output.write(text)
def list_files(startpath):


    for root, dirs, files in os.walk(startpath):
        level = root.replace(startpath, '').count(os.sep)
        indent = '\t' * 1 * (level)
        output_string = '{}{}/ \n'.format(indent, os.path.basename(root))
        file_writer(output_string)
        subindent = '\t' * 1 * (level + 1)
        output_string = '%s %s \n' %(subindent,[f for f in files])
        file_writer(''.join(output_string))


list_files("/")

Test results in screenshot below:


回答 11

在这里您可以找到具有以下输出的代码:https : //stackoverflow.com/a/56622847/6671330

V .
|-> V folder1
|   |-> V folder2
|   |   |-> V folder3
|   |   |   |-> file3.txt
|   |   |-> file2.txt
|   |-> V folderX
|   |-> file1.txt
|-> 02-hw1_wdwwfm.py
|-> 06-t1-home1.py
|-> 06-t1-home2.py
|-> hw1.py

Here you can find code with output like this: https://stackoverflow.com/a/56622847/6671330

V .
|-> V folder1
|   |-> V folder2
|   |   |-> V folder3
|   |   |   |-> file3.txt
|   |   |-> file2.txt
|   |-> V folderX
|   |-> file1.txt
|-> 02-hw1_wdwwfm.py
|-> 06-t1-home1.py
|-> 06-t1-home2.py
|-> hw1.py

回答 12

对于那些仍在寻找答案的人。这是一种获取字典中路径的递归方法。

import os


def list_files(startpath):
    for root, dirs, files in os.walk(startpath):
        dir_content = []
        for dir in dirs:
            go_inside = os.path.join(startpath, dir)
            dir_content.append(list_files(go_inside))
        files_lst = []
        for f in files:
            files_lst.append(f)
        return {'name': root, 'files': files_lst, 'dirs': dir_content}

For those who are still looking for an answer. Here is a recursive approach to get the paths in a dictionary.

import os


def list_files(startpath):
    for root, dirs, files in os.walk(startpath):
        dir_content = []
        for dir in dirs:
            go_inside = os.path.join(startpath, dir)
            dir_content.append(list_files(go_inside))
        files_lst = []
        for f in files:
            files_lst.append(f)
        return {'name': root, 'files': files_lst, 'dirs': dir_content}

回答 13

@dhobbs的答案很棒!

但只需更改即可轻松获得关卡信息

def print_list_dir(dir):
    print("=" * 64)
    print("[PRINT LIST DIR] %s" % dir)
    print("=" * 64)
    for root, dirs, files in os.walk(dir):
        level = root.replace(dir, '').count(os.sep)
        indent = '| ' * level
        print('{}{} \\'.format(indent, os.path.basename(root)))
        subindent = '| ' * (level + 1)
        for f in files:
            print('{}{}'.format(subindent, f))
    print("=" * 64)

和输出像

================================================================
[PRINT LIST DIR] ./
================================================================
 \
| os_name.py
| json_loads.py
| linspace_python.py
| list_file.py
| to_gson_format.py
| type_convert_test.py
| in_and_replace_test.py
| online_log.py
| padding_and_clipping.py
| str_tuple.py
| set_test.py
| script_name.py
| word_count.py
| get14.py
| np_test2.py
================================================================

您可以通过|计数获得级别!

@dhobbs’s answer is great!

but simply change to easy get the level info

def print_list_dir(dir):
    print("=" * 64)
    print("[PRINT LIST DIR] %s" % dir)
    print("=" * 64)
    for root, dirs, files in os.walk(dir):
        level = root.replace(dir, '').count(os.sep)
        indent = '| ' * level
        print('{}{} \\'.format(indent, os.path.basename(root)))
        subindent = '| ' * (level + 1)
        for f in files:
            print('{}{}'.format(subindent, f))
    print("=" * 64)

and the output like

================================================================
[PRINT LIST DIR] ./
================================================================
 \
| os_name.py
| json_loads.py
| linspace_python.py
| list_file.py
| to_gson_format.py
| type_convert_test.py
| in_and_replace_test.py
| online_log.py
| padding_and_clipping.py
| str_tuple.py
| set_test.py
| script_name.py
| word_count.py
| get14.py
| np_test2.py
================================================================

you can get the level by | count!


Python应用程序的最佳项目结构是什么?[关闭]

问题:Python应用程序的最佳项目结构是什么?[关闭]

想象一下,您想使用Python开发非平凡的最终用户桌面(非Web)应用程序。构造项目文件夹层次结构的最佳方法是什么?

理想的功能是易于维护,IDE友好,适用于源代码控制分支/合并以及易于生成安装软件包。

特别是:

  1. 您将源放在哪里?
  2. 您将应用程序启动脚本放在哪里?
  3. 您将IDE项目放在哪里?
  4. 您将单元/验收测试放在哪里?
  5. 您将非Python数据(例如配置文件)放在哪里?
  6. 您在哪里将非Python来源(例如C ++)用于pyd / so二进制扩展模块?

Imagine that you want to develop a non-trivial end-user desktop (not web) application in Python. What is the best way to structure the project’s folder hierarchy?

Desirable features are ease of maintenance, IDE-friendliness, suitability for source control branching/merging, and easy generation of install packages.

In particular:

  1. Where do you put the source?
  2. Where do you put application startup scripts?
  3. Where do you put the IDE project cruft?
  4. Where do you put the unit/acceptance tests?
  5. Where do you put non-Python data such as config files?
  6. Where do you put non-Python sources such as C++ for pyd/so binary extension modules?

回答 0

没什么大不了的。令您快乐的一切都会起作用。没有很多愚蠢的规则,因为Python项目可以很简单。

  • /scripts/bin那种命令行界面的东西
  • /tests 为您的测试
  • /lib 用于您的C语言库
  • /doc 对于大多数文档
  • /apidoc 用于Epydoc生成的API文档。

顶级目录可以包含自述文件,配置文件和其他内容。

困难的选择是是否使用/src树。Python没有区别/src/lib/bin如Java或C具有。

由于/src某些人认为顶层目录没有意义,因此顶层目录可以是应用程序的顶层体系结构。

  • /foo
  • /bar
  • /baz

我建议将所有这些都放在“我的产品名称”目录下。因此,如果您正在编写名为的应用程序quux,则包含所有这些内容的目录将命名为 /quux

这样,另一个项目PYTHONPATH可以包括/path/to/quux/foo重用QUUX.foo模块。

就我而言,由于我使用Komodo Edit,所以我的IDE cuft是单个.KPF文件。实际上,我将其放在顶层/quux目录中,并省略了将其添加到SVN中的情况。

Doesn’t too much matter. Whatever makes you happy will work. There aren’t a lot of silly rules because Python projects can be simple.

  • /scripts or /bin for that kind of command-line interface stuff
  • /tests for your tests
  • /lib for your C-language libraries
  • /doc for most documentation
  • /apidoc for the Epydoc-generated API docs.

And the top-level directory can contain README’s, Config’s and whatnot.

The hard choice is whether or not to use a /src tree. Python doesn’t have a distinction between /src, /lib, and /bin like Java or C has.

Since a top-level /src directory is seen by some as meaningless, your top-level directory can be the top-level architecture of your application.

  • /foo
  • /bar
  • /baz

I recommend putting all of this under the “name-of-my-product” directory. So, if you’re writing an application named quux, the directory that contains all this stuff is named /quux.

Another project’s PYTHONPATH, then, can include /path/to/quux/foo to reuse the QUUX.foo module.

In my case, since I use Komodo Edit, my IDE cuft is a single .KPF file. I actually put that in the top-level /quux directory, and omit adding it to SVN.


回答 1

根据Jean-Paul Calderone的Python项目文件系统结构

Project/
|-- bin/
|   |-- project
|
|-- project/
|   |-- test/
|   |   |-- __init__.py
|   |   |-- test_main.py
|   |   
|   |-- __init__.py
|   |-- main.py
|
|-- setup.py
|-- README

According to Jean-Paul Calderone’s Filesystem structure of a Python project:

Project/
|-- bin/
|   |-- project
|
|-- project/
|   |-- test/
|   |   |-- __init__.py
|   |   |-- test_main.py
|   |   
|   |-- __init__.py
|   |-- main.py
|
|-- setup.py
|-- README

回答 2

博客由让-保罗·Calderone的岗位如Freenode上的#python答案通常是给出。

Python项目的文件系统结构

做:

  • 为目录命名与您的项目相关的名称。例如,如果您的项目名为“ Twisted”,请为其源文件命名顶级目录Twisted。发行时,应包括版本号后缀:Twisted-2.5
  • 创建目录Twisted/bin,然后将可执行文件放在此处(如果有)。.py即使它们是Python源文件,也不要给它们扩展名。除了在项目中其他地方定义的main函数的导入和调用外,不要在其中添加任何代码。(略有起皱:由于在Windows上,解释器是由文件扩展名选择的,因此Windows用户实际上确实希望使用.py扩展名。因此,在为Windows打包时,可能需要添加它。不幸的是,没有简单的distutils技巧可以考虑到在POSIX上.py扩展名只是一个疣,而在Windows上缺少是一个实际的错误,如果您的用户群包括Windows用户,则可能希望仅使用.py。扩展到处。)
  • 如果您的项目可表示为单个Python源文件,则将其放入目录并命名与项目相关的名称。例如,Twisted/twisted.py。如果需要多个源文件,请创建一个包(Twisted/twisted/,带一个空Twisted/twisted/__init__.py),然后将源文件放入其中。例如,Twisted/twisted/internet.py
  • 将单元测试放在程序包的子包中(请注意-这意味着上面的单个Python源文件选项是一个技巧- 单元测试始终需要至少一个其他文件)。例如,Twisted/twisted/test/。当然,请使用将其打包Twisted/twisted/test/__init__.py。将测试放在的文件中Twisted/twisted/test/test_internet.py
  • 如果感觉不错,分别添加Twisted/READMETwisted/setup.py来解释和安装软件。

别:

  • 将您的源代码放在一个名为src或的目录中lib。这使得不安装就很难运行。
  • 将测试放到Python包之外。这使得很难针对已安装的版本运行测试。
  • 创建一个包,只有拥有__init__.py,然后把所有的代码放入__init__.py。只需制作一个模块而不是一个包,就更简单了。
  • 尝试提出一些神奇的技巧,以使Python能够导入您的模块或包,而无需用户将包含它的目录添加到其导入路径(通过PYTHONPATH或其他机制)。您将无法正确处理所有情况,并且当您的软件无法在其环境中运行时,用户会生您的气。

This blog post by Jean-Paul Calderone is commonly given as an answer in #python on Freenode.

Filesystem structure of a Python project

Do:

  • name the directory something related to your project. For example, if your project is named “Twisted”, name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
  • create a directory Twisted/bin and put your executables there, if you have any. Don’t give them a .py extension, even if they are Python source files. Don’t put any code in them except an import of and call to a main function defined somewhere else in your projects. (Slight wrinkle: since on Windows, the interpreter is selected by the file extension, your Windows users actually do want the .py extension. So, when you package for Windows, you may want to add it. Unfortunately there’s no easy distutils trick that I know of to automate this process. Considering that on POSIX the .py extension is a only a wart, whereas on Windows the lack is an actual bug, if your userbase includes Windows users, you may want to opt to just have the .py extension everywhere.)
  • If your project is expressable as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
  • put your unit tests in a sub-package of your package (note – this means that the single Python source file option above was a trick – you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
  • add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you’re feeling nice.

Don’t:

  • put your source in a directory called src or lib. This makes it hard to run without installing.
  • put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
  • create a package that only has a __init__.py and then put all your code into __init__.py. Just make a module instead of a package, it’s simpler.
  • try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn’t work in their environment.

回答 3

以正确的方式查看Open Sourcing Python项目

让我摘录那篇优秀文章的项目布局部分:

设置项目时,布局(或目录结构)对于正确设置很重要。合理的布局意味着潜在的贡献者不必花大量的时间寻找代码。文件位置很直观。由于我们正在处理现有项目,因此这意味着您可能需要移动一些内容。

让我们从顶部开始。大多数项目都有许多顶级文件(例如setup.py,README.md,requirements.txt等)。每个项目应具有三个目录:

  • 包含项目文档的docs目录
  • 以项目名称命名的目录,用于存储实际的Python包
  • 在两个位置之一中的测试目录
    • 在包含测试代码和资源的包目录下
    • 作为独立的顶层目录为了更好地了解文件的组织方式,以下是我的一个项目sandman的布局简化快照:
$ pwd
~/code/sandman
$ tree
.
|- LICENSE
|- README.md
|- TODO.md
|- docs
|   |-- conf.py
|   |-- generated
|   |-- index.rst
|   |-- installation.rst
|   |-- modules.rst
|   |-- quickstart.rst
|   |-- sandman.rst
|- requirements.txt
|- sandman
|   |-- __init__.py
|   |-- exception.py
|   |-- model.py
|   |-- sandman.py
|   |-- test
|       |-- models.py
|       |-- test_sandman.py
|- setup.py

如您所见,这里有一些顶级文件,一个docs目录(生成的是一个空目录,sphinx将在其中放置生成的文档),一个sandman目录和一个sandman下的test目录。

Check out Open Sourcing a Python Project the Right Way.

Let me excerpt the project layout part of that excellent article:

When setting up a project, the layout (or directory structure) is important to get right. A sensible layout means that potential contributors don’t have to spend forever hunting for a piece of code; file locations are intuitive. Since we’re dealing with an existing project, it means you’ll probably need to move some stuff around.

Let’s start at the top. Most projects have a number of top-level files (like setup.py, README.md, requirements.txt, etc). There are then three directories that every project should have:

  • A docs directory containing project documentation
  • A directory named with the project’s name which stores the actual Python package
  • A test directory in one of two places
    • Under the package directory containing test code and resources
    • As a stand-alone top level directory To get a better sense of how your files should be organized, here’s a simplified snapshot of the layout for one of my projects, sandman:
$ pwd
~/code/sandman
$ tree
.
|- LICENSE
|- README.md
|- TODO.md
|- docs
|   |-- conf.py
|   |-- generated
|   |-- index.rst
|   |-- installation.rst
|   |-- modules.rst
|   |-- quickstart.rst
|   |-- sandman.rst
|- requirements.txt
|- sandman
|   |-- __init__.py
|   |-- exception.py
|   |-- model.py
|   |-- sandman.py
|   |-- test
|       |-- models.py
|       |-- test_sandman.py
|- setup.py

As you can see, there are some top level files, a docs directory (generated is an empty directory where sphinx will put the generated documentation), a sandman directory, and a test directory under sandman.


回答 4

“ Python包装管理中心”有一个示例项目:

https://github.com/pypa/sampleproject

它是一个示例项目,可作为《 Python打包用户指南》中有关打包和分发项目的教程的辅助工具而存在。

The “Python Packaging Authority” has a sampleproject:

https://github.com/pypa/sampleproject

It is a sample project that exists as an aid to the Python Packaging User Guide’s Tutorial on Packaging and Distributing Projects.


回答 5

尝试使用python_boilerplate模板启动项目。它在很大程度上遵循了最佳实践(例如此处的),但是如果您发现自己愿意在某个时候将您的项目分成多个鸡蛋(并且相信我,除了最简单的项目之外的其他项目,您会做到),它会更适合。常见的情况是您必须使用其他人的库的本地修​​改版本)。

  • 您将源放在哪里?

    • 对于大型项目,将源分成几个鸡蛋是有意义的。每个鸡蛋将在下作为单独的setuptools-layout放置PROJECT_ROOT/src/<egg_name>
  • 您将应用程序启动脚本放在哪里?

    • 理想的选择是将应用程序启动脚本注册为entry_point其中一个鸡蛋。
  • 您将IDE项目放在哪里?

    • 取决于IDE。他们中的许多人将自己的东西保存PROJECT_ROOT/.<something>在项目的根目录中,这很好。
  • 您将单元/验收测试放在哪里?

    • 每个鸡蛋都有单独的一组测试,并保存在其PROJECT_ROOT/src/<egg_name>/tests目录中。我个人更喜欢使用py.test它们来运行它们。
  • 您将非Python数据(例如配置文件)放在哪里?

    • 这取决于。可能有不同类型的非Python数据。
      • “资源”,即必须包装在一个鸡蛋中的数据。该数据进入包命名空间中某个位置的相应egg目录。可以通过pkg_resources从中的包使用它,也可以从标准库中setuptoolsimportlib.resources模块通过Python 3.7开始使用。
      • “配置文件”,即非Python文件,它们被视为项目源文件的外部文件,但在应用程序开始运行时必须使用一些值进行初始化。在开发过程中,我更喜欢将此类文件保存在中PROJECT_ROOT/config。对于部署,可以有多种选择。在Windows %APP_DATA%/<app-name>/config上,可以在Linux /etc/<app-name>或上使用/opt/<app-name>/config
      • 生成的文件,即应用程序在执行期间可以创建或修改的文件。我希望PROJECT_ROOT/var在开发/var期间以及在Linux部署期间保留它们。
  • 您在哪里将非Python来源(例如C ++)用于pyd / so二进制扩展模块?
    • 进入 PROJECT_ROOT/src/<egg_name>/native

文件通常会放入PROJECT_ROOT/docPROJECT_ROOT/src/<egg_name>/doc(取决于您是否将某些鸡蛋视为一个单独的大型项目)。一些其他配置将在PROJECT_ROOT/buildout.cfg和文件中PROJECT_ROOT/setup.cfg

Try starting the project using the python_boilerplate template. It largely follows the best practices (e.g. those here), but is better suited in case you find yourself willing to split your project into more than one egg at some point (and believe me, with anything but the simplest projects, you will. One common situation is where you have to use a locally-modified version of someone else’s library).

  • Where do you put the source?

    • For decently large projects it makes sense to split the source into several eggs. Each egg would go as a separate setuptools-layout under PROJECT_ROOT/src/<egg_name>.
  • Where do you put application startup scripts?

    • The ideal option is to have application startup script registered as an entry_point in one of the eggs.
  • Where do you put the IDE project cruft?

    • Depends on the IDE. Many of them keep their stuff in PROJECT_ROOT/.<something> in the root of the project, and this is fine.
  • Where do you put the unit/acceptance tests?

    • Each egg has a separate set of tests, kept in its PROJECT_ROOT/src/<egg_name>/tests directory. I personally prefer to use py.test to run them.
  • Where do you put non-Python data such as config files?

    • It depends. There can be different types of non-Python data.
      • “Resources”, i.e. data that must be packaged within an egg. This data goes into the corresponding egg directory, somewhere within package namespace. It can be used via the pkg_resources package from setuptools, or since Python 3.7 via the importlib.resources module from the standard library.
      • “Config-files”, i.e. non-Python files that are to be regarded as external to the project source files, but have to be initialized with some values when application starts running. During development I prefer to keep such files in PROJECT_ROOT/config. For deployment there can be various options. On Windows one can use %APP_DATA%/<app-name>/config, on Linux, /etc/<app-name> or /opt/<app-name>/config.
      • Generated files, i.e. files that may be created or modified by the application during execution. I would prefer to keep them in PROJECT_ROOT/var during development, and under /var during Linux deployment.
  • Where do you put non-Python sources such as C++ for pyd/so binary extension modules?
    • Into PROJECT_ROOT/src/<egg_name>/native

Documentation would typically go into PROJECT_ROOT/doc or PROJECT_ROOT/src/<egg_name>/doc (this depends on whether you regard some of the eggs to be a separate large projects). Some additional configuration will be in files like PROJECT_ROOT/buildout.cfg and PROJECT_ROOT/setup.cfg.


回答 6

以我的经验,这只是迭代问题。将您的数据和代码放在您认为任何地方。很有可能,无论如何你都会错的。但是,一旦您对事物的确切形状有了一个更好的了解,您就可以进行这些猜测。

至于扩展源,我们在主干下有一个Code目录,其中包含python目录和各种其他语言的目录。就个人而言,下一次我更倾向于尝试将任何扩展代码放入其自己的存储库中。

话虽如此,我回到了我的初始观点:不要做太大的事情。将其放在似乎对您有用的位置。如果发现不起作用,则可以(并且应该)对其进行更改。

In my experience, it’s just a matter of iteration. Put your data and code wherever you think they go. Chances are, you’ll be wrong anyway. But once you get a better idea of exactly how things are going to shape up, you’re in a much better position to make these kinds of guesses.

As far as extension sources, we have a Code directory under trunk that contains a directory for python and a directory for various other languages. Personally, I’m more inclined to try putting any extension code into its own repository next time around.

With that said, I go back to my initial point: don’t make too big a deal out of it. Put it somewhere that seems to work for you. If you find something that doesn’t work, it can (and should) be changed.


回答 7

最好使用setuptools中package_data支持将非Python数据捆绑到您的Python模块中。我强烈建议您使用命名空间包来创建多个项目可以使用的共享命名空间,这很像Java约定(将软件包放入其中并能够拥有一个共享命名空间)。com.yourcompany.yourprojectcom.yourcompany.utils

重新分支和合并,如果您使用足够好的源代码控制系统,它将通过重命名来处理合并;集市在这方面尤其擅长。

与这里的其他答案相反,我对拥有src顶级目录(带有doctest目录并在旁边)+1 。文档目录树的特定约定将根据您所使用的内容而有所不同。例如,Sphinx有其快速启动工具支持的自己的约定。

请,请利用setuptools和pkg_resources;这使其他项目更容易依赖于代码的特定版本(如果使用,则多个版本可以与不同的非代码文件同时安装package_data)。

Non-python data is best bundled inside your Python modules using the package_data support in setuptools. One thing I strongly recommend is using namespace packages to create shared namespaces which multiple projects can use — much like the Java convention of putting packages in com.yourcompany.yourproject (and being able to have a shared com.yourcompany.utils namespace).

Re branching and merging, if you use a good enough source control system it will handle merges even through renames; Bazaar is particularly good at this.

Contrary to some other answers here, I’m +1 on having a src directory top-level (with doc and test directories alongside). Specific conventions for documentation directory trees will vary depending on what you’re using; Sphinx, for instance, has its own conventions which its quickstart tool supports.

Please, please leverage setuptools and pkg_resources; this makes it much easier for other projects to rely on specific versions of your code (and for multiple versions to be simultaneously installed with different non-code files, if you’re using package_data).