What is the best project structure for a Python application? [closed]

Question: What is the best project structure for a Python application? [closed]

Imagine that you want to develop a non-trivial end-user desktop (not web) application in Python. What is the best way to structure the project’s folder hierarchy?

Desirable features are ease of maintenance, IDE-friendliness, suitability for source control branching/merging, and easy generation of install packages.

In particular:

  1. Where do you put the source?
  2. Where do you put application startup scripts?
  3. Where do you put the IDE project cruft?
  4. Where do you put the unit/acceptance tests?
  5. Where do you put non-Python data such as config files?
  6. Where do you put non-Python sources such as C++ for pyd/so binary extension modules?

Answer 0

It doesn’t matter too much. Whatever makes you happy will work. There aren’t a lot of silly rules because Python projects can be simple.

  • /scripts or /bin for that kind of command-line interface stuff
  • /tests for your tests
  • /lib for your C-language libraries
  • /doc for most documentation
  • /apidoc for the Epydoc-generated API docs.

And the top-level directory can contain READMEs, config files, and whatnot.

The hard choice is whether or not to use a /src tree. Python doesn’t have a distinction between /src, /lib, and /bin like Java or C has.

Since a top-level /src directory is seen by some as meaningless, your top-level directory can be the top-level architecture of your application.

  • /foo
  • /bar
  • /baz

I recommend putting all of this under the “name-of-my-product” directory. So, if you’re writing an application named quux, the directory that contains all this stuff is named /quux.

Another project’s PYTHONPATH, then, can include /path/to/quux/foo to reuse the QUUX.foo module.
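
To make the PYTHONPATH remark concrete: directories listed in PYTHONPATH end up on sys.path, and modules inside them become importable by their bare names. The sketch below mimics that at run time with a throwaway directory; the quux/foo layout and the quux_helpers module are hypothetical names invented for illustration. Note that with the foo directory itself on the path, the module is imported as quux_helpers; importing it through the quux package would instead need the directory that contains quux on the path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: <tmp>/quux/foo/quux_helpers.py
    foo_dir = Path(tmp) / "quux" / "foo"
    foo_dir.mkdir(parents=True)
    (foo_dir / "quux_helpers.py").write_text("def answer():\n    return 42\n")

    # Roughly what listing <tmp>/quux/foo in PYTHONPATH does at startup.
    sys.path.insert(0, str(foo_dir))
    try:
        helpers = importlib.import_module("quux_helpers")
        result = helpers.answer()
    finally:
        # Leave the import path as we found it.
        sys.path.remove(str(foo_dir))

print(result)
```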

In my case, since I use Komodo Edit, my IDE cruft is a single .KPF file. I actually put that in the top-level /quux directory, and don’t add it to SVN.


Answer 1

According to Jean-Paul Calderone’s Filesystem structure of a Python project:

Project/
|-- bin/
|   |-- project
|
|-- project/
|   |-- test/
|   |   |-- __init__.py
|   |   |-- test_main.py
|   |   
|   |-- __init__.py
|   |-- main.py
|
|-- setup.py
|-- README

Answer 2

This blog post by Jean-Paul Calderone is commonly given as an answer in #python on Freenode.

Filesystem structure of a Python project

Do:

  • name the directory something related to your project. For example, if your project is named “Twisted”, name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
  • create a directory Twisted/bin and put your executables there, if you have any. Don’t give them a .py extension, even if they are Python source files. Don’t put any code in them except an import of and call to a main function defined somewhere else in your project. (Slight wrinkle: since on Windows, the interpreter is selected by the file extension, your Windows users actually do want the .py extension. So, when you package for Windows, you may want to add it. Unfortunately there’s no easy distutils trick that I know of to automate this process. Considering that on POSIX the .py extension is only a wart, whereas on Windows the lack is an actual bug, if your userbase includes Windows users, you may want to opt to just have the .py extension everywhere.)
  • If your project is expressible as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
  • put your unit tests in a sub-package of your package (note – this means that the single Python source file option above was a trick – you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
  • add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you’re feeling nice.
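
The rule about keeping executables free of logic might look like this in practice. This is a minimal sketch, assuming a hypothetical package named project with a main module; neither name comes from the post itself.

```python
"""Sketch of project/main.py: all real logic lives inside the package."""
import sys


def main(argv=None):
    """Run the application and return a process exit code."""
    if argv is None:
        argv = sys.argv[1:]
    print("project started with arguments:", argv)
    return 0


# The bin/project launcher would then hold nothing but an import and a call:
#
#     #!/usr/bin/env python
#     import sys
#     from project.main import main
#     sys.exit(main())
```

Because main returns an exit code instead of calling sys.exit itself, it can be called from tests as well as from the launcher.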

Don’t:

  • put your source in a directory called src or lib. This makes it hard to run without installing.
  • put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
  • create a package that only has a __init__.py and then put all your code into __init__.py. Just make a module instead of a package; it’s simpler.
  • try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn’t work in their environment.
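
As a sketch of the test placement recommended in the post, here is what a module such as project/test/test_main.py might contain. The greet function stands in for code that would normally be imported from the package; the package, module, and function names are all made up for this example.

```python
import unittest


def greet(name):
    """Stand-in for a function that would live in project/main.py."""
    return f"Hello, {name}!"


class GreetTests(unittest.TestCase):
    """In a real layout this class sits in project/test/test_main.py."""

    def test_greets_by_name(self):
        self.assertEqual(greet("world"), "Hello, world!")
```

Because the tests are part of the package, they can be run against an installed copy with the standard runner, for example `python -m unittest project.test.test_main`.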

Answer 3

Check out Open Sourcing a Python Project the Right Way.

Let me excerpt the project layout part of that excellent article:

When setting up a project, the layout (or directory structure) is important to get right. A sensible layout means that potential contributors don’t have to spend forever hunting for a piece of code; file locations are intuitive. Since we’re dealing with an existing project, it means you’ll probably need to move some stuff around.

Let’s start at the top. Most projects have a number of top-level files (like setup.py, README.md, requirements.txt, etc). There are then three directories that every project should have:

  • A docs directory containing project documentation
  • A directory named with the project’s name which stores the actual Python package
  • A test directory in one of two places
    • Under the package directory containing test code and resources
    • As a stand-alone top level directory

To get a better sense of how your files should be organized, here’s a simplified snapshot of the layout for one of my projects, sandman:

$ pwd
~/code/sandman
$ tree
.
|- LICENSE
|- README.md
|- TODO.md
|- docs
|   |-- conf.py
|   |-- generated
|   |-- index.rst
|   |-- installation.rst
|   |-- modules.rst
|   |-- quickstart.rst
|   |-- sandman.rst
|- requirements.txt
|- sandman
|   |-- __init__.py
|   |-- exception.py
|   |-- model.py
|   |-- sandman.py
|   |-- test
|       |-- models.py
|       |-- test_sandman.py
|- setup.py

As you can see, there are some top level files, a docs directory (generated is an empty directory where sphinx will put the generated documentation), a sandman directory, and a test directory under sandman.


Answer 4

The “Python Packaging Authority” has a sampleproject:

https://github.com/pypa/sampleproject

It is a sample project that exists as an aid to the Python Packaging User Guide’s Tutorial on Packaging and Distributing Projects.


Answer 5

Try starting the project using the python_boilerplate template. It largely follows the best practices (e.g. those here), but is better suited in case you find yourself willing to split your project into more than one egg at some point. Believe me, with anything but the simplest projects, you will; one common situation is having to use a locally modified version of someone else’s library.

  • Where do you put the source?

    • For decently large projects it makes sense to split the source into several eggs. Each egg would go as a separate setuptools-layout under PROJECT_ROOT/src/<egg_name>.
  • Where do you put application startup scripts?

    • The ideal option is to have the application startup script registered as an entry_point in one of the eggs.
  • Where do you put the IDE project cruft?

    • Depends on the IDE. Many of them keep their stuff in PROJECT_ROOT/.<something> in the root of the project, and this is fine.
  • Where do you put the unit/acceptance tests?

    • Each egg has a separate set of tests, kept in its PROJECT_ROOT/src/<egg_name>/tests directory. I personally prefer to use py.test to run them.
  • Where do you put non-Python data such as config files?

    • It depends. There can be different types of non-Python data.
      • “Resources”, i.e. data that must be packaged within an egg. This data goes into the corresponding egg directory, somewhere within package namespace. It can be used via the pkg_resources package from setuptools, or since Python 3.7 via the importlib.resources module from the standard library.
      • “Config-files”, i.e. non-Python files that are to be regarded as external to the project source files, but have to be initialized with some values when the application starts running. During development I prefer to keep such files in PROJECT_ROOT/config. For deployment there can be various options: on Windows one can use %APPDATA%/<app-name>/config; on Linux, /etc/<app-name> or /opt/<app-name>/config.
      • Generated files, i.e. files that may be created or modified by the application during execution. I would prefer to keep them in PROJECT_ROOT/var during development, and under /var during Linux deployment.
  • Where do you put non-Python sources such as C++ for pyd/so binary extension modules?
    • Into PROJECT_ROOT/src/<egg_name>/native
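
The config-file locations listed in this answer could be captured in a small helper. This is only a sketch: the function name and its parameters are invented here, the Windows branch assumes the standard APPDATA environment variable is what is meant, and the Linux branch picks /etc/<app-name> out of the two options mentioned.

```python
from pathlib import PurePosixPath, PureWindowsPath


def config_dir(app_name, platform, environ, project_root=None):
    """Return the directory where config files are expected (hypothetical helper).

    platform follows sys.platform values ("win32", "linux", ...), environ is a
    mapping such as os.environ, and project_root, if given, is a pathlib path
    to a development checkout.
    """
    if project_root is not None:
        # Development checkout: keep config next to the sources.
        return project_root / "config"
    if platform == "win32":
        # Deployment on Windows: per-user application data directory.
        return PureWindowsPath(environ["APPDATA"]) / app_name / "config"
    # Deployment on Linux: system-wide configuration directory.
    return PurePosixPath("/etc") / app_name
```

Passing platform and environ explicitly, rather than reading sys.platform and os.environ inside the function, keeps the helper easy to exercise for every target system from a single machine.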

Documentation would typically go into PROJECT_ROOT/doc or PROJECT_ROOT/src/<egg_name>/doc (this depends on whether you regard some of the eggs as separate large projects). Some additional configuration will be in files like PROJECT_ROOT/buildout.cfg and PROJECT_ROOT/setup.cfg.


Answer 6

In my experience, it’s just a matter of iteration. Put your data and code wherever you think they go. Chances are, you’ll be wrong anyway. But once you get a better idea of exactly how things are going to shape up, you’re in a much better position to make these kinds of guesses.

As for extension sources, we have a Code directory under trunk that contains a directory for Python and a directory for various other languages. Personally, I’m more inclined to try putting any extension code into its own repository next time around.

With that said, I go back to my initial point: don’t make too big a deal out of it. Put it somewhere that seems to work for you. If you find something that doesn’t work, it can (and should) be changed.


Answer 7

Non-Python data is best bundled inside your Python modules using the package_data support in setuptools. One thing I strongly recommend is using namespace packages to create shared namespaces which multiple projects can use, much like the Java convention of putting packages in com.yourcompany.yourproject (and being able to have a shared com.yourcompany.utils namespace).

Re branching and merging, if you use a good enough source control system it will handle merges even through renames; Bazaar is particularly good at this.

Contrary to some other answers here, I’m +1 on having a top-level src directory (with doc and test directories alongside). Specific conventions for documentation directory trees will vary depending on what you’re using; Sphinx, for instance, has its own conventions which its quickstart tool supports.

Please, please leverage setuptools and pkg_resources; this makes it much easier for other projects to rely on specific versions of your code (and for multiple versions to be simultaneously installed with different non-code files, if you’re using package_data).
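
Once data files are bundled inside a package, they can be read back at run time without knowing where the package was installed. The sketch below uses importlib.resources from the standard library rather than pkg_resources (a deliberate substitution, and files() assumes Python 3.9 or later). The quux_demo package and its defaults.cfg file are hypothetical and are created in a temporary directory only so the example is self-contained.

```python
import importlib.resources
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway package containing one non-Python data file.
    package_dir = Path(tmp) / "quux_demo"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "defaults.cfg").write_text("[ui]\ntheme = dark\n")

    sys.path.insert(0, tmp)
    try:
        # Locate the file relative to the package, not the filesystem.
        resource = importlib.resources.files("quux_demo").joinpath("defaults.cfg")
        config_text = resource.read_text(encoding="utf-8")
    finally:
        # Leave the import path as we found it.
        sys.path.remove(tmp)

print(config_text)
```

In a real project the data file would be listed under package_data in setup.py so that it is installed alongside the package.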