Tag archive: package

What is the -m switch for?

Question: What is the -m switch for?

Could you explain to me what the difference is between calling

python -m mymod1 mymod2.py args

and

python mymod1.py mymod2.py args

It seems in both cases mymod1.py is called and sys.argv is

['mymod1.py', 'mymod2.py', 'args']

So what is the -m switch for?


Answer 0

The first line of the Rationale section of PEP 338 says:

Python 2.4 adds the command line switch -m to allow modules to be located using the Python module namespace for execution as scripts. The motivating examples were standard library modules such as pdb and profile, and the Python 2.4 implementation is fine for this limited purpose.

So you can specify any module in Python’s search path this way, not just files in the current directory. You’re correct that python mymod1.py mymod2.py args has exactly the same effect. The first line of the Scope of this proposal section states:

In Python 2.4, a module located using -m is executed just as if its filename had been provided on the command line.

With -m more is possible, like working with modules which are part of a package, etc. That’s what the rest of PEP 338 is about. Read it for more info.


Answer 1

It’s worth mentioning that running a package this way only works if the package contains a __main__.py file. Otherwise, the package cannot be executed directly.

python -m some_package some_arguments

The Python interpreter will look for a __main__.py file in the package and execute it. This is roughly equivalent to:

python path_to_package/__main__.py some_arguments

The whole file runs with __name__ set to "__main__", so this includes any code guarded by:

if __name__ == "__main__":
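To make the point above concrete, here is a minimal sketch that builds a throwaway package and runs it with -m. The package name demo_pkg and the helper run_package_with_m are made up for illustration; the only real pieces are the -m switch and the __main__.py convention.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_package_with_m(*args):
    """Create a throwaway package with a __main__.py and run it via -m."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "__main__.py").write_text(
            "import sys\n"
            "print('unguarded code runs, __name__ =', __name__)\n"
            "if __name__ == '__main__':\n"
            "    print('guarded code runs, args =', sys.argv[1:])\n"
        )
        # cwd=tmp so that the interpreter can find demo_pkg on sys.path.
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg", *args],
            cwd=tmp, capture_output=True, text=True, check=True,
        )
        return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_package_with_m("some", "arguments"):
        print(line)
```

Both print statements fire, which suggests that the whole of __main__.py is executed and not only the guarded block.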

Answer 2

Despite this question having been asked and answered several times (e.g., here, here, here, and here) in my opinion no existing answer fully or concisely captures all the implications of the -m flag. Therefore, the following will attempt to improve on what has come before.

Introduction (TLDR)

The -m flag does a lot of things, not all of which will be needed all the time. In short it can be used to: (1) execute python code from the command line via modulename rather than filename (2) add a directory to sys.path for use in import resolution and (3) execute python code that contains relative imports from the command line.

Preliminaries

To explain the -m flag we first need to explain a little terminology.

Python’s primary organizational unit is known as a module. Modules come in one of two flavors: code modules and package modules. A code module is any file that contains Python executable code. A package module is a directory that contains other modules (either code modules or package modules). The most common type of code module is a *.py file, while the most common type of package module is a directory containing an __init__.py file.

Python allows modules to be uniquely identified in two distinct ways: modulename and filename. In general, modules are identified by modulename in Python code (e.g., import <modulename>) and by filename on the command line (e.g., python <filename>). All python interpreters are able to convert modulenames to filenames by following the same few, well-defined rules. These rules hinge on the sys.path variable. By altering this variable one can change how Python resolves modulenames into filenames (for more on how this is done see PEP 302).

All modules (both code and package) can be executed (i.e., code associated with the module will be evaluated by the Python interpreter). Depending on the execution method (and module type) what code gets evaluated, and when, can change quite a bit. For example, if one executes a package module via python <filename> then <filename>/__init__.py will be evaluated followed by <filename>/__main__.py. On the other hand, if one executes that same package module via import <modulename> then only the package’s __init__.py will be executed.

Historical Development of -m

The -m flag was first introduced in Python 2.4. Initially its only purpose was to provide an alternative means of identifying the Python module to execute from the command line. That is, if we knew both the <filename> and <modulename> for a module then the following two commands were equivalent: python <filename> <args> and python -m <modulename> <args>. One constraint with this iteration, according to PEP 338, was that -m only worked with top level modulenames (i.e., modules that could be found directly on sys.path without any intervening package modules).

With the completion of PEP 338 the -m feature was extended to support <modulename> representations beyond the top level. This meant names such as http.server were now fully supported. This extension also meant that each parent package in modulename was now evaluated (i.e., all parent package __init__.py files were evaluated) in addition to the module referenced by the modulename itself.

The final major feature enhancement for -m came with PEP 366. With this upgrade -m gained the ability to support not only absolute imports but also explicit relative imports when executing modules. This was achieved by changing -m so that it set the __package__ variable to the parent module of the given modulename (in addition to everything else it already did).

Use Cases

There are two notable use cases for the -m flag:

  1. To execute modules from the command line whose filename one may not know. This use case takes advantage of the fact that the Python interpreter knows how to convert modulenames to filenames. This is particularly advantageous when one wants to run stdlib modules or 3rd-party modules from the command line. For example, very few people know the filename for the http.server module but most people do know its modulename, so we can execute it from the command line using python -m http.server.

  2. To execute a local package containing absolute or relative imports without needing to install it. This use case is detailed in PEP 338 and leverages the fact that the current working directory is added to sys.path rather than the module’s directory. This use case is very similar to using pip install -e . to install a package in develop/edit mode.

Shortcomings

With all the enhancements made to -m over the years it still has one major shortcoming — it can only execute modules written in Python (i.e., *.py). For example, if -m is used to execute a C compiled code module the following error will be produced, No code object available for <modulename> (see here for more details).

Detailed Comparisons

Effects of module execution via import statement (i.e., import <modulename>):

  • sys.path is not modified in any way
  • __name__ is set to the absolute form of <modulename>
  • __package__ is set to the immediate parent package in <modulename>
  • __init__.py is evaluated for all packages (including its own for package modules)
  • __main__.py is not evaluated for package modules; the code is evaluated for code modules

Effects of module execution via command line (i.e., python <filename>):

  • sys.path is modified to include the final directory in <filename>
  • __name__ is set to '__main__'
  • __package__ is set to None
  • __init__.py is not evaluated for any package (including its own for package modules)
  • __main__.py is evaluated for package modules; the code is evaluated for code modules.

Effects of module execution via command line with the -m flag (i.e., python -m <modulename>):

  • sys.path is modified to include the current directory
  • __name__ is set to '__main__'
  • __package__ is set to the immediate parent package in <modulename>
  • __init__.py is evaluated for all packages (including its own for package modules)
  • __main__.py is evaluated for package modules; the code is evaluated for code modules
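The comparisons above can be checked with a small experiment. This is only a sketch: the package name pkg, the module mod.py and the helper compare_invocations are invented for the demonstration, and the script simply runs the same file once by filename and once with -m.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def compare_invocations():
    """Run the same module by filename and via -m; return both outputs."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("print('pkg/__init__.py evaluated')\n")
        (pkg / "mod.py").write_text("print(__name__, __package__)\n")

        def run(*argv):
            done = subprocess.run(
                [sys.executable, *argv],
                cwd=tmp, capture_output=True, text=True, check=True,
            )
            return done.stdout.splitlines()

        by_filename = run(str(Path("pkg") / "mod.py"))  # python pkg/mod.py
        by_modulename = run("-m", "pkg.mod")            # python -m pkg.mod
    return by_filename, by_modulename


if __name__ == "__main__":
    by_filename, by_modulename = compare_invocations()
    print("python pkg/mod.py  ->", by_filename)
    print("python -m pkg.mod  ->", by_modulename)
```

In both runs __name__ is '__main__', but only the -m run evaluates the parent package's __init__.py and sets __package__, which is what makes explicit relative imports usable.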

Conclusion

The -m flag is, at its simplest, a means to execute python scripts from the command line by using modulenames rather than filenames. The real power of -m, however, is in its ability to combine the power of import statements (e.g., support for explicit relative imports and automatic package __init__ evaluation) with the convenience of the command line.


Accessing data in a package subdirectory

Question: Accessing data in a package subdirectory

I am writing a python package with modules that need to open data files in a ./data/ subdirectory. Right now I have the paths to the files hardcoded into my classes and functions. I would like to write more robust code that can access the subdirectory regardless of where it is installed on the user’s system.

I’ve tried a variety of methods, but so far I have had no luck. It seems that most of the “current directory” commands return the directory of the system’s python interpreter, and not the directory of the module.

This seems like it ought to be a trivial, common problem. Yet I can’t seem to figure it out. Part of the problem is that my data files are not .py files, so I can’t use import functions and the like.

Any suggestions?

Right now my package directory looks like:

/
__init__.py
module1.py
module2.py
data/   
   data.txt

I am trying to access data.txt from module*.py!


Answer 0

You can use __file__ to get the path to the package, like this:

import os

# Directory that contains this module, wherever the package is installed.
this_dir, this_filename = os.path.split(__file__)
DATA_PATH = os.path.join(this_dir, "data", "data.txt")
with open(DATA_PATH) as f:
    print(f.read())
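The same idea can be written with pathlib. This is a sketch rather than the answer's own code: the helper name data_path is hypothetical, and it assumes the data files live in a data/ directory next to the module whose __file__ is passed in.

```python
from pathlib import Path


def data_path(module_file, *parts):
    """Return the path of a file under the data/ directory next to module_file."""
    # resolve() makes the result independent of the current working directory.
    return Path(module_file).resolve().parent.joinpath("data", *parts)


# Inside module1.py one would call: data_path(__file__, "data.txt")
```

Passing __file__ in as an argument keeps the helper reusable from any module in the package.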

Answer 1

The standard way to do this is with setuptools packages and pkg_resources.

You can lay out your package according to the following hierarchy, and configure the package setup file to point to your data resources, as per this link:

http://docs.python.org/distutils/setupscript.html#installing-package-data

You can then re-find and use those files using pkg_resources, as per this link:

http://peak.telecommunity.com/DevCenter/PkgResources#basic-resource-access

import pkg_resources

DATA_PATH = pkg_resources.resource_filename('<package name>', 'data/')
DB_FILE = pkg_resources.resource_filename('<package name>', 'data/sqlite.db')

Answer 2

Here is a solution that works today. Definitely use this API rather than reinventing the wheel.

If a true filesystem filename is needed (zipped eggs will be extracted to a cache directory):

from pkg_resources import resource_filename, Requirement

path_to_vik_logo = resource_filename(Requirement.parse("enb.portals"), "enb/portals/reports/VIK_logo.png")

To get a readable file-like object for the specified resource instead (it may be an actual file, a StringIO, or some similar object), use resource_stream. The stream is in “binary mode”, in the sense that whatever bytes are in the resource will be read as-is.

from pkg_resources import resource_stream, Requirement

vik_logo_as_stream = resource_stream(Requirement.parse("enb.portals"), "enb/portals/reports/VIK_logo.png")

Reference: Package Discovery and Resource Access using pkg_resources


Answer 3

There is often no point in posting an answer that details code that does not work as is, but I believe this to be an exception. Python 3.7 added importlib.resources, which is supposed to replace pkg_resources. It works for accessing files within a package as long as the resource name contains no slashes, i.e.

foo/
    __init__.py
    module1.py
    module2.py
    data/   
       data.txt
    data2.txt

i.e. you could access data2.txt inside package foo with for example

importlib.resources.open_binary('foo', 'data2.txt')

but it would fail with an exception for

>>> importlib.resources.open_binary('foo', 'data/data.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/importlib/resources.py", line 87, in open_binary
    resource = _normalize_path(resource)
  File "/usr/lib/python3.7/importlib/resources.py", line 61, in _normalize_path
    raise ValueError('{!r} must be only a file name'.format(path))
ValueError: 'data/data.txt' must be only a file name

This cannot be fixed except by placing __init__.py in data and then using it as a package:

importlib.resources.open_binary('foo.data', 'data.txt')

The reason for this behaviour is “it is by design”; but the design might change
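For what it is worth, Python 3.9 added importlib.resources.files(), which returns a path-like object that can be joined with subdirectory names. The sketch below assumes Python 3.9 or newer; the package name foo_demo and the helper read_nested_resource are invented so the example can build its own package in a temporary directory.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def read_nested_resource():
    """Build a throwaway package and read data/data.txt from inside it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "foo_demo"  # hypothetical package name
        (pkg / "data").mkdir(parents=True)
        (pkg / "__init__.py").write_text("")
        (pkg / "data" / "data.txt").write_text("hello from data.txt")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            # files() accepts a package name; "/" descends into subdirectories,
            # and data/ does not need its own __init__.py.
            resource = importlib.resources.files("foo_demo") / "data" / "data.txt"
            return resource.read_text(encoding="utf-8")
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("foo_demo", None)


if __name__ == "__main__":
    print(read_nested_resource())
```

In a real package the temporary-directory scaffolding disappears and only the files(...) line is needed.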


Answer 4

You need a name for your whole package; your given directory tree doesn’t list that detail. For me this worked:

import pkg_resources
print(    
    pkg_resources.resource_filename(__name__, 'data/data.txt')
)

Notably, setuptools does not appear to resolve files based on a name match with packaged data files, so you will have to include the data/ prefix pretty much no matter what. You can use os.path.join('data', 'data.txt') if you need alternate directory separators. Generally, though, I find no compatibility problems with hard-coded Unix-style directory separators.


Answer 5

I think I hunted down an answer.

I make a module data_path.py containing the following, which I import into my other modules:

import os

data_path = os.path.join(os.path.dirname(__file__),'data')

And then I open all my files with

open(os.path.join(data_path,'filename'), <param>)

Check if a Python package is installed

Question: Check if a Python package is installed

What’s a good way to check if a package is installed while within a Python script? I know it’s easy from the interpreter, but I need to do it within a script.

I guess I could check if there’s a directory on the system that’s created during the installation, but I feel like there’s a better way. I’m trying to make sure the Skype4Py package is installed, and if not I’ll install it.

My ideas for accomplishing the check

  • check for a directory in the typical install path
  • try to import the package and, if an exception is thrown, install the package

Answer 0

If you mean a python script, just do something like this:

Python 3.3+: use sys.modules and find_spec (the example below uses f-strings and the := operator, so as written it needs Python 3.8+):

import importlib.util
import sys

# For illustrative purposes.
name = 'itertools'

if name in sys.modules:
    print(f"{name!r} already in sys.modules")
elif (spec := importlib.util.find_spec(name)) is not None:
    # If you choose to perform the actual import ...
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    print(f"{name!r} has been imported")
else:
    print(f"can't find the {name!r} module")

Python 3:

try:
    import mymodule
except ImportError as e:
    pass  # module doesn't exist, deal with it.

Python 2:

try:
    import mymodule
except ImportError, e:
    pass  # module doesn't exist, deal with it.

Answer 1

Updated answer

A better way of doing this is:

import subprocess
import sys

reqs = subprocess.check_output([sys.executable, '-m', 'pip', 'freeze'])
installed_packages = [r.decode().split('==')[0] for r in reqs.split()]

The result:

print(installed_packages)

[
    "Django",
    "six",
    "requests",
]

Check if requests is installed:

if 'requests' in installed_packages:
    pass  # Do something

Why this way? Sometimes you have app name collisions. Importing from the app namespace doesn’t give you the full picture of what’s installed on the system.

Note that the proposed solution works:

  • When using pip to install from PyPI or from any other alternative source (like pip install http://some.site/package-name.zip or any other archive type).
  • When installing manually using python setup.py install.
  • When installing from system repositories, like sudo apt install python-requests.

Cases when it might not work:

  • When installing in development mode, like python setup.py develop.
  • When installing in development mode, like pip install -e /path/to/package/source/.
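One way to avoid splitting freeze lines on '==' is to ask pip for JSON instead. The sketch below relies on pip list --format=json, which prints a list of objects with name and version keys. The function names parse_pip_list and installed_distributions are made up for this example, and how editable installs show up may still depend on your pip version.

```python
import json
import subprocess
import sys


def parse_pip_list(json_text):
    """Return the set of lower-cased distribution names from pip's JSON output."""
    return {entry["name"].lower() for entry in json.loads(json_text)}


def installed_distributions():
    """Ask the pip that belongs to the running interpreter what is installed."""
    output = subprocess.check_output(
        [sys.executable, "-m", "pip", "list", "--format=json"]
    )
    return parse_pip_list(output)


# Usage: 'requests' in installed_distributions()
```

Names are lower-cased because distribution names compare case-insensitively (Django and django are the same project).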

Old answer

A better way of doing this is:

import pip
installed_packages = pip.get_installed_distributions()

For pip>=10.x use:

from pip._internal.utils.misc import get_installed_distributions

Why this way? Sometimes you have app name collisions. Importing from the app namespace doesn’t give you the full picture of what’s installed on the system.

As a result, you get a list of pkg_resources.Distribution objects. See the following as an example:

print installed_packages
[
    "Django 1.6.4 (/path-to-your-env/lib/python2.7/site-packages)",
    "six 1.6.1 (/path-to-your-env/lib/python2.7/site-packages)",
    "requests 2.5.0 (/path-to-your-env/lib/python2.7/site-packages)",
]

Make a list of it:

flat_installed_packages = [package.project_name for package in installed_packages]

[
    "Django",
    "six",
    "requests",
]

Check if requests is installed:

if 'requests' in flat_installed_packages:
    pass  # Do something

Answer 2

As of Python 3.3, you can use the find_spec() method

import importlib.util

# For illustrative purposes.
package_name = 'pandas'

spec = importlib.util.find_spec(package_name)
if spec is None:
    print(package_name +" is not installed")
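One caveat with find_spec() is that it can raise instead of returning None, for example when a dotted name has a missing parent package. A small wrapper, sketched below with the hypothetical name is_importable, turns those cases into a plain False.

```python
import importlib.util


def is_importable(name):
    """Return True if the module `name` can be located on the import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ModuleNotFoundError, ValueError):
        # ModuleNotFoundError: a parent package of a dotted name is missing.
        # ValueError: the module is in sys.modules with __spec__ set to None.
        return False


if __name__ == "__main__":
    for candidate in ("json", "json.decoder", "no_such_module_xyz"):
        print(candidate, is_importable(candidate))
```

Note that for dotted names find_spec() imports the parent package as a side effect, so this is not entirely import-free.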

Answer 3

If you want to have the check from the terminal, you can run

pip3 show package_name

and if nothing is returned, the package is not installed.

If perhaps you want to automate this check, so that for example you can install it if missing, you can have the following in your bash script:

pip3 show package_name 1>/dev/null  # use pip instead of pip3 for Python 2
if [ $? -eq 0 ]; then
   echo "Installed" #Replace with your actions
else
   echo "Not Installed" #Replace with your actions, 'pip3 install --upgrade package_name' ?
fi

Answer 4

As an extension of this answer:

For Python 2.*, pip show <package_name> will perform the same task.

For example, pip show numpy will return the following or something similar:

Name: numpy
Version: 1.11.1
Summary: NumPy: array processing for numbers, strings, records, and objects.
Home-page: http://www.numpy.org
Author: NumPy Developers
Author-email: numpy-discussion@scipy.org
License: BSD
Location: /home/***/anaconda2/lib/python2.7/site-packages
Requires: 
Required-by: smop, pandas, tables, spectrum, seaborn, patsy, odo, numpy-stl, numba, nfft, netCDF4, MDAnalysis, matplotlib, h5py, GridDataFormats, dynd, datashape, Bottleneck, blaze, astropy

Answer 5

You can use the pkg_resources module from setuptools. For example:

import pkg_resources

package_name = 'cool_package'
try:
    cool_package_dist_info = pkg_resources.get_distribution(package_name)
except pkg_resources.DistributionNotFound:
    print('{} not installed'.format(package_name))
else:
    print(cool_package_dist_info)

Note that there is a difference between a Python module and a Python package. A package can contain multiple modules, and module names might not match the package name.
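On Python 3.8 and newer, the standard library's importlib.metadata offers a similar lookup without needing setuptools. A minimal sketch follows; the helper name installed_version is an assumption of this example, and the argument is the distribution name (what you pip install), not the import name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("cool_package")  # hypothetical distribution name
    print(version if version is not None else "cool_package not installed")
```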


Answer 6

Open your command prompt and type:

pip3 list

Answer 7

I’d like to add some thoughts/findings of mine to this topic. I’m writing a script that checks all requirements for a custom made program. There are many checks with python modules too.

There’s a little issue with the

try:
   import ..
except:
   ..

solution. In my case one of the Python modules is called python-nmap, but you import it with import nmap, so the names do not match. A test that tries to import the name python-nmap therefore reports the module as missing even though it is installed. The try/import approach also imports the module when it is found, which may use more memory than a simple test/check needs.

I also found that

import pip
installed_packages = pip.get_installed_distributions()

installed_packages will contain only the packages that have been installed with pip. On my system pip freeze returns over 40 Python modules, while installed_packages has only 1, the one I installed manually (python-nmap).

The solution below may not be directly relevant to the question, but I think it is good practice to keep the test function separate from the one that performs the install, so it might be useful for some.

This is the solution that worked for me. It is based on this answer: How to check if a python module exists without importing it

from imp import find_module  # imp is deprecated since Python 3.4 and removed in 3.12

def checkPythonmod(mod):
    try:
        op = find_module(mod)
        return True
    except ImportError:
        return False

NOTE: this solution can’t find the module by the name python-nmap either; I have to use nmap instead (easy to live with), but in this case the module is not loaded into memory at all.


Answer 8

If you’d like your script to install missing packages and continue, you could do something like this (using the ‘krbV’ module from the ‘python-krbV’ package as an example):

import pip
import sys

for m, pkg in [('krbV', 'python-krbV')]:
    try:
        setattr(sys.modules[__name__], m, __import__(m))
    except ImportError:
        pip.main(['install', pkg])
        setattr(sys.modules[__name__], m, __import__(m))
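Newer pip releases no longer support calling pip.main() from a program; the usual suggestion is to run pip as a subprocess of the current interpreter. The following is a sketch of that variant, with ensure_module as a made-up helper name and the krbV/python-krbV pair taken from the example above.

```python
import importlib
import subprocess
import sys


def ensure_module(module_name, dist_name=None):
    """Import module_name, installing dist_name with pip first if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Run the pip that belongs to this interpreter, not whatever is on PATH.
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", dist_name or module_name]
        )
        importlib.invalidate_caches()  # let the import system see the new files
        return importlib.import_module(module_name)


# Usage: krbV = ensure_module("krbV", "python-krbV")
```

Installing packages at import time changes the user's environment, so it is probably best reserved for throwaway scripts.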

Answer 9

A quick way is to use the python command line tool. Simply type import <your module name>. You will see an error if the module is missing.

$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
>>> import sys
>>> import jocker
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named jocker
$

Answer 10

Hmmm … the closest I saw to a convenient answer was using the command line to try the import. But I prefer to even avoid that.

How about ‘pip freeze | grep pkgname’? I tried it and it works well. It also shows you the version it has and whether it is installed under version control (install) or editable (develop).


Answer 11

You can use this:

class myError(Exception):
    pass  # Or do something more useful here.

try:
    import mymodule
except ImportError as e:
    raise myError("an error occurred")

Answer 12

In the Terminal type

pip show some_package_name

Example

pip show matplotlib

Answer 13

I would like to comment on @ice.nicer’s reply but I cannot, so … My observation is that packages with dashes are saved with underscores, not only with dots as pointed out in @dwich’s comment.

For example, you do pip3 install sphinx-rtd-theme, but:

  • importlib.util.find_spec('sphinx_rtd_theme') returns an object
  • importlib.util.find_spec('sphinx-rtd-theme') returns None
  • importlib.util.find_spec('sphinx.rtd.theme') raises ModuleNotFoundError

Moreover, some names are totally changed. For example, you do pip3 install pyyaml but it is saved simply as yaml

I am using python3.8
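The three outcomes above can be folded into one helper. This is only a sketch: is_importable is a hypothetical name, and note that for dotted names find_spec imports the parent package as a side effect.

```python
import importlib.util


def is_importable(module_name):
    """Return True if `module_name` can be located by the import system."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # ModuleNotFoundError (a subclass of ImportError) is raised for
        # dotted names whose parent package cannot be imported.
        return False


if __name__ == "__main__":
    for name in ("json", "no-such-module", "no_such_pkg.sub"):
        print(name, "->", is_importable(name))
```

The argument has to be the import name (underscores), not the name used with pip install.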


Answer 14

if pip3 list | grep -sE '^some_command\s+[0-9]' >/dev/null; then
  : # installed ...
else
  : # not installed ...
fi

Answer 15

Go with option #2. If ImportError is thrown, then the package is not installed (or not in sys.path).


Why #egg=foo when pip-installing from git repo

Question: Why #egg=foo when pip-installing from git repo

When I do a “pip install -e …” to install from a git repo, I have to specify #egg=somename or pip complains. For example:

pip install -e git://github.com/hiidef/oauth2app.git#egg=oauth2app

What’s the significance of this “egg” string?


Answer 0

Per pip install -h, the “egg” string is the directory that gets checked out as part of the install.


Answer 1

You have to include #egg=Package so pip knows what to expect at that URL. See https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support

more on eggs


Answer 2

https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support says:

The “project name” component of the url suffix “egg=<project name>-<version>” is used by pip in its dependency logic to identify the project prior to pip downloading and analyzing the metadata. The optional “version” component of the egg name is not functionally important. It merely provides a human-readable clue as to what version is in use. For projects where setup.py is not in the root of project, “subdirectory” component is used. Value of “subdirectory” component should be a path starting from root of the project to where setup.py is located.

From this I deduce that the egg value is only used for dependency checks and therefore I think, by convention, the package name (i.e. some-pypi-package-name) should be used, not any contained folder (i.e. some_pypi_package_name)


Answer 3

An Egg is just some bundled python code. In a git url, the egg is the project name. VCS Support

Normally we install Python packages from PyPI, so you specify ONLY the package name and version (or it assumes the latest version if you don’t specify one). PyPI then searches for which egg you want and pip installs that. pip install celery would install the latest published egg, and pip install celery[redis] would install a different egg that contains the same celery package and also installs the latest eggs from whatever packages were listed as dependencies for redis in celery’s setup.py.

With git and gitlab paths, you specify /{user|group}/{repository}.git@{tag}#egg={package-name}. There is a difference between #egg=celery and #egg=celery[redis], but they will both come from the same source code.

“tag” can also be a branch or commit hash in addition to an actual tag. It is assumed to be master if you do not specify.

For example, git+https://github.com/celery/celery.git#egg=celery==4.3.0 would check out the master branch and install that. Even though you specified a version number, it is not taken into account in the installation. THE VERSION NUMBER IS IGNORED.

When installing via git or other VCS urls, you will want to find the tag or hash of the version you need. For example, git+https://github.com/celery/celery.git@v4.3.0#egg=celery which will checkout the commit tagged “v4.3.0” and then install the package from that source code. Assuming the maintainers did not egregiously mis-tag their repositories, you can get the version you want like that.
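To make the anatomy of such a URL concrete, here is a small sketch that splits one into repository, ref, and egg parts using only urllib.parse. It is an illustration of the layout described above, not pip's actual parser, and split_vcs_url is a hypothetical helper:

```python
from urllib.parse import parse_qs, urlsplit


def split_vcs_url(url):
    """Split 'git+https://host/user/repo.git@ref#egg=name' into its parts."""
    parts = urlsplit(url)
    # Everything after '@' in the path is the tag, branch, or commit hash.
    repo_path, _, ref = parts.path.partition("@")
    # The fragment is laid out like a query string: 'egg=name&subdirectory=dir'.
    fragment = parse_qs(parts.fragment)
    return {
        "repo": "{}://{}{}".format(parts.scheme, parts.netloc, repo_path),
        "ref": ref or None,  # None means "use the default branch"
        "egg": fragment.get("egg", [None])[0],
    }


if __name__ == "__main__":
    print(split_vcs_url("git+https://github.com/celery/celery.git@v4.3.0#egg=celery"))
    print(split_vcs_url("git+https://github.com/celery/celery.git#egg=celery==4.3.0"))
```

In the second URL the ref comes back empty, which matches the point above: the version written into the egg name does not select what gets checked out.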


Execution of Python code with -m option or not

Question: Execution of Python code with -m option or not

The python interpreter has -m module option that “Runs library module module as a script”.

With this python code a.py:

if __name__ == "__main__":
    print __package__
    print __name__

I tested python -m a to get

"" <-- Empty String
__main__

whereas python a.py returns

None <-- None
__main__

To me, those two invocations seem to be the same, except that __package__ is not None when invoked with the -m option.

Interestingly, with python -m runpy a, I get the same as python -m a with python module compiled to get a.pyc.

What’s the (practical) difference between these invocations? Any pros and cons between them?

Also, David Beazley’s Python Essential Reference explains it as “The -m option runs a library module as a script which executes inside the __main__ module prior to the execution of the main script”. What does it mean?


Answer 0

When you use the -m command-line flag, Python will import a module or package for you, then run it as a script. When you don’t use the -m flag, the file you named is run as just a script.

The distinction is important when you try to run a package. There is a big difference between:

python foo/bar/baz.py

and

python -m foo.bar.baz

as in the latter case, foo.bar is imported and relative imports will work correctly with foo.bar as the starting point.

Demo:

$ mkdir -p test/foo/bar
$ touch test/foo/__init__.py
$ touch test/foo/bar/__init__.py
$ cat << EOF > test/foo/bar/baz.py 
> if __name__ == "__main__":
>     print __package__
>     print __name__
> 
> EOF
$ PYTHONPATH=test python test/foo/bar/baz.py 
None
__main__
$ PYTHONPATH=test python -m foo.bar.baz 
foo.bar
__main__

As a result, Python has to actually care about packages when using the -m switch. A normal script can never be a package, so __package__ is set to None.

But run a package or module inside a package with -m and now there is at least the possibility of a package, so the __package__ variable is set to a string value; in the above demonstration it is set to 'foo.bar', for plain modules not inside a package it is set to an empty string.

As for the __main__ module, Python imports scripts being run as it would import regular modules. A new module object is created to hold the global namespace and is stored in sys.modules['__main__']. This is what the __name__ variable refers to, it is a key in that structure.

For packages, you can create a __main__.py module inside and have that run when running python -m package_name; in fact that is the only way you can run a package as a script:

$ PYTHONPATH=test python -m foo.bar
python: No module named foo.bar.__main__; 'foo.bar' is a package and cannot be directly executed
$ cp test/foo/bar/baz.py test/foo/bar/__main__.py
$ PYTHONPATH=test python -m foo.bar
foo.bar
__main__

So, when naming a package for running with -m, Python looks for a __main__ module contained in that package and executes that as a script. Its name is then still set to '__main__' and the module object is still stored in sys.modules['__main__'].
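The shell demo above uses Python 2 print statements. For a Python 3 check, the sketch below rebuilds the same foo/bar/baz.py layout in a temporary directory and runs the file both ways; run_both_ways is a name invented for this example.

```python
import os
import subprocess
import sys
import tempfile


def run_both_ways():
    """Run the same file as a script and with -m; return both outputs."""
    with tempfile.TemporaryDirectory() as root:
        bar_dir = os.path.join(root, "foo", "bar")
        os.makedirs(bar_dir)
        # Empty __init__.py files make foo and foo.bar regular packages.
        for directory in (os.path.join(root, "foo"), bar_dir):
            open(os.path.join(directory, "__init__.py"), "w").close()
        script = os.path.join(bar_dir, "baz.py")
        with open(script, "w") as f:
            f.write("print(repr(__package__), __name__)\n")

        # Same idea as PYTHONPATH=test in the shell demo.
        env = dict(os.environ, PYTHONPATH=root)

        def run(*args):
            result = subprocess.run(
                [sys.executable] + list(args), env=env, check=True,
                stdout=subprocess.PIPE, universal_newlines=True,
            )
            return result.stdout.strip()

        return run(script), run("-m", "foo.bar.baz")


if __name__ == "__main__":
    as_script, as_module = run_both_ways()
    print("python foo/bar/baz.py   ->", as_script)
    print("python -m foo.bar.baz   ->", as_module)
```

Both runs report __name__ as __main__; only the -m run has __package__ set to a package name.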


Answer 1

Execution of Python code with -m option or not

Use the -m flag.

The results are pretty much the same when you have a script, but when you develop a package, without the -m flag, there’s no way to get the imports to work correctly if you want to run a subpackage or module in the package as the main entry point to your program (and believe me, I’ve tried.)

The docs

Like the docs on the -m flag say:

Search sys.path for the named module and execute its contents as the __main__ module.

and

As with the -c option, the current directory will be added to the start of sys.path.

so

python -m pdb

is roughly equivalent to

python /usr/lib/python3.5/pdb.py

(assuming you don’t have a package or script in your current directory called pdb.py)

Explanation:

Behavior is made “deliberately similar to” scripts.

Many standard library modules contain code that is invoked on their execution as a script. An example is the timeit module:

Some python code is intended to be run as a module: (I think this example is better than the commandline option doc example)

$ python -m timeit '"-".join(str(n) for n in range(100))'
10000 loops, best of 3: 40.3 usec per loop
$ python -m timeit '"-".join([str(n) for n in range(100)])'
10000 loops, best of 3: 33.4 usec per loop
$ python -m timeit '"-".join(map(str, range(100)))'
10000 loops, best of 3: 25.2 usec per loop

And from the release note highlights for Python 2.4:

The -m command line option – python -m modulename will find a module in the standard library, and invoke it. For example, python -m pdb is equivalent to python /usr/lib/python2.4/pdb.py

Follow-up Question

Also, David Beazley’s Python Essential Reference explains it as “The -m option runs a library module as a script which executes inside the __main__ module prior to the execution of the main script”.

It means any module you can lookup with an import statement can be run as the entry point of the program – if it has a code block, usually near the end, with if __name__ == '__main__':.
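As a concrete illustration, a module laid out like the sketch below can be imported as a library and also started with python -m. The file name greet.py and the function names are made up for this example.

```python
# greet.py (hypothetical module somewhere on sys.path)
import sys


def greet(name):
    """Library function: usable via `import greet`."""
    return "Hello, {}!".format(name)


def main(argv=None):
    """Command-line entry point: used by `python -m greet NAME`."""
    argv = sys.argv[1:] if argv is None else argv
    print(greet(argv[0] if argv else "world"))


# True when started as `python greet.py` or `python -m greet`,
# False when the module is merely imported.
if __name__ == "__main__":
    main()
```

Keeping the work in main() rather than directly under the if block means the same code can be called from tests or from an entry-point script.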

-m without adding the current directory to the path:

A comment here elsewhere says:

That the -m option also adds the current directory to sys.path, is obviously a security issue (see: preload attack). This behavior is similar to library search order in Windows (before it had been hardened recently). It’s a pity that Python does not follow the trend and does not offer a simple way to disable adding . to sys.path

Well, this demonstrates the possible issue – (in windows remove the quotes):

echo "import sys; print(sys.version)" > pdb.py

python -m pdb
3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]

Use the -I flag to lock this down for production environments (new in version 3.4):

python -Im pdb
usage: pdb.py [-c command] ... pyfile [arg] ...
etc...

from the docs:

-I

Run Python in isolated mode. This also implies -E and -s. In isolated mode sys.path contains neither the script’s directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too. Further restrictions may be imposed to prevent the user from injecting malicious code.
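To see the effect without shadowing a real standard library module, the sketch below drops a harmless file into a temporary directory and tries to start it with -m, with and without -I. The module name shadow_demo and the helper try_local_module are invented for the demonstration.

```python
import os
import subprocess
import sys
import tempfile


def try_local_module(*flags):
    """Return the exit code of `python <flags> -m shadow_demo`, run from a
    directory that contains shadow_demo.py."""
    with tempfile.TemporaryDirectory() as cwd:
        with open(os.path.join(cwd, "shadow_demo.py"), "w") as f:
            f.write("print('picked up from the current directory')\n")
        # Drop settings that would change how sys.path is built.
        env = {k: v for k, v in os.environ.items()
               if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
        result = subprocess.run(
            [sys.executable] + list(flags) + ["-m", "shadow_demo"],
            cwd=cwd, env=env,
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
        return result.returncode


if __name__ == "__main__":
    print("plain -m exit code:", try_local_module())
    print("-I -m exit code:   ", try_local_module("-I"))
```

A zero exit code means the file in the current directory was found and run; with -I the current directory is not on sys.path, so the lookup should fail.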

What does __package__ do?

It enables explicit relative imports, not particularly germane to this question, though – see this answer here: What’s the purpose of the “__package__” attribute in Python?


Answer 2

The main reason to run a module (or package) as a script with -m is to simplify deployment, especially on Windows. You can install scripts in the same place in the Python library where modules normally go – instead of polluting PATH or global executable directories such as ~/.local (the per-user scripts directory is ridiculously hard to find in Windows).

Then you just type -m and Python finds the script automagically. For example, python -m pip will find the correct pip for the same instance of the Python interpreter that executes it. Without -m, if the user has several Python versions installed, which one would be the “global” pip?

If the user prefers “classic” entry points for command-line scripts, these can easily be added as small scripts somewhere in PATH, or pip can create them at install time with the entry_points parameter in setup.py.

So just check for __name__ == '__main__' and ignore other non-reliable implementation details.


How to read a (static) file from inside a Python package?

Question: How to read a (static) file from inside a Python package?

Could you tell me how can I read a file that is inside my Python package?

My situation

A package that I load has a number of templates (text files used as strings) that I want to load from within the program. But how do I specify the path to such file?

Imagine I want to read a file from:

package\templates\temp_file

Some kind of path manipulation? Package base path tracking?


Answer 0

[added 2016-06-15: apparently this doesn’t work in all situations. please refer to the other answers]


import os, mypackage
template = os.path.join(mypackage.__path__[0], 'templates', 'temp_file')

Answer 1

TLDR; Use standard-library’s importlib.resources module as explained in the method no 2, below.

The traditional pkg_resources from setuptools is not recommended anymore because the new method:

  • is significantly more performant;
  • is safer, since the use of packages (instead of path strings) raises compile-time errors;
  • is more intuitive, because you don’t have to “join” paths;
  • is faster when developing, since you don’t need an extra dependency (setuptools) but rely on Python’s standard library alone.

I kept the traditional listed first, to explain the differences with the new method when porting existing code (porting also explained here).



Let’s assume your templates are located in a folder nested inside your module’s package:

  <your-package>
    +--<module-asking-the-file>
    +--templates/
          +--temp_file                         <-- We want this file.

Note 1: For sure, we should NOT fiddle with the __file__ attribute (e.g. code will break when served from a zip).

Note 2: If you are building this package, remember to declare your data files as package_data or data_files in your setup.py.

1) Using pkg_resources from setuptools (slow)

You may use pkg_resources package from setuptools distribution, but that comes with a cost, performance-wise:

import pkg_resources

# Could be any dot-separated package/module name or a "Requirement"
resource_package = __name__
resource_path = '/'.join(('templates', 'temp_file'))  # Do not use os.path.join()
template = pkg_resources.resource_string(resource_package, resource_path)
# or for a file-like stream:
template = pkg_resources.resource_stream(resource_package, resource_path)

Tips:

  • This will read data even if your distribution is zipped, so you may set zip_safe=True in your setup.py, and/or use the long-awaited zipapp packer from python-3.5 to create self-contained distributions.

  • Remember to add setuptools into your run-time requirements (e.g. in install_requires).

… and notice that according to the Setuptools/pkg_resources docs, you should not use os.path.join:

Basic Resource Access

Note that resource names must be /-separated paths and cannot be absolute (i.e. no leading /) or contain relative names like “..”. Do not use os.path routines to manipulate resource paths, as they are not filesystem paths.

2) Python >= 3.7, or using the backported importlib_resources library

Use the standard library’s importlib.resources module which is more efficient than setuptools, above:

try:
    import importlib.resources as pkg_resources
except ImportError:
    # Try backported to PY<37 `importlib_resources`.
    import importlib_resources as pkg_resources

from . import templates  # relative-import the *package* containing the templates

template = pkg_resources.read_text(templates, 'temp_file')
# or for a file-like stream:
template = pkg_resources.open_text(templates, 'temp_file')

Attention:

Regarding the function read_text(package, resource):

  • The package can be either a string or a module.
  • The resource is NOT a path anymore, but just the filename of the resource to open, within an existing package; it may not contain path separators and it may not have sub-resources (i.e. it cannot be a directory).

For the example asked in the question, we must now:

  • make the <your_package>/templates/ into a proper package, by creating an empty __init__.py file in it,
  • so now we can use a simple (possibly relative) import statement (no more parsing package/module names),
  • and simply ask for resource_name = "temp_file" (no path).

Tips:

  • To access a file inside the current module, set the package argument to __package__, e.g. pkg_resources.read_text(__package__, 'temp_file') (thanks to @ben-mares).
  • Things become interesting when an actual filename is asked with path(), since now context-managers are used for temporarily-created files (read this).
  • Add the backported library, conditionally for older Pythons, with install_requires=[" importlib_resources ; python_version<'3.7'"] (check this if you package your project with setuptools<36.2.1).
  • Remember to remove setuptools library from your runtime-requirements, if you migrated from the traditional method.
  • Remember to customize setup.py or MANIFEST to include any static files.
  • You may also set zip_safe=True in your setup.py.

Answer 2

A packaging prelude:

Before you can even worry about reading resource files, the first step is to make sure that the data files are getting packaged into your distribution in the first place – it is easy to read them directly from the source tree, but the important part is making sure these resource files are accessible from code within an installed package.

Structure your project like this, putting data files into a subdirectory within the package:

.
├── package
│   ├── __init__.py
│   ├── templates
│   │   └── temp_file
│   ├── mymodule1.py
│   └── mymodule2.py
├── README.rst
├── MANIFEST.in
└── setup.py

You should pass include_package_data=True in the setup() call. The manifest file is only needed if you want to use setuptools/distutils and build source distributions. To make sure the templates/temp_file gets packaged for this example project structure, add a line like this into the manifest file:

recursive-include package *

Historical cruft note: Using a manifest file is not needed for modern build backends such as flit, poetry, which will include the package data files by default. So, if you’re using pyproject.toml and you don’t have a setup.py file then you can ignore all the stuff about MANIFEST.in.

Now, with packaging out of the way, onto the reading part…

Recommendation:

Use standard library pkgutil APIs. It’s going to look like this in library code:

# within package/mymodule1.py, for example
import pkgutil

data = pkgutil.get_data(__name__, "templates/temp_file")

It works in zips. It works on Python 2 and Python 3. It doesn’t require third-party dependencies. I’m not really aware of any downsides (if you are, then please comment on the answer).
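For a self-contained way to try this outside a real project, the sketch below builds a throwaway package in a temporary directory and reads a data file back with pkgutil.get_data. The package name pkgutil_demo and the helper read_with_pkgutil are invented for the demonstration.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def read_with_pkgutil():
    """Build a throwaway package on disk and read a data file from it."""
    with tempfile.TemporaryDirectory() as root:
        templates = os.path.join(root, "pkgutil_demo", "templates")
        os.makedirs(templates)
        open(os.path.join(root, "pkgutil_demo", "__init__.py"), "w").close()
        with open(os.path.join(templates, "temp_file"), "wb") as f:
            f.write(b"hello template")

        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            # Resource names are always '/'-separated, whatever the platform.
            data = pkgutil.get_data("pkgutil_demo", "templates/temp_file")
        finally:
            # Undo the temporary changes to the import state.
            sys.path.remove(root)
            sys.modules.pop("pkgutil_demo", None)
    return data


if __name__ == "__main__":
    print(repr(read_with_pkgutil()))
```

Note that get_data returns bytes, so text templates need a .decode() with the right encoding.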

Bad ways to avoid:

Bad way #1: using relative paths from a source file

This is currently the accepted answer. At best, it looks something like this:

from pathlib import Path

resource_path = Path(__file__).parent / "templates"
data = resource_path.joinpath("temp_file").read_bytes()

What’s wrong with that? The assumption that you have files and subdirectories available is not correct. This approach doesn’t work if executing code which is packed in a zip or a wheel, and it may be entirely out of the user’s control whether or not your package gets extracted to a filesystem at all.

Bad way #2: using pkg_resources APIs

This is described in the top-voted answer. It looks something like this:

from pkg_resources import resource_string

data = resource_string(__name__, "templates/temp_file")

What’s wrong with that? It adds a runtime dependency on setuptools, which should preferably be an install time dependency only. Importing and using pkg_resources can become really slow, as the code builds up a working set of all installed packages, even though you were only interested in your own package resources. That’s not a big deal at install time (since installation is once-off), but it’s ugly at runtime.

Bad way #3: using importlib.resources APIs

This is currently the recommendation in the top-voted answer. It’s a recent standard library addition (new in Python 3.7). It looks like this:

from importlib.resources import read_binary

data = read_binary("package.templates", "temp_file")

What’s wrong with that? Well, unfortunately, it doesn’t work…yet. This is still an incomplete API, using importlib.resources will require you to add an empty file templates/__init__.py in order that the data files will reside within a sub-package rather than in a subdirectory. It will also expose the package/templates subdirectory as an importable package.templates sub-package in its own right. If that’s not a big deal and it doesn’t bother you, then you can go ahead and add the __init__.py file there and use the import system to access resources. However, while you’re at it you may as well make it into a my_resources.py file instead, and just define some bytes or string variables in the module, then import them in Python code. It’s the import system doing the heavy lifting here either way.

Honorable mention: using newer importlib_resources APIs

This has not been mentioned in any other answers yet, but importlib_resources is more than a simple backport of the Python 3.7+ importlib.resources code. It has traversable APIs which you can use like this:

import importlib_resources

my_resources = importlib_resources.files("package")
data = (my_resources / "templates" / "temp_file").read_bytes()

This works on Python 2 and 3, it works in zips, and it doesn’t require spurious __init__.py files to be added in resource subdirectories. The only downside vs pkgutil that I can see is that these new APIs haven’t yet arrived in stdlib, so there is still a third-party dependency. Newer APIs from importlib_resources should arrive to stdlib importlib.resources in Python 3.9.
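On interpreters where the standard library already ships files() (this sketch assumes Python 3.9 or newer), the same traversal can be tried without the third-party package. The package name files_demo and the helper read_with_files are made up for this example.

```python
import importlib
import importlib.resources
import os
import sys
import tempfile


def read_with_files():
    """Build a throwaway package and read a nested data file via files()."""
    with tempfile.TemporaryDirectory() as root:
        # Note: templates/ deliberately has no __init__.py of its own.
        templates = os.path.join(root, "files_demo", "templates")
        os.makedirs(templates)
        open(os.path.join(root, "files_demo", "__init__.py"), "w").close()
        with open(os.path.join(templates, "temp_file"), "w", encoding="utf-8") as f:
            f.write("hello template")

        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            resource = importlib.resources.files("files_demo") / "templates" / "temp_file"
            return resource.read_text(encoding="utf-8")
        finally:
            # Undo the temporary changes to the import state.
            sys.path.remove(root)
            sys.modules.pop("files_demo", None)


if __name__ == "__main__":
    print(read_with_files())
```

On older interpreters, swapping the import for the importlib_resources backport should give the same behaviour, as described above.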

Example project:

I’ve created an example project on GitHub and uploaded it to PyPI, which demonstrates all five approaches discussed above. Try it out with:

$ pip install resources-example
$ resources-example

See https://github.com/wimglenn/resources-example for more info.


回答 3

如果你有这个结构

lidtk
├── bin
│   └── lidtk
├── lidtk
│   ├── analysis
│   │   ├── char_distribution.py
│   │   └── create_cm.py
│   ├── classifiers
│   │   ├── char_dist_metric_train_test.py
│   │   ├── char_features.py
│   │   ├── cld2
│   │   │   ├── cld2_preds.txt
│   │   │   └── cld2wili.py
│   │   ├── get_cld2.py
│   │   ├── text_cat
│   │   │   ├── __init__.py
│   │   │   ├── README.md   <---------- say you want to get this
│   │   │   └── textcat_ngram.py
│   │   └── tfidf_features.py
│   ├── data
│   │   ├── __init__.py
│   │   ├── create_ml_dataset.py
│   │   ├── download_documents.py
│   │   ├── language_utils.py
│   │   ├── pickle_to_txt.py
│   │   └── wili.py
│   ├── __init__.py
│   ├── get_predictions.py
│   ├── languages.csv
│   └── utils.py
├── README.md
├── setup.cfg
└── setup.py

您需要以下代码:

import pkg_resources

# __name__ in case you're within the package
# - otherwise it would be 'lidtk' in this example as it is the package name
path = 'classifiers/text_cat/README.md'  # always use slash
filepath = pkg_resources.resource_filename(__name__, path)

奇怪的“总是使用斜杠”部分来自setuptoolsAPI

还要注意,如果使用路径,则即使在Windows上,也必须使用正斜杠(/)作为路径分隔符。Setuptools在生成时自动将斜杠转换为适当的特定于平台的分隔符

如果您想知道文档在哪里:

In case you have this structure

lidtk
├── bin
│   └── lidtk
├── lidtk
│   ├── analysis
│   │   ├── char_distribution.py
│   │   └── create_cm.py
│   ├── classifiers
│   │   ├── char_dist_metric_train_test.py
│   │   ├── char_features.py
│   │   ├── cld2
│   │   │   ├── cld2_preds.txt
│   │   │   └── cld2wili.py
│   │   ├── get_cld2.py
│   │   ├── text_cat
│   │   │   ├── __init__.py
│   │   │   ├── README.md   <---------- say you want to get this
│   │   │   └── textcat_ngram.py
│   │   └── tfidf_features.py
│   ├── data
│   │   ├── __init__.py
│   │   ├── create_ml_dataset.py
│   │   ├── download_documents.py
│   │   ├── language_utils.py
│   │   ├── pickle_to_txt.py
│   │   └── wili.py
│   ├── __init__.py
│   ├── get_predictions.py
│   ├── languages.csv
│   └── utils.py
├── README.md
├── setup.cfg
└── setup.py

you need this code:

import pkg_resources

# __name__ in case you're within the package
# - otherwise it would be 'lidtk' in this example as it is the package name
path = 'classifiers/text_cat/README.md'  # always use slash
filepath = pkg_resources.resource_filename(__name__, path)

The strange “always use slash” part comes from setuptools APIs

Also notice that if you use paths, you must use a forward slash (/) as the path separator, even if you are on Windows. Setuptools automatically converts slashes to appropriate platform-specific separators at build time

In case you wonder where the documentation is:


回答 4

David Beazley和Brian K. Jones撰写的Python Cookbook第三版“ 10.8。读取包中的数据文件”中的内容给出了答案。

我将它送到这里:

假设您有一个软件包,其文件组织如下:

mypackage/
    __init__.py
    somedata.dat
    spam.py

现在,假设文件spam.py要读取文件somedata.dat的内容。为此,请使用以下代码:

import pkgutil
data = pkgutil.get_data(__package__, 'somedata.dat')

结果变量数据将是一个字节字符串,其中包含文件的原始内容。

get_data()的第一个参数是包含程序包名称的字符串。您可以直接提供它,也可以使用特殊变量,例如__package__。第二个参数是包中文件的相对名称。如有必要,您可以使用标准Unix文件名约定浏览到其他目录,只要最终目录仍位于包中即可。

这样,该软件包可以安装为目录,.zip或.egg。

The section “10.8. Reading Datafiles Within a Package” of Python Cookbook, Third Edition, by David Beazley and Brian K. Jones gives the answer.

I’ll just quote it here:

Suppose you have a package with files organized as follows:

mypackage/
    __init__.py
    somedata.dat
    spam.py

Now suppose the file spam.py wants to read the contents of the file somedata.dat. To do it, use the following code:

import pkgutil
data = pkgutil.get_data(__package__, 'somedata.dat')

The resulting variable data will be a byte string containing the raw contents of the file.

The first argument to get_data() is a string containing the package name. You can either supply it directly or use a special variable, such as __package__. The second argument is the relative name of the file within the package. If necessary, you can navigate into different directories using standard Unix filename conventions as long as the final directory is still located within the package.

In this way, the package can be installed as a directory, .zip or .egg.
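To see the recipe run end to end, here is a small sketch. It assumes a hypothetical package demo_data_pkg, which is written to a temporary directory so that the snippet is self-contained. In spam.py itself you would pass __package__ instead of a literal name.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

# Create a throwaway package with one data file (names are placeholders).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_data_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "somedata.dat").write_bytes(b"\x00\x01raw bytes")

# Make the temporary directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# get_data() imports the package if necessary and asks its loader for the
# file, which is why it also works for zipped packages.
data = pkgutil.get_data("demo_data_pkg", "somedata.dat")
print(type(data).__name__, len(data))
```

The result is always bytes, so text files need an explicit data.decode(...) with the encoding you expect.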


回答 5

包中的每个python模块都有一个__file__属性

您可以将其用作:

import os
import mypackage

templates_dir = os.path.join(os.path.dirname(mypackage.__file__), 'templates')
template_file = os.path.join(templates_dir, 'template.txt')

有关鸡蛋资源,请参见:http : //peak.telecommunity.com/DevCenter/PythonEggs#accessing-package-resources

Every python module in your package has a __file__ attribute

You can use it as:

import os
import mypackage

templates_dir = os.path.join(os.path.dirname(mypackage.__file__), 'templates')
template_file = os.path.join(templates_dir, 'template.txt')

For egg resources see: http://peak.telecommunity.com/DevCenter/PythonEggs#accessing-package-resources
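The same idea can be written with pathlib. This is only a sketch: demo_file_pkg and template.txt are made-up names created in a temporary directory, and the approach assumes the package is unpacked on the filesystem (it will not work from inside a zip or an unextracted egg).

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package with a templates/ directory (hypothetical names).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_file_pkg"
(pkg_dir / "templates").mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "templates" / "template.txt").write_text("Hello, $name!")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
import demo_file_pkg

# __file__ points at the package's __init__.py; its parent is the package dir.
templates_dir = Path(demo_file_pkg.__file__).parent / "templates"
template_file = templates_dir / "template.txt"
print(template_file.read_text())
```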


回答 6

假设您使用的是鸡蛋文件;未提取:

我通过使用后安装脚本在最近的项目中“解决”了该问题,该脚本将我的模板从egg(zip文件)提取到文件系统中的正确目录。这是我发现的最快,最可靠的解决方案,因为__path__[0]有时使用会出错(我不记得这个名称了,但是我至少浏览了一个库,在列表的前面增加了一些东西!)。

通常,鸡蛋文件通常也被即时提取到一个称为“鸡蛋缓存”的临时位置。您可以在启动脚本之前甚至以后使用环境变量来更改该位置。

os.environ['PYTHON_EGG_CACHE'] = path

但是,有pkg_resources可能会正确完成此工作。

Assuming you are using an egg file that is not extracted:

I “solved” this in a recent project by using a postinstall script that extracts my templates from the egg (zip file) to the proper directory in the filesystem. It was the quickest, most reliable solution I found, since working with __path__[0] can go wrong sometimes (I don’t recall the name, but I came across at least one library that added something in front of that list!).

Also, egg files are usually extracted on the fly to a temporary location called the “egg cache”. You can change that location using an environment variable, either before starting your script or even later, e.g.:

os.environ['PYTHON_EGG_CACHE'] = path

However there is pkg_resources that might do the job properly.


有没有一种标准的方法可以在软件包中列出Python模块的名称?

问题:有没有一种标准的方法可以在软件包中列出Python模块的名称?

有没有一种简单的方法可以列出软件包中所有模块的名称,而无需使用__all__

例如,给定此程序包:

/testpkg
/testpkg/__init__.py
/testpkg/modulea.py
/testpkg/moduleb.py

我想知道是否有标准或内置的方式来做这样的事情:

>>> package_contents("testpkg")
['modulea', 'moduleb']

手动方法是遍历模块搜索路径,以找到包的目录。然后可以列出该目录中的所有文件,过滤出唯一命名为py / pyc / pyo的文件,剥离扩展名,然后返回该列表。但这对于模块导入机制已经在内部完成的工作来说似乎是相当多的工作。该功能在任何地方都可以使用吗?

Is there a straightforward way to list the names of all modules in a package, without using __all__?

For example, given this package:

/testpkg
/testpkg/__init__.py
/testpkg/modulea.py
/testpkg/moduleb.py

I’m wondering if there is a standard or built-in way to do something like this:

>>> package_contents("testpkg")
['modulea', 'moduleb']

The manual approach would be to iterate through the module search paths in order to find the package’s directory. One could then list all the files in that directory, filter out the uniquely-named py/pyc/pyo files, strip the extensions, and return that list. But this seems like a fair amount of work for something the module import mechanism is already doing internally. Is that functionality exposed anywhere?


回答 0

也许这会满足您的需求?

import imp  # deprecated since Python 3.4, removed in 3.12
import os
MODULE_EXTENSIONS = ('.py', '.pyc', '.pyo')

def package_contents(package_name):
    file, pathname, description = imp.find_module(package_name)
    if file:
        # find_module() only returns an open file for plain modules.
        file.close()
        raise ImportError('Not a package: %r' % package_name)
    # Use a set because some may be both source and compiled.
    return set([os.path.splitext(module)[0]
        for module in os.listdir(pathname)
        if module.endswith(MODULE_EXTENSIONS)])

Maybe this will do what you’re looking for?

import imp  # deprecated since Python 3.4, removed in 3.12
import os
MODULE_EXTENSIONS = ('.py', '.pyc', '.pyo')

def package_contents(package_name):
    file, pathname, description = imp.find_module(package_name)
    if file:
        # find_module() only returns an open file for plain modules.
        file.close()
        raise ImportError('Not a package: %r' % package_name)
    # Use a set because some may be both source and compiled.
    return set([os.path.splitext(module)[0]
        for module in os.listdir(pathname)
        if module.endswith(MODULE_EXTENSIONS)])

回答 1

使用python2.3及更高版本,您还可以使用以下pkgutil模块:

>>> import pkgutil
>>> [name for _, name, _ in pkgutil.iter_modules(['testpkg'])]
['modulea', 'moduleb']

编辑:请注意,该参数不是模块列表,而是路径列表,因此您可能需要执行以下操作:

>>> import os.path, pkgutil
>>> import testpkg
>>> pkgpath = os.path.dirname(testpkg.__file__)
>>> print([name for _, name, _ in pkgutil.iter_modules([pkgpath])])

Using python2.3 and above, you could also use the pkgutil module:

>>> import pkgutil
>>> [name for _, name, _ in pkgutil.iter_modules(['testpkg'])]
['modulea', 'moduleb']

EDIT: Note that the parameter is not a list of modules, but a list of paths, so you might want to do something like this:

>>> import os.path, pkgutil
>>> import testpkg
>>> pkgpath = os.path.dirname(testpkg.__file__)
>>> print([name for _, name, _ in pkgutil.iter_modules([pkgpath])])
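On Python 3 a slightly shorter variant is to pass the package's own __path__, which is already a list of directories. The sketch below uses the standard library's json package as a stand-in for testpkg, purely so that it runs anywhere.

```python
import json
import pkgutil

# json.__path__ is the list of directories that make up the package,
# which is exactly what iter_modules() expects.
names = sorted(info.name for info in pkgutil.iter_modules(json.__path__))
print(names)
```

iter_modules() only looks at the directories; it does not import the submodules it lists.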

回答 2

import module
help(module)

回答 3

不知道我是在忽略什么,还是答案只是过时而已;

如user815423426所述,这仅适用于活动对象,并且列出的模块仅是之前导入的模块。

使用inspect列出软件包中的模块似乎真的很容易:

>>> import inspect, testpkg
>>> inspect.getmembers(testpkg, inspect.ismodule)
['modulea', 'moduleb']

Don’t know if I’m overlooking something, or if the answers are just outdated, but:

As stated by user815423426, this only works for live objects, and the listed modules are only those that were imported before.

Listing modules in a package seems really easy using inspect:

>>> import inspect, testpkg
>>> [name for name, _ in inspect.getmembers(testpkg, inspect.ismodule)]
['modulea', 'moduleb']

回答 4

这是适用于python 3.6及更高版本的递归版本:

import importlib.util
from pathlib import Path
import os
MODULE_EXTENSIONS = '.py'

def package_contents(package_name):
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return set()

    pathname = Path(spec.origin).parent
    ret = set()
    with os.scandir(pathname) as entries:
        for entry in entries:
            if entry.name.startswith('__'):
                continue
            current = '.'.join((package_name, entry.name.partition('.')[0]))
            if entry.is_file():
                if entry.name.endswith(MODULE_EXTENSIONS):
                    ret.add(current)
            elif entry.is_dir():
                ret.add(current)
                ret |= package_contents(current)


    return ret

This is a recursive version that works with python 3.6 and above:

import importlib.util
from pathlib import Path
import os
MODULE_EXTENSIONS = '.py'

def package_contents(package_name):
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return set()

    pathname = Path(spec.origin).parent
    ret = set()
    with os.scandir(pathname) as entries:
        for entry in entries:
            if entry.name.startswith('__'):
                continue
            current = '.'.join((package_name, entry.name.partition('.')[0]))
            if entry.is_file():
                if entry.name.endswith(MODULE_EXTENSIONS):
                    ret.add(current)
            elif entry.is_dir():
                ret.add(current)
                ret |= package_contents(current)


    return ret
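A possible alternative to walking the directories by hand is pkgutil.walk_packages(), which recurses into sub-packages for you. This is a sketch rather than a drop-in replacement: it takes an imported package object instead of a name, and it uses the standard library's email package only as a convenient example.

```python
import email
import pkgutil

def package_contents(package):
    """Return the dotted names of all modules below an imported package."""
    prefix = package.__name__ + "."
    return sorted(
        info.name for info in pkgutil.walk_packages(package.__path__, prefix)
    )

names = package_contents(email)
print(names[:5])
```

Note that walk_packages() has to import each sub-package it descends into (here email.mime) in order to read its __path__, so any import side effects of those packages will run.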

回答 5

根据cdleary的示例,这是所有子模块的递归版本列表路径:

import imp, os

def iter_submodules(package):
    file, pathname, description = imp.find_module(package)
    for dirpath, _, filenames in os.walk(pathname):
        for  filename in filenames:
            if os.path.splitext(filename)[1] == ".py":
                yield os.path.join(dirpath, filename)

Based on cdleary’s example, here’s a recursive version listing path for all submodules:

import imp, os

def iter_submodules(package):
    file, pathname, description = imp.find_module(package)
    for dirpath, _, filenames in os.walk(pathname):
        for  filename in filenames:
            if os.path.splitext(filename)[1] == ".py":
                yield os.path.join(dirpath, filename)

回答 6

这应该列出模块:

help("modules")

This should list the modules:

help("modules")

回答 7

如果您想在python代码之外查看有关软件包的信息(从命令提示符),则可以使用pydoc。

# get a full list of packages that you have installed on you machine
$ python -m pydoc modules

# get information about a specific package
$ python -m pydoc <your package>

您将获得与pydoc相同的结果,但在解释器中使用help

>>> import <my package>
>>> help(<my package>)

If you would like to view information about your package outside of Python code (from a command prompt), you can use pydoc.

# get a full list of packages that you have installed on your machine
$ python -m pydoc modules

# get information about a specific package
$ python -m pydoc <your package>

You get the same result inside the interpreter using help:

>>> import <my package>
>>> help(<my package>)

回答 8

def package_contents(package_name):
  package = __import__(package_name)
  return [module_name for module_name in dir(package) if not module_name.startswith("__")]

回答 9

print dir(module)


无法在Python中导入我自己的模块

问题:无法在Python中导入我自己的模块

我很难理解模块导入在Python中是如何工作的(我以前从未用任何其他语言来完成过此工作)。

假设我有:

myapp/__init__.py
myapp/myapp/myapp.py
myapp/myapp/SomeObject.py
myapp/tests/TestCase.py

现在,我试图得到这样的东西:

myapp.py
===================
from myapp import SomeObject
# stuff ...

TestCase.py
===================
from myapp import SomeObject
# some tests on SomeObject

但是,我肯定做错了,因为Python看不到这myapp是一个模块:

ImportError: No module named myapp

I’m having a hard time understanding how module importing works in Python (I’ve never done it in any other language before either).

Let’s say I have:

myapp/__init__.py
myapp/myapp/myapp.py
myapp/myapp/SomeObject.py
myapp/tests/TestCase.py

Now I’m trying to get something like this:

myapp.py
===================
from myapp import SomeObject
# stuff ...

TestCase.py
===================
from myapp import SomeObject
# some tests on SomeObject

However, I’m definitely doing something wrong as Python can’t see that myapp is a module:

ImportError: No module named myapp

回答 0

在您的特定情况下,您似乎正在尝试SomeObject从myapp.py和TestCase.py脚本导入。在myapp.py中,执行

import SomeObject

因为它在同一个文件夹中。对于TestCase.py,请执行

from ..myapp import SomeObject

但是,仅当您从软件包中导入TestCase时,此方法才有效。如果要直接运行python TestCase.py,则必须弄乱路径。这可以在Python中完成:

import sys
sys.path.append("..")
from myapp import SomeObject

尽管通常不建议这样做。

通常,如果您希望其他人使用您的Python软件包,则应使用distutils创建安装脚本。这样,任何人都可以使用像这样的命令轻松安装您的软件包,python setup.py install并且该软件包将在其计算机上的所有位置可用。如果您对软件包很认真,甚至可以将其添加到Python软件包索引PyPI中

In your particular case it looks like you’re trying to import SomeObject from the myapp.py and TestCase.py scripts. From myapp.py, do

import SomeObject

since it is in the same folder. For TestCase.py, do

from ..myapp import SomeObject

However, this will work only if you are importing TestCase from the package. If you want to directly run python TestCase.py, you would have to mess with your path. This can be done within Python:

import sys
sys.path.append("..")
from myapp import SomeObject

though that is generally not recommended.

In general, if you want other people to use your Python package, you should use distutils to create a setup script. That way, anyone can install your package easily using a command like python setup.py install and it will be available everywhere on their machine. If you’re serious about the package, you could even add it to the Python Package Index, PyPI.
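As a rough illustration of the sys.path approach, the sketch below rebuilds a similar layout in a temporary directory and puts the project root on sys.path. The names demo_app and some_object are placeholders, not the names from the question.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate a layout like the one in the question (placeholder names).
root = Path(tempfile.mkdtemp())
(root / "demo_app").mkdir()
(root / "tests").mkdir()
(root / "demo_app" / "__init__.py").write_text("")
(root / "demo_app" / "some_object.py").write_text("class SomeObject:\n    pass\n")

# What a script in tests/ would do: put the project root (the parent of the
# package directory) on sys.path, not the package directory itself.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_app.some_object import SomeObject
print(SomeObject.__module__)
```

In a real test file the root would usually be computed from __file__, for example Path(__file__).resolve().parent.parent, instead of a relative ".." that depends on the current working directory.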


回答 1

该函数import在PYTHONPATH env中查找文件。变量和您的本地目录。因此,您可以将所有文件放在同一目录中,也可以将键入的路径导出到终端中:

export PYTHONPATH="$PYTHONPATH:/path_to_myapp/myapp/myapp/"

The import statement looks for files in the directories listed in your PYTHONPATH environment variable and in your local directory. So you can either put all your files in the same directory, or export the path by typing this into a terminal:

export PYTHONPATH="$PYTHONPATH:/path_to_myapp/myapp/myapp/"

回答 2

导出路径是一个好方法。另一种方法是将.pth添加到您的站点包位置。在我的Mac上,我的python将站点包保存在/ Library / Python中,如下所示

/Library/Python/2.7/site-packages

我在/Library/Python/2.7/site-packages/awesome.pth创建了一个名为awesome.pth的文件,并在文件中放置了以下引用我的超赞模块的路径

/opt/awesome/custom_python_modules

Exporting the path is a good way. Another way is to add a .pth file to your site-packages location. On my Mac, Python keeps site-packages under /Library/Python, as shown below:

/Library/Python/2.7/site-packages

I created a file called awesome.pth at /Library/Python/2.7/site-packages/awesome.pth and put in it the following path, which references my awesome modules:

/opt/awesome/custom_python_modules
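To try the .pth mechanism without touching the real site-packages, you can point site.addsitedir() at a scratch directory. In this sketch the two temporary directories are stand-ins for site-packages and /opt/awesome/custom_python_modules.

```python
import os
import site
import sys
import tempfile

# Stand-ins for site-packages and the custom modules directory.
site_dir = tempfile.mkdtemp()
modules_dir = tempfile.mkdtemp()

# A .pth file lists one directory per line.
with open(os.path.join(site_dir, "awesome.pth"), "w") as fh:
    fh.write(modules_dir + "\n")

# addsitedir() adds site_dir to sys.path and processes the .pth files in it,
# which is what Python does for site-packages at start-up.
site.addsitedir(site_dir)
print(os.path.abspath(modules_dir) in sys.path)
```

Lines in a .pth file that point to directories that do not exist are skipped, so a typo in the path fails quietly.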

回答 3

你可以试试

from myapp.myapp import SomeObject

因为您的项目名称与myapp.py相同,因此它会首先搜索项目文档

You can try

from myapp.myapp import SomeObject

because your project name is the same as myapp.py, which makes Python search the project directory first.


回答 4

在您的第一个myapp目录中,u可以添加setup.py文件,并在setup.py中添加两个python代码

from setuptools import setup
setup(name='myapp')

在命令行的第一个myapp目录中,使用pip install -e。安装软件包

In your first myapp directory, you can add a setup.py file containing these two lines of Python code:

from setuptools import setup
setup(name='myapp')

Then, on the command line in your first myapp directory, use pip install -e . to install the package.


回答 5

pip installWindows 10上的默认设置为安装在“ Program Files / PythonXX / Lib / site-packages”中,该目录需要管理权限。因此,我通过以管理员身份运行pip install解决了我的问题 (即使您使用管理员帐户登录,也必须以管理员身份打开命令提示符)。另外,从python调用pip更安全。
例如
python -m pip install <package-name>
代替
pip install <package-name>

pip install on Windows 10 defaults to installing in ‘Program Files/PythonXX/Lib/site-packages’ which is a directory that requires administrative privileges. So I fixed my issue by running pip install as Administrator (you have to open command prompt as administrator even if you are logged in with an admin account). Also, it is safer to call pip from python.
e.g.
python -m pip install <package-name>
instead of
pip install <package-name>


回答 6

就我而言,尽管Windows文件名不区分大小写,但Python导入却使Windows vs Python感到惊讶。因此,如果您有Stuff.py文件,则需要按原样导入此名称。

In my case it was a Windows vs. Python surprise: although Windows filenames are not case sensitive, Python imports are. So if you have a Stuff.py file, you need to import it under exactly that name.


回答 7

你需要

__init__.py

在所有您需要与之交互的代码的文件夹中。即使您尝试导入的文件处于同一级别,也需要在每次导入时指定项目的顶级文件夹名称。

You need to have

__init__.py

in all the folders that have code you need to interact with. You also need to specify the top folder name of your project in every import even if the file you tried to import is at the same level.


如何在Python中创建命名空间包?

问题:如何在Python中创建命名空间包?

在Python中,命名空间包可让您在多个项目中传播Python代码。当您要将相关的库作为单独的下载发布时,这很有用。例如,目录Package-1Package-2PYTHONPATH

Package-1/namespace/__init__.py
Package-1/namespace/module1/__init__.py
Package-2/namespace/__init__.py
Package-2/namespace/module2/__init__.py

最终用户可以import namespace.module1import namespace.module2

定义命名空间包的最佳方法是什么,以便多个Python产品可以在该命名空间中定义模块?

In Python, a namespace package allows you to spread Python code among several projects. This is useful when you want to release related libraries as separate downloads. For example, with the directories Package-1 and Package-2 in PYTHONPATH,

Package-1/namespace/__init__.py
Package-1/namespace/module1/__init__.py
Package-2/namespace/__init__.py
Package-2/namespace/module2/__init__.py

the end-user can import namespace.module1 and import namespace.module2.

What’s the best way to define a namespace package so more than one Python product can define modules in that namespace?


回答 0

TL; DR:

在Python 3.3上,您无需执行任何操作,只需将任何内容都不放在__init__.py命名空间包目录中就可以了。在3.3之前的版本中,请选择一种pkgutil.extend_path()解决方案pkg_resources.declare_namespace(),因为它是面向未来的并且已经与隐式命名空间包兼容。


Python 3.3引入了隐式命名空间包,请参阅PEP 420

这意味着一个对象现在可以创建三种类型的对象import foo

  • foo.py文件代表的模块
  • 常规软件包,由foo包含__init__.py文件的目录表示
  • 一个命名空间包,由一个或多个目录表示,foo没有任何__init__.py文件

包也是模块,但是当我说“模块”时,我的意思是“非包模块”。

首先,它扫描sys.path模块或常规软件包。如果成功,它将停止搜索并创建并初始化模块或程序包。如果没有找到模块或常规包,但是找到了至少一个目录,它将创建并初始化一个命名空间包。

模块和常规软件包已__file__设置.py为创建它们的文件。常规和命名空间包已__path__设置为创建它们的目录。

完成此操作后import foo.bar,将首先针对进行上述搜索foo,然后如果找到了软件包,bar则使用foo.__path__搜索路径而不是进行搜索sys.path。如果foo.bar找到,foofoo.bar创建和初始化。

那么常规软件包和命名空间软件包如何混合使用?通常它们不会,但是旧的pkgutil显式命名空间包方法已扩展为包括隐式命名空间包。

如果您已有这样的常规软件包__init__.py

from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

…遗留行为是在搜索到的路径中将其他任何常规软件包添加到其__path__。但是在Python 3.3中,它也添加了命名空间包。

因此,您可以具有以下目录结构:

├── path1
│   └── package
│       ├── __init__.py
│       └── foo.py
├── path2
│   └── package
│       └── bar.py
└── path3
    └── package
        ├── __init__.py
        └── baz.py

……只要两个__init__.pyextend_path行(和path1path2path3在你的sys.pathimport package.fooimport package.bar并且import package.baz将所有的工作。

pkg_resources.declare_namespace(__name__) 尚未更新为包括隐式命名空间包。

TL;DR:

On Python 3.3 you don’t have to do anything, just don’t put any __init__.py in your namespace package directories and it will just work. On pre-3.3, choose the pkgutil.extend_path() solution over the pkg_resources.declare_namespace() one, because it’s future-proof and already compatible with implicit namespace packages.


Python 3.3 introduces implicit namespace packages, see PEP 420.

This means there are now three types of object that can be created by an import foo:

  • A module represented by a foo.py file
  • A regular package, represented by a directory foo containing an __init__.py file
  • A namespace package, represented by one or more directories foo without any __init__.py files

Packages are modules too, but here I mean “non-package module” when I say “module”.

First it scans sys.path for a module or regular package. If it succeeds, it stops searching and creates and initializes the module or package. If it found no module or regular package, but it found at least one directory, it creates and initializes a namespace package.

Modules and regular packages have __file__ set to the .py file they were created from. Regular and namespace packages have __path__ set to the directory or directories they were created from.

When you do import foo.bar, the above search happens first for foo, then if a package was found, the search for bar is done with foo.__path__ as the search path instead of sys.path. If foo.bar is found, foo and foo.bar are created and initialized.

So how do regular packages and namespace packages mix? Normally they don’t, but the old pkgutil explicit namespace package method has been extended to include implicit namespace packages.

If you have an existing regular package that has an __init__.py like this:

from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

… the legacy behavior is to add any other regular packages on the searched path to its __path__. But in Python 3.3, it also adds namespace packages.

So you can have the following directory structure:

├── path1
│   └── package
│       ├── __init__.py
│       └── foo.py
├── path2
│   └── package
│       └── bar.py
└── path3
    └── package
        ├── __init__.py
        └── baz.py

… and as long as the two __init__.py have the extend_path lines (and path1, path2 and path3 are in your sys.path), import package.foo, import package.bar and import package.baz will all work.

pkg_resources.declare_namespace(__name__) has not been updated to include implicit namespace packages.
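Here is a small sketch of the implicit (PEP 420) case on Python 3.3+. The namespace name demo_ns_implicit and the two temporary directories are invented for the demonstration. The point is only that no __init__.py exists anywhere, yet both portions are importable.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two separate "projects", each contributing to the same namespace.
# There is no __init__.py anywhere in demo_ns_implicit/.
path1 = Path(tempfile.mkdtemp())
path2 = Path(tempfile.mkdtemp())
(path1 / "demo_ns_implicit").mkdir()
(path2 / "demo_ns_implicit").mkdir()
(path1 / "demo_ns_implicit" / "foo.py").write_text("NAME = 'foo'\n")
(path2 / "demo_ns_implicit" / "bar.py").write_text("NAME = 'bar'\n")

sys.path[:0] = [str(path1), str(path2)]
importlib.invalidate_caches()

import demo_ns_implicit.foo
import demo_ns_implicit.bar

print(demo_ns_implicit.foo.NAME, demo_ns_implicit.bar.NAME)
# __path__ has one entry per contributing directory.
print(len(list(demo_ns_implicit.__path__)))
```

If either directory gained an __init__.py, that directory would become a regular package and win the search, hiding the other portion unless the extend_path lines are added.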


回答 1

有一个称为pkgutil的标准模块,您可以使用该模块将模块“附加”到给定的命名空间。

使用目录结构,您可以提供:

Package-1/namespace/__init__.py
Package-1/namespace/module1/__init__.py
Package-2/namespace/__init__.py
Package-2/namespace/module2/__init__.py

你应该把在这两个两行Package-1/namespace/__init__.pyPackage-2/namespace/__init__.py(*):

from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

(*由于-除非您声明它们之间的依赖关系-您不知道将首先识别其中的哪个- 有关更多信息,请参见PEP 420

文档所述

这将添加到以包命名__path__的目录的所有目录子目录中sys.path

从现在开始,您应该能够独立分发这两个软件包。

There’s a standard module, called pkgutil, with which you can ‘append’ modules to a given namespace.

With the directory structure you’ve provided:

Package-1/namespace/__init__.py
Package-1/namespace/module1/__init__.py
Package-2/namespace/__init__.py
Package-2/namespace/module2/__init__.py

You should put those two lines in both Package-1/namespace/__init__.py and Package-2/namespace/__init__.py (*):

from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

(*) This is because, unless you state a dependency between them, you don’t know which of them will be recognized first. See PEP 420 for more information.

As the documentation says:

This will add to the package’s __path__ all subdirectories of directories on sys.path named after the package.

From now on, you should be able to distribute those two packages independently.
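A quick way to convince yourself that this works is the sketch below. It uses a hypothetical namespace demo_ns_explicit and, for brevity, plain module1.py / module2.py files instead of the module1/ and module2/ sub-packages from the question. The two temporary directories stand in for Package-1 and Package-2.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The two lines every namespace/__init__.py must contain.
INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)

roots = []
for module in ("module1", "module2"):
    root = Path(tempfile.mkdtemp())  # stands in for Package-1 / Package-2
    ns_dir = root / "demo_ns_explicit"
    ns_dir.mkdir()
    (ns_dir / "__init__.py").write_text(INIT)
    (ns_dir / (module + ".py")).write_text("NAME = %r\n" % module)
    roots.append(str(root))

sys.path[:0] = roots
importlib.invalidate_caches()

import demo_ns_explicit.module1
import demo_ns_explicit.module2

print(demo_ns_explicit.module1.NAME, demo_ns_explicit.module2.NAME)
```

Only the first __init__.py found on sys.path is executed, which is why both copies need the extend_path lines.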


回答 2

本节应该很不言自明。

简而言之,将命名空间代码放入中__init__.py,进行更新setup.py以声明一个命名空间,您可以随意使用。

This section should be pretty self-explanatory.

In short, put the namespace code in __init__.py, update setup.py to declare a namespace, and you are free to go.


回答 3

这是一个古老的问题,但是最近有人在我的博客上评论说,我有关命名空间包的帖子仍然有意义,因此我想在这里链接到它,因为它提供了如何实现它的实用示例:

https://web.archive.org/web/20150425043954/http://cdent.tumblr.com/post/216241761/python-namespace-packages-for-tiddlyweb

链接到本文以了解发生的主要事情:

http://www.siafoo.net/article/77#multiple-distributions-one-virtual-package

__import__("pkg_resources").declare_namespace(__name__)技巧是相当多的驱动器在插件管理TiddlyWeb和迄今似乎是工作了。

This is an old question, but someone recently commented on my blog that my posting about namespace packages was still relevant, so thought I would link to it here as it provides a practical example of how to make it go:

https://web.archive.org/web/20150425043954/http://cdent.tumblr.com/post/216241761/python-namespace-packages-for-tiddlyweb

That links to this article for the main guts of what’s going on:

http://www.siafoo.net/article/77#multiple-distributions-one-virtual-package

The __import__("pkg_resources").declare_namespace(__name__) trick pretty much drives the management of plugins in TiddlyWeb and thus far seems to be working out.


回答 4

您已经将Python命名空间概念放到了最前面,在python中无法将包放入模块中。软件包中包含模块,而不是相反。

Python包只是包含__init__.py文件的文件夹。模块是包中(或直接在上PYTHONPATH)具有.py扩展名的任何其他文件。因此,在您的示例中,您有两个包,但未定义任何模块。如果您认为软件包是文件系统文件夹,而模块是文件,那么您会明白为什么软件包包含模块,而不是相反。

因此,在示例中,假设Package-1和Package-2是放在Python路径上的文件系统上的文件夹,则可以具有以下内容:

Package-1/
  namespace/
  __init__.py
  module1.py
Package-2/
  namespace/
  __init__.py
  module2.py

你现在有一个包namespace有两个模块module1module2。除非您有充分的理由,否则您应该将模块放在文件夹中,并且仅将其放在python路径中,如下所示:

Package-1/
  namespace/
  __init__.py
  module1.py
  module2.py

You have your Python namespace concepts back to front; it is not possible in Python to put packages into modules. Packages contain modules, not the other way around.

A Python package is simply a folder containing an __init__.py file. A module is any other file in a package (or directly on the PYTHONPATH) that has a .py extension. So in your example you have two packages but no modules defined. If you consider that a package is a file system folder and a module is a file, then you see why packages contain modules and not the other way around.

So in your example, assuming Package-1 and Package-2 are folders on the file system that you have put on the Python path, you can have the following:

Package-1/
  namespace/
    __init__.py
    module1.py
Package-2/
  namespace/
    __init__.py
    module2.py

You now have one package namespace with two modules, module1 and module2. Unless you have a good reason, you should probably put the modules in one folder and have only that on the Python path, like below:

Package-1/
  namespace/
    __init__.py
    module1.py
    module2.py

Python 3.3+中的软件包不需要__init__.py吗

问题:Python 3.3+中的软件包不需要__init__.py吗

我正在使用Python 3.5.1。我在这里阅读了文档和包部分:https : //docs.python.org/3/tutorial/modules.html#packages

现在,我具有以下结构:

/home/wujek/Playground/a/b/module.py

module.py

class Foo:
    def __init__(self):
        print('initializing Foo')

现在,在/home/wujek/Playground

~/Playground $ python3
>>> import a.b.module
>>> a.b.module.Foo()
initializing Foo
<a.b.module.Foo object at 0x100a8f0b8>

同样,现在在家里,超级文件夹Playground

~ $ PYTHONPATH=Playground python3
>>> import a.b.module
>>> a.b.module.Foo()
initializing Foo
<a.b.module.Foo object at 0x10a5fee10>

实际上,我可以做各种事情:

~ $ PYTHONPATH=Playground python3
>>> import a
>>> import a.b
>>> import Playground.a.b

为什么这样做?我虽然__init__.py两者都需要文件(空文件可以工作),a并且bmodule.py在Python路径指向Playground文件夹时可导入?

这似乎与Python 2.7有所不同:

~ $ PYTHONPATH=Playground python
>>> import a
ImportError: No module named a
>>> import a.b
ImportError: No module named a.b
>>> import a.b.module
ImportError: No module named a.b.module

随着__init__.py在这两个~/Playground/a~/Playground/a/b它工作正常。

I am using Python 3.5.1. I read the document and the package section here: https://docs.python.org/3/tutorial/modules.html#packages

Now, I have the following structure:

/home/wujek/Playground/a/b/module.py

module.py:

class Foo:
    def __init__(self):
        print('initializing Foo')

Now, while in /home/wujek/Playground:

~/Playground $ python3
>>> import a.b.module
>>> a.b.module.Foo()
initializing Foo
<a.b.module.Foo object at 0x100a8f0b8>

Similarly, now in home, superfolder of Playground:

~ $ PYTHONPATH=Playground python3
>>> import a.b.module
>>> a.b.module.Foo()
initializing Foo
<a.b.module.Foo object at 0x10a5fee10>

Actually, I can do all kinds of stuff:

~ $ PYTHONPATH=Playground python3
>>> import a
>>> import a.b
>>> import Playground.a.b

Why does this work? I thought there needed to be __init__.py files (empty ones would work) in both a and b for module.py to be importable when the Python path points to the Playground folder?

This seems to have changed from Python 2.7:

~ $ PYTHONPATH=Playground python
>>> import a
ImportError: No module named a
>>> import a.b
ImportError: No module named a.b
>>> import a.b.module
ImportError: No module named a.b.module

With __init__.py in both ~/Playground/a and ~/Playground/a/b it works fine.


回答 0

Python 3.3+具有隐式命名空间包,允许它创建不带__init__.py文件的包。

允许隐式命名空间包意味着可以完全放弃提供__init__.py文件的要求,并使其受到影响。

__init__.py文件的旧方法仍然可以在Python 2中使用。

Python 3.3+ has Implicit Namespace Packages that allow it to create packages without an __init__.py file.

Allowing implicit namespace packages means that the requirement to provide an __init__.py file can be dropped completely, and affected … .

The old way with __init__.py files still works as in Python 2.


回答 1

重要

@Mike的回答是正确的,但过于精确。确实,Python 3.3+支持隐式命名空间包,这允许它创建不带__init__.py文件的包。

但是,这仅适用于EMPTY__init__.py文件。因此,EMPTY__init__.py文件不再是必需的,可以省略。如果要在导入软件包或其任何模块或子软件包时运行特定的初始化脚本,则仍然需要__init__.py文件。这是一个很好的Stack Overflow答案,它说明了为什么您想使用__init__.py文件进行一些进一步的初始化,以防您想知道为什么这样做有任何用处。

目录结构示例:

  parent_package/
     __init__.py            <- EMPTY, NOT NECESSARY in Python 3.3+
     child_package/
          __init__.py       <- STILL REQUIRED if you want to run an initialization script
          child1.py
          child2.py
          child3.py

parent_package/child_package/__init__.py

print("from parent")

例子

以下示例演示了如何在执行初始化脚本时执行初始化脚本。 child_package导入或其中一个模块。

范例1

from parent_package import child_package  # prints "from parent"

范例2

from parent_package.child_package import child1  # prints "from parent"

IMPORTANT

@Mike’s answer is correct but too imprecise. It is true that Python 3.3+ supports Implicit Namespace Packages, which allow it to create a package without an __init__.py file.

This, however, ONLY applies to EMPTY __init__.py files. So EMPTY __init__.py files are no longer necessary and can be omitted. If you want to run a particular initialization script when the package or any of its modules or sub-packages are imported, you still require an __init__.py file. This is a great Stack Overflow answer for why you would want to use an __init__.py file to do some further initialization, in case you are wondering why this is in any way useful.

Directory Structure Example:

  parent_package/
     __init__.py            <- EMPTY, NOT NECESSARY in Python 3.3+
     child_package/
          __init__.py       <- STILL REQUIRED if you want to run an initialization script
          child1.py
          child2.py
          child3.py

parent_package/child_package/__init__.py:

print("from parent")

EXAMPLES

The below examples demonstrate how the initialization script is executed when the child_package or one of its modules is imported.

Example 1:

from parent_package import child_package  # prints "from parent"

Example 2:

from parent_package.child_package import child1  # prints "from parent"
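The behaviour can be checked with a self-contained sketch like the one below. The names demo_parent and demo_child are placeholders, and the __init__.py sets a flag in addition to printing so that the effect is easy to inspect afterwards.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
child = root / "demo_parent" / "demo_child"
child.mkdir(parents=True)

# demo_parent/ has no __init__.py, so it becomes a namespace package.
# demo_child/ keeps its __init__.py because it carries initialization code.
(child / "__init__.py").write_text(
    "INITIALIZED = True\nprint('from demo_child/__init__.py')\n"
)
(child / "child1.py").write_text("VALUE = 1\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing a submodule first imports its parent packages, which runs
# demo_child/__init__.py once.
from demo_parent.demo_child import child1

print(sys.modules["demo_parent.demo_child"].INITIALIZED)
```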

回答 2

如果你有 setup.py在项目中并在其中使用find_packages(),则必须__init__.py在每个目录中都有一个文件,以便自动找到软件包。

程序包仅在包含__init__.py文件的情况下才被识别

UPD:如果你想使用隐式命名空间包,而__init__.py你只需要使用find_namespace_packages()替代

文件

If you have setup.py in your project and you use find_packages() within it, it is necessary to have an __init__.py file in every directory for packages to be automatically found.

Packages are only recognized if they include an __init__.py file

UPD: If you want to use implicit namespace packages without __init__.py you just have to use find_namespace_packages() instead

Docs


回答 3

我要说的是,__init__.py只有一个人想要拥有隐式命名空间包时,才应该省略它。如果您不知道这意味着什么,则可能不想要它,因此您应该__init__.py在Python 3中继续使用even。

I would say that one should omit the __init__.py only if one wants to have the implicit namespace package. If you don’t know what it means, you probably don’t want it and therefore you should continue to use the __init__.py even in Python 3.