Execution of Python code with -m option or not

Question: Execution of Python code with -m option or not

The Python interpreter has a -m module option that “runs library module module as a script”.

With this Python code in a.py:

if __name__ == "__main__":
    print(__package__)
    print(__name__)

I tested python -m a to get

"" <-- Empty String
__main__

whereas python a.py returns

None <-- None
__main__

To me, those two invocations seem to be the same, except that __package__ is not None when invoked with the -m option.

Interestingly, with python -m runpy a, I get the same result as python -m a, with the Python module compiled to a.pyc.

What’s the (practical) difference between these invocations? Are there any pros and cons between them?

Also, David Beazley’s Python Essential Reference explains it as “The -m option runs a library module as a script which executes inside the __main__ module prior to the execution of the main script”. What does it mean?


Answer 0

When you use the -m command-line flag, Python will import a module or package for you, then run it as a script. When you don’t use the -m flag, the file you named is run as just a script.

The distinction is important when you try to run a package. There is a big difference between:

python foo/bar/baz.py

and

python -m foo.bar.baz

as in the latter case, foo.bar is imported and relative imports will work correctly with foo.bar as the starting point.

Demo:

$ mkdir -p test/foo/bar
$ touch test/foo/__init__.py
$ touch test/foo/bar/__init__.py
$ cat << EOF > test/foo/bar/baz.py 
> if __name__ == "__main__":
>     print(__package__)
>     print(__name__)
> 
> EOF
$ PYTHONPATH=test python test/foo/bar/baz.py 
None
__main__
$ PYTHONPATH=test python -m foo.bar.baz 
foo.bar
__main__

As a result, Python has to actually care about packages when using the -m switch. A normal script can never be a package, so __package__ is set to None.

But run a package, or a module inside a package, with -m and now there is at least the possibility of a package, so the __package__ variable is set to a string value. In the above demonstration it is set to 'foo.bar'; for plain modules not inside a package it is set to an empty string.
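The difference in __package__ can also be reproduced from Python itself. The following is a minimal sketch, assuming Python 3.7 or later and a throwaway directory tree. The names foo, bar and baz mirror the demo above, while the helper compare_invocations is made up for illustration. It runs the same file once by path and once with -m, and returns what each run reports for __package__.

```python
import os
import subprocess
import sys
import tempfile


def compare_invocations():
    """Run one file by path and with -m; return both __package__ reprs."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "foo", "bar")
        os.makedirs(pkg_dir)
        # Mark foo and foo.bar as regular packages.
        for directory in (os.path.join(root, "foo"), pkg_dir):
            open(os.path.join(directory, "__init__.py"), "w").close()
        script = os.path.join(pkg_dir, "baz.py")
        with open(script, "w") as f:
            f.write("print(repr(__package__))\n")

        # Plays the same role as PYTHONPATH=test in the shell demo.
        env = dict(os.environ, PYTHONPATH=root)

        def run(*args):
            result = subprocess.run(
                [sys.executable, *args],
                env=env, capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        return run(script), run("-m", "foo.bar.baz")


if __name__ == "__main__":
    by_path, by_module = compare_invocations()
    print("python foo/bar/baz.py ->", by_path)
    print("python -m foo.bar.baz ->", by_module)
```

Printing repr() rather than the bare value makes an empty string distinguishable from a blank line.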

As for the __main__ module, Python imports scripts being run as it would import regular modules. A new module object is created to hold the global namespace and is stored in sys.modules['__main__']. This is what the __name__ variable refers to: it is a key in that structure.
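A short sketch of how that module object can be inspected. The helper name main_module_info is hypothetical; sys.modules and types.ModuleType are standard library.

```python
import sys
import types


def main_module_info():
    """Return the module object registered as __main__ and its __name__."""
    main = sys.modules["__main__"]
    return main, main.__name__


if __name__ == "__main__":
    module, name = main_module_info()
    print(isinstance(module, types.ModuleType), name)
    # When this file is the entry point, its globals are the
    # namespace of the __main__ module object.
    print(vars(module) is globals())
```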

For packages, you can create a __main__.py module inside the package and have it run when you invoke python -m package_name; in fact, that is the only way you can run a package as a script:

$ PYTHONPATH=test python -m foo.bar
python: No module named foo.bar.__main__; 'foo.bar' is a package and cannot be directly executed
$ cp test/foo/bar/baz.py test/foo/bar/__main__.py
$ PYTHONPATH=test python -m foo.bar
foo.bar
__main__

So, when naming a package for running with -m, Python looks for a __main__ module contained in that package and executes that as a script. Its name is then still set to '__main__' and the module object is still stored in sys.modules['__main__'].
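The same behaviour can be checked programmatically. This is a sketch under assumptions: Python 3.7 or later, a temporary package with the made-up name demo_pkg, and a hypothetical helper run_package. It runs python -m demo_pkg with and without a __main__.py file.

```python
import os
import subprocess
import sys
import tempfile


def run_package(with_main):
    """Run `python -m demo_pkg`, optionally with demo_pkg/__main__.py present."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "demo_pkg")
        os.makedirs(pkg_dir)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        if with_main:
            with open(os.path.join(pkg_dir, "__main__.py"), "w") as f:
                f.write("print(__package__, __name__)\n")
        # Make the temporary directory importable for the child process.
        env = dict(os.environ, PYTHONPATH=root)
        return subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            env=env, capture_output=True, text=True,
        )


if __name__ == "__main__":
    missing = run_package(with_main=False)
    print("without __main__.py: exit code", missing.returncode)
    print(missing.stderr.strip())
    present = run_package(with_main=True)
    print("with __main__.py:", present.stdout.strip())
```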


Answer 1

Execution of Python code with -m option or not

Use the -m flag.

The results are pretty much the same when you have a script, but when you develop a package, there’s no way without the -m flag to get the imports to work correctly if you want to run a subpackage or module in the package as the main entry point to your program (and believe me, I’ve tried).
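To make the import problem concrete, here is a minimal sketch, assuming Python 3.7 or later. The package name app, its modules helper and cli, and the function run_both_ways are all made up for illustration. The cli module uses an explicit relative import, which fails when the file is run by path and works when it is run with -m.

```python
import os
import subprocess
import sys
import tempfile


def run_both_ways():
    """Run app/cli.py by path and as `-m app.cli`; return both results."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "app")
        os.makedirs(pkg_dir)
        files = {
            "__init__.py": "",
            "helper.py": "VALUE = 42\n",
            # An explicit relative import needs a known parent package.
            "cli.py": "from .helper import VALUE\nprint(VALUE)\n",
        }
        for name, source in files.items():
            with open(os.path.join(pkg_dir, name), "w") as f:
                f.write(source)

        env = dict(os.environ, PYTHONPATH=root)

        def run(*args):
            return subprocess.run(
                [sys.executable, *args],
                env=env, capture_output=True, text=True,
            )

        return run(os.path.join(pkg_dir, "cli.py")), run("-m", "app.cli")


if __name__ == "__main__":
    by_path, by_module = run_both_ways()
    print("by path: exit code", by_path.returncode)
    print(by_path.stderr.strip().splitlines()[-1])
    print("with -m:", by_module.stdout.strip())
```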

The docs

Like the docs on the -m flag say:

Search sys.path for the named module and execute its contents as the __main__ module.

and

As with the -c option, the current directory will be added to the start of sys.path.

so

python -m pdb

is roughly equivalent to

python /usr/lib/python3.5/pdb.py

(assuming you don’t have a package or script in your current directory called pdb.py)
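The "roughly" matters because the two forms put different directories at the front of sys.path. A sketch that makes this visible, assuming Python 3.7 or later; the package name tools, the module show and the helper first_path_entries are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def first_path_entries():
    """Return (root, sys.path[0] when run by path, sys.path[0] when run with -m)."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "tools")
        os.makedirs(pkg_dir)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        script = os.path.join(pkg_dir, "show.py")
        with open(script, "w") as f:
            f.write("import sys\nprint(sys.path[0])\n")

        # Drop PYTHONSAFEPATH (Python 3.11+), which would suppress the
        # behaviour being demonstrated.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}

        def run(*args):
            result = subprocess.run(
                [sys.executable, *args],
                cwd=root, env=env, capture_output=True, text=True, check=True,
            )
            # Normalise symlinked temp directories before comparing.
            return os.path.realpath(result.stdout.strip())

        return os.path.realpath(root), run(script), run("-m", "tools.show")


if __name__ == "__main__":
    root, by_path, by_module = first_path_entries()
    print("script's directory first:", by_path == os.path.join(root, "tools"))
    print("current directory first: ", by_module == root)
```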

Explanation:

Behavior is made “deliberately similar to” scripts.

Many standard library modules contain code that is invoked on their execution as a script. An example is the timeit module:

Some Python code is intended to be run as a module (I think this example is better than the one in the command-line option docs):

$ python -m timeit '"-".join(str(n) for n in range(100))'
10000 loops, best of 3: 40.3 usec per loop
$ python -m timeit '"-".join([str(n) for n in range(100)])'
10000 loops, best of 3: 33.4 usec per loop
$ python -m timeit '"-".join(map(str, range(100)))'
10000 loops, best of 3: 25.2 usec per loop

And from the release note highlights for Python 2.4:

The -m command line option – python -m modulename will find a module in the standard library, and invoke it. For example, python -m pdb is equivalent to python /usr/lib/python2.4/pdb.py

Follow-up Question

Also, David Beazley’s Python Essential Reference explains it as “The -m option runs a library module as a script which executes inside the __main__ module prior to the execution of the main script”.

It means any module you can look up with an import statement can be run as the entry point of the program, provided it has a code block, usually near the end, guarded by if __name__ == '__main__':.
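A minimal sketch of such a module. The file name greet.py and the functions greet and main are made up for illustration. Saved as greet.py somewhere on sys.path, it can be imported as a library or started with python -m greet.

```python
import sys


def greet(name):
    """Library function, usable after a plain `import greet`."""
    return "Hello, " + name + "!"


def main(argv=None):
    """Command-line entry point; argv defaults to the real arguments."""
    args = sys.argv[1:] if argv is None else argv
    name = args[0] if args else "world"
    print(greet(name))
    return 0


if __name__ == "__main__":
    # True for `python greet.py` and `python -m greet`,
    # False when the module is merely imported.
    main()
```

Keeping the logic in main() rather than directly under the guard means the same function can be called from tests or from another entry point.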

-m without adding the current directory to the path:

A comment elsewhere on this page says:

That the -m option also adds the current directory to sys.path, is obviously a security issue (see: preload attack). This behavior is similar to library search order in Windows (before it had been hardened recently). It’s a pity that Python does not follow the trend and does not offer a simple way to disable adding . to sys.path

Well, this demonstrates the possible issue (on Windows, remove the quotes):

echo "import sys; print(sys.version)" > pdb.py

python -m pdb
3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]

Use the -I flag to lock this down for production environments (new in version 3.4):

python -Im pdb
usage: pdb.py [-c command] ... pyfile [arg] ...
etc...

From the docs:

-I

Run Python in isolated mode. This also implies -E and -s. In isolated mode sys.path contains neither the script’s directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too. Further restrictions may be imposed to prevent the user from injecting malicious code.
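The effect of -I on module lookup can be checked without shadowing a real standard library module. A sketch, assuming Python 3.7 or later; the module name local_only and the helper run_local_module are made up for illustration. The module exists only in the current directory, so -m finds it normally but not in isolated mode.

```python
import os
import subprocess
import sys
import tempfile


def run_local_module(isolated):
    """Run `python [-I] -m local_only` from a directory containing local_only.py."""
    with tempfile.TemporaryDirectory() as cwd:
        with open(os.path.join(cwd, "local_only.py"), "w") as f:
            f.write("print('ran local_only from the current directory')\n")
        flags = ["-I"] if isolated else []
        # Drop PYTHONSAFEPATH (Python 3.11+) so the non-isolated run
        # shows the default behaviour.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
        return subprocess.run(
            [sys.executable, *flags, "-m", "local_only"],
            cwd=cwd, env=env, capture_output=True, text=True,
        )


if __name__ == "__main__":
    print("default: exit code", run_local_module(isolated=False).returncode)
    print("with -I: exit code", run_local_module(isolated=True).returncode)
```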

What does __package__ do?

It enables explicit relative imports, though that is not particularly germane to this question. See this answer: What’s the purpose of the “__package__” attribute in Python?


Answer 2

The main reason to run a module (or package) as a script with -m is to simplify deployment, especially on Windows. You can install scripts in the same place in the Python library where modules normally go – instead of polluting PATH or global executable directories such as ~/.local (the per-user scripts directory is ridiculously hard to find in Windows).

Then you just type -m and Python finds the script automagically. For example, python -m pip will find the correct pip for the same instance of the Python interpreter that executes it. Without -m, if the user has several Python versions installed, which one would be the “global” pip?
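The same idea applies when one Python program starts another: combining sys.executable with -m ties the module to the interpreter that is currently running. A minimal sketch, assuming Python 3.7 or later. The helper run_module is hypothetical, and timeit stands in for pip because it ships with the standard library.

```python
import subprocess
import sys


def run_module(module, *args):
    """Run `python -m module args...` with the interpreter executing this code."""
    return subprocess.run(
        [sys.executable, "-m", module, *args],
        capture_output=True, text=True,
    )


if __name__ == "__main__":
    # 10 loops, 1 repetition of an empty statement: quick to run.
    result = run_module("timeit", "-n", "10", "-r", "1", "pass")
    print(result.stdout.strip())
```

Replacing "timeit" with "pip" would follow the same pattern, assuming pip is installed for that interpreter.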

If the user prefers “classic” entry points for command-line scripts, these can easily be added as small scripts somewhere in PATH, or pip can create them at install time from the entry_points parameter in setup.py.

So just check for __name__ == '__main__' and ignore other non-reliable implementation details.