Why compile Python code?

Question: Why compile Python code?

Why would you compile a Python script? You can run them directly from the .py file and it works fine, so is there a performance advantage or something?

I also notice that some files in my application get compiled into .pyc while others do not, why is this?


Answer 0

It’s compiled to bytecode, which Python can load much faster than it can parse and compile the source again.

The reason some files aren’t compiled is that the main script, which you invoke with python main.py, is recompiled every time you run it. All imported modules are compiled and the result is stored on disk.
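As a rough illustration of the import-time caching described above, the sketch below writes a throwaway module (the name helper_demo is made up for this example), imports it, and checks whether a cache file appeared. It assumes CPython 3, where the cache normally lives in a __pycache__ directory next to the source, and it temporarily forces sys.dont_write_bytecode off so the result does not depend on the environment.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def import_and_find_cache():
    """Import a temporary module and report whether a .pyc was cached for it."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "helper_demo.py")  # hypothetical module name
        with open(src, "w", encoding="utf-8") as f:
            f.write("VALUE = 42\n")

        sys.path.insert(0, tmp)
        old_flag = sys.dont_write_bytecode
        sys.dont_write_bytecode = False  # make sure caching is allowed
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("helper_demo")
            # Where the import system stores the bytecode for this source file
            cached = importlib.util.cache_from_source(src)
            return module.VALUE, os.path.exists(cached)
        finally:
            sys.dont_write_bytecode = old_flag
            sys.path.remove(tmp)
            sys.modules.pop("helper_demo", None)


value, was_cached = import_and_find_cache()
print(value, was_cached)
```

The script you pass on the command line goes through the same compile step, but the result is kept in memory only, which is why no cache file shows up for it.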

Important addition by Ben Blank:

It’s worth noting that while running a compiled script has a faster startup time (as it doesn’t need to be compiled), it doesn’t run any faster.


Answer 1

The .pyc file is Python that has already been compiled to byte-code. When you import a module, Python automatically loads the .pyc file if it finds an up-to-date one for that .py file.

“An Introduction to Python” says this about compiled Python files:

A program doesn’t run any faster when it is read from a ‘.pyc’ or ‘.pyo’ file than when it is read from a ‘.py’ file; the only thing that’s faster about ‘.pyc’ or ‘.pyo’ files is the speed with which they are loaded.

The advantage of running a .pyc file is that Python doesn’t have to incur the overhead of compiling it before running it. Since Python would compile to byte-code before running a .py file anyway, there shouldn’t be any performance improvement aside from that.

How much improvement can you get from using compiled .pyc files? That depends on what the script does. For a very brief script that simply prints “Hello World,” compiling could constitute a large percentage of the total startup-and-run time. But the cost of compiling a script relative to the total run time diminishes for longer-running scripts.
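To get a feel for that trade-off, here is a minimal timing sketch. It uses the built-in compile() and exec() to separate the compile cost from the run cost of a small piece of source; the function name and the sample sources are invented for illustration, and the numbers will vary by machine.

```python
import time


def measure_compile_and_run(source, repeats=100):
    """Return (average compile seconds, average run seconds) for a source string."""
    start = time.perf_counter()
    for _ in range(repeats):
        code = compile(source, "<demo>", "exec")
    compile_seconds = (time.perf_counter() - start) / repeats

    start = time.perf_counter()
    for _ in range(repeats):
        exec(code, {})  # run the already-compiled code object
    run_seconds = (time.perf_counter() - start) / repeats
    return compile_seconds, run_seconds


short_script = "print_me = 'Hello World'\n"
long_running = "total = sum(i * i for i in range(200000))\n"

for label, src in (("short", short_script), ("long-running", long_running)):
    c, r = measure_compile_and_run(src, repeats=20)
    print(f"{label}: compile {c:.6f}s, run {r:.6f}s")
```

For the short script the compile time is typically a noticeable share of the total, while for the long-running one it tends to disappear into the run time.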

The script you name on the command-line is never saved to a .pyc file. Only modules loaded by that “main” script are saved in that way.


Answer 2

Pluses:

First: mild, defeatable obfuscation.

Second: if compilation results in a significantly smaller file, you will get faster load times. Nice for the web.

Third: Python can skip the compilation step. Faster at initial load. Nice for the CPU and the web.

Fourth: the more you comment, the smaller the .pyc or .pyo file will be in comparison to the source .py file.

Fifth: an end user with only a .pyc or .pyo file in hand is much less likely to present you with a bug they caused by an un-reverted change they forgot to tell you about.

Sixth: if you’re aiming at an embedded system, obtaining a smaller size file to embed may represent a significant plus, and the architecture is stable so drawback one, detailed below, does not come into play.

Top level compilation

It is useful to know that you can compile a top level python source file into a .pyc file this way:

python -m py_compile myscript.py

This removes comments. It leaves docstrings intact. If you’d like to get rid of the docstrings as well (you might want to seriously think about why you’re doing that) then compile this way instead…

python -OO -m py_compile myscript.py

…and you’ll get a .pyo file instead of a .pyc file (on Python 3.5 and later, the optimized file is instead a .pyc with an .opt-2 tag in its name); it is equally distributable in terms of the code’s essential functionality, but smaller by the size of the stripped-out docstrings (and harder to understand later if it had decent docstrings in the first place). But see drawback three, below.
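The two commands above have a programmatic counterpart in the standard py_compile module. The sketch below is only an illustration: it assumes Python 3.5 or later (where the -OO result is named with an .opt-2 tag rather than .pyo), and the file name and sample source are made up.

```python
import os
import py_compile
import tempfile

SAMPLE = '''\
def area(width, height):
    """Return the area of a rectangle.

    This docstring is deliberately long so that stripping it is visible
    in the size of the compiled file.
    """
    # This comment never reaches the bytecode at any optimization level.
    return width * height
'''


def compile_both_ways(source_text):
    """Compile at level 0 and level 2; return (name, size) for each result."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "myscript.py")
        with open(src, "w", encoding="utf-8") as f:
            f.write(source_text)
        # Roughly what "python -m py_compile myscript.py" does
        normal = py_compile.compile(src, doraise=True, optimize=0)
        # Roughly what "python -OO -m py_compile myscript.py" does
        stripped = py_compile.compile(src, doraise=True, optimize=2)
        return (
            (os.path.basename(normal), os.path.getsize(normal)),
            (os.path.basename(stripped), os.path.getsize(stripped)),
        )


normal_info, stripped_info = compile_both_ways(SAMPLE)
print(normal_info)
print(stripped_info)
```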

Note that Python uses the .py file’s modification date, if the file is present, to decide whether it should execute the .py file as opposed to the .pyc or .pyo file. So if you edit your .py file, the .pyc or .pyo is obsolete and whatever benefits you gained are lost. You need to recompile it in order to get the .pyc or .pyo benefits back again, such as they may be.
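The date check can be seen in the compiled file itself. The following sketch assumes CPython 3.7 or later, where a .pyc starts with a 16-byte header: a 4-byte magic number, 4 bytes of flags, then the source file's modification time and size. The helper names are invented for this example, and timestamp-based invalidation is requested explicitly so the result does not depend on environment settings.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def read_pyc_header(pyc_path):
    """Return (magic, flags, mtime, size) from the first 16 bytes of a .pyc."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    magic = header[:4]
    flags, mtime, size = struct.unpack("<III", header[4:16])
    return magic, flags, mtime, size


def header_matches_source():
    """Compile a temporary file and compare its .pyc header with the source."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "myscript.py")
        with open(src, "w", encoding="utf-8") as f:
            f.write("answer = 42\n")
        pyc = py_compile.compile(
            src,
            doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        magic, flags, mtime, size = read_pyc_header(pyc)
        stat = os.stat(src)
        return (
            magic == importlib.util.MAGIC_NUMBER,  # same interpreter version
            flags == 0,  # 0 means "validate by timestamp"
            mtime == int(stat.st_mtime) & 0xFFFFFFFF,
            size == stat.st_size & 0xFFFFFFFF,
        )


print(header_matches_source())
```

When the stored time or size no longer matches the .py file, the import system treats the cached bytecode as stale and recompiles.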

Drawbacks:

First: There’s a “magic number” in .pyc and .pyo files that identifies the bytecode format of the Python version that compiled the file. If you distribute one of these files into an environment running a different Python version, it will break. If you distribute the .pyc or .pyo without the associated .py to recompile or touch so it supersedes the .pyc or .pyo, the end user can’t fix it, either.

Second: If docstrings are skipped with the use of the -OO command line option as described above, no one will be able to get at that information, which can make use of the code more difficult (or impossible).

Third: Python’s -OO option also implements some optimizations as per the -O command line option; this may result in changes in operation. Known optimizations are:

  • sys.flags.optimize = 1
  • assert statements are skipped
  • __debug__ = False

Fourth: if you had intentionally made your python script executable with something on the order of #!/usr/bin/python on the first line, this is stripped out in .pyc and .pyo files and that functionality is lost.

Fifth: somewhat obvious, but if you compile your code, not only can its use be impacted, but the potential for others to learn from your work is reduced, often severely.
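The behaviour changes listed under drawback three can be checked without restarting the interpreter, because the built-in compile() accepts an optimize argument that mirrors the command line (0 for no flag, 1 for -O, 2 for -OO). This is a small sketch with an invented sample function, not a statement about any particular project.

```python
SOURCE = '''
def checked(value):
    """Return value after a sanity check."""
    assert value >= 0, "value must be non-negative"
    return value

debug_flag = __debug__
'''


def run_at_level(level):
    """Compile SOURCE at the given optimization level and report what changed."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=level), namespace)
    try:
        namespace["checked"](-1)
        assert_ran = False  # the assert was stripped, so nothing was raised
    except AssertionError:
        assert_ran = True
    return {
        "debug": namespace["debug_flag"],
        "assert_ran": assert_ran,
        "doc": namespace["checked"].__doc__,
    }


for level in (0, 1, 2):
    print(level, run_at_level(level))
```

At level 1 the assert disappears and __debug__ becomes False; at level 2 the docstring is gone as well, so code that relies on either will behave differently.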


Answer 3

There is a performance increase in running compiled python. However when you run a .py file as an imported module, python will compile and store it, and as long as the .py file does not change it will always use the compiled version.

With any interpreted language, when the file is used the process looks something like this:
1. File is processed by the interpreter.
2. File is compiled.
3. Compiled code is executed.

Obviously, by using pre-compiled code you can eliminate step 2. This applies to Python, PHP and others.
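The three steps can be sketched with Python's own built-ins. Here compile() stands in for step 2 and exec() for step 3, while marshal (the serialization format CPython uses for the body of a .pyc) stands in for saving the compiled result so that step 2 can be skipped next time. The sample source is made up.

```python
import marshal

source = "result = 6 * 7\n"  # step 1: the source text the interpreter reads


def compile_step(text):
    """Step 2: turn source text into a code object (bytecode)."""
    return compile(text, "<demo>", "exec")


def execute_step(code):
    """Step 3: run a code object and return the value it computed."""
    namespace = {}
    exec(code, namespace)
    return namespace["result"]


# First run: all three steps, then store the compiled form as bytes.
code = compile_step(source)
stored = marshal.dumps(code)

# Later run: load the stored bytecode and go straight to step 3.
restored = marshal.loads(stored)
print(execute_step(code), execute_step(restored))
```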

Here’s an interesting blog post explaining the differences: http://julipedia.blogspot.com/2004/07/compiled-vs-interpreted-languages.html
And here’s an entry that explains the Python compile process: http://effbot.org/zone/python-compile.htm


Answer 4

As already mentioned, you can get a performance increase from having your python code compiled into bytecode. This is usually handled by python itself, for imported scripts only.

Another reason you might want to compile your Python code could be to protect your intellectual property from being copied and/or modified.

You can read more about this in the Python documentation.


Answer 5

There’s certainly a performance difference when running a compiled script. If you run a normal .py script, the machine compiles it every time it is run, and this takes time. On modern machines this is hardly noticeable, but as the script grows it may become more of an issue.


Answer 6

Something not touched upon is source-to-source compiling. For example, Nuitka translates Python code to C/C++ and compiles it to binary code that runs directly on the CPU, instead of Python bytecode, which runs on the slower virtual machine.

This can lead to significant speedups, or it would let you work with Python while your environment depends on C/C++ code.


Answer 7

We use compiled code to distribute to users who do not have access to the source code. Basically, this is to stop inexperienced programmers from accidentally changing something or fixing bugs without telling us.
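One way such a source-free distribution might be produced is sketched below. It assumes CPython 3, where a module can be imported without its .py only if the .pyc sits where the source would have been, which is what the legacy=True option of compileall.compile_dir arranges. The module name shipped_demo is hypothetical.

```python
import compileall
import importlib
import os
import sys
import tempfile


def build_sourceless_demo():
    """Compile a module, delete its source, and import it from the .pyc alone."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "shipped_demo.py")  # hypothetical module
        with open(src, "w", encoding="utf-8") as f:
            f.write("def greet():\n    return 'hello'\n")

        # legacy=True writes shipped_demo.pyc next to the source
        # instead of into __pycache__.
        ok = compileall.compile_dir(tmp, legacy=True, quiet=1)
        os.remove(src)  # ship only the compiled file

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("shipped_demo")
            return bool(ok), module.greet(), sorted(os.listdir(tmp))
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("shipped_demo", None)


print(build_sourceless_demo())
```

Keep in mind that a file produced this way only loads on the Python version that compiled it, and that bytecode can be decompiled, so this discourages casual edits rather than hiding the code.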


Answer 8

Yep, performance is the main reason and, as far as I know, the only reason.

If some of your files aren’t getting compiled, maybe Python isn’t able to write to the .pyc file, perhaps because of the directory permissions or something. Or perhaps the uncompiled files just aren’t ever getting loaded… (scripts/modules only get compiled when they first get loaded)


Answer 9

Beginners assume Python is compiled because of .pyc files. The .pyc file is the compiled bytecode, which is then interpreted. So if you’ve run your Python code before and have the .pyc file handy, it will start faster the second time, as it doesn’t have to re-compile the bytecode.
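To see the bytecode that gets interpreted, the standard dis module can list the instructions of a function. The sketch below uses a made-up add function; the exact instruction names differ between Python versions, so no particular listing is promised here.

```python
import dis


def add(a, b):
    return a + b


def instruction_names(func):
    """Return the names of the bytecode instructions compiled for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


# The raw bytecode is a bytes object stored on the function's code object.
print(len(add.__code__.co_code), "bytes of bytecode")
print(instruction_names(add))
```

These instructions are what the Python virtual machine executes one by one; a .pyc simply stores them so they do not have to be regenerated.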

Compiler: A compiler is a piece of code that translates a high-level language into machine language.

Interpreter: An interpreter also converts the high-level language into machine-readable binary equivalents. Each time an interpreter gets high-level code to execute, it converts the code into an intermediate code before converting it into machine code. Each part of the code is interpreted and then executed separately in sequence, and if an error is found in a part of the code, interpretation stops without translating the next set of code.

Sources: http://www.toptal.com/python/why-are-there-so-many-pythons http://www.engineersgarage.com/contribution/difference-between-compiler-and-interpreter