Why isn't PyPy included in standard Python?

Question: Why isn't PyPy included in standard Python?

I was looking at PyPy and wondering why it hasn’t been adopted into the mainline Python distributions. Wouldn’t things like JIT compilation and a lower memory footprint greatly improve the speed of all Python code?

In short, what are the main drawbacks of PyPy that cause it to remain a separate project?


Answer 0

PyPy is not a fork of CPython, so it could never be merged directly into CPython.

Theoretically the Python community could universally adopt PyPy, PyPy could be made the reference implementation, and CPython could be discontinued. However, PyPy has its own weaknesses:

  • CPython is easy to integrate with Python modules written in C, which is traditionally the way Python applications have handled CPU-intensive tasks (see for instance the SciPy project).
  • The PyPy JIT compilation step itself costs CPU time; it’s only through repeated running of compiled code that it becomes faster overall. This means startup times can be higher, and therefore PyPy isn’t necessarily as efficient for running glue code or trivial scripts.
  • PyPy and CPython behavior is not identical in all respects, especially when it comes to “implementation details” (behavior that is not specified by the language but is still important at a practical level).
  • CPython runs on more architectures than PyPy and has been successfully adapted to run in embedded architectures in ways that may be impractical for PyPy.
  • CPython’s reference counting scheme for memory management arguably has more predictable performance impacts than PyPy’s various GC systems, although this isn’t necessarily true of all “pure GC” strategies.
  • PyPy does not yet fully support Python 3.x, although that is an active work item.
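To illustrate the "implementation details" point in the list above: CPython's reference counting usually runs an object's finalizer as soon as the last reference goes away, whereas a tracing GC such as PyPy's may run it later. The following is a minimal sketch, not code from either project. The `Resource` class and `use_resource` function are hypothetical names chosen for illustration. It shows cleanup done through a context manager, which behaves the same way on both implementations:

```python
import platform


class Resource:
    """Hypothetical resource that must be released promptly."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True

    # Context-manager protocol: close() runs when the with-block exits,
    # regardless of when the garbage collector reclaims the object.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not suppress exceptions


def use_resource():
    """Open and release a Resource, then report whether it was closed."""
    with Resource("example") as res:
        assert not res.closed
    return res.closed


if __name__ == "__main__":
    # Reports the interpreter in use, e.g. "CPython" or "PyPy".
    print(platform.python_implementation(), use_resource())
```

Code that instead relies on `__del__` being called immediately is depending on a CPython implementation detail and may behave differently elsewhere.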

PyPy is a great project, but runtime speed on CPU-intensive tasks isn’t everything, and in many applications it’s the least of many concerns. For instance, Django can run on PyPy and that makes templating faster, but CPython’s database drivers are faster than PyPy’s; in the end, which implementation is more efficient depends on where the bottleneck in a given application is.
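One way to find out where the bottleneck in a given application lies is to profile it before choosing an implementation. The sketch below uses the standard-library `cProfile` and `pstats` modules. The functions `render_template` and `handle_request` are hypothetical stand-ins, not Django code:

```python
import cProfile
import io
import pstats


def render_template(n):
    """Stand-in for CPU-bound pure-Python work (hypothetical)."""
    return "".join(str(i) for i in range(n))


def handle_request():
    """Stand-in for a request handler (hypothetical)."""
    return len(render_template(50_000))


def profile(func):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile(handle_request)
    print(report)
```

If most of the time turns out to be spent in pure-Python code, a JIT is more likely to help. If it is spent in a database driver or a C extension, it may not.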

Another example: you’d think PyPy would be great for games, but most GC strategies like those used in PyPy cause noticeable jitter. For CPython, most of the CPU-intensive game stuff is offloaded to the PyGame library, which PyPy can’t take advantage of since PyGame is primarily implemented as a C extension (though see: pygame-cffi). I still think PyPy can be a great platform for games, but I’ve never seen it actually used.

PyPy and CPython have radically different approaches to fundamental design questions and make different tradeoffs, so neither one is “better” than the other in every case.


Answer 1

For one, it’s not 100% compatible with Python 2.x, and has only preliminary support for 3.x.

It’s also not something that could be merged: the Python implementation provided by PyPy is generated using a framework they have created, which is extremely cool, but also completely different from the existing CPython implementation. It would have to be a complete replacement.

There are some very concrete differences between PyPy and CPython, a big one being how extension modules are supported – which, if you want to go beyond the standard library, is a big deal.
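Libraries that want to work on both implementations often treat a C extension as an optional accelerator and fall back to pure Python when it is unavailable. A minimal sketch of that pattern follows. The module name `_speedups_c` is an assumption made up for illustration and is not a real package:

```python
import platform

try:
    # Hypothetical C extension; the name is an assumption for illustration.
    from _speedups_c import dot
    USING_C_EXTENSION = True
except ImportError:
    USING_C_EXTENSION = False

    def dot(xs, ys):
        """Pure-Python fallback that runs on any implementation."""
        return sum(x * y for x, y in zip(xs, ys))


if __name__ == "__main__":
    print(platform.python_implementation(), USING_C_EXTENSION,
          dot([1, 2, 3], [4, 5, 6]))
```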

It’s also worth noting that PyPy isn’t universally faster.
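Whether PyPy is faster for a particular program is best checked by measuring that program. The sketch below times successive calls to a function with `time.perf_counter`. On a JIT-based implementation the early rounds may include warm-up cost, so looking at every round is more informative than a single run. The `workload` function is a hypothetical placeholder for your own code:

```python
import time


def workload(n=20_000):
    """Hypothetical CPU-bound loop; substitute your own code."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def time_rounds(func, rounds=5):
    """Return the wall-clock duration of each successive call to func."""
    durations = []
    for _ in range(rounds):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    for i, seconds in enumerate(time_rounds(workload), start=1):
        print(f"round {i}: {seconds * 1000:.2f} ms")
```

Running the same script under each interpreter gives a like-for-like comparison for that workload only. It says nothing about other programs.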


Answer 2

See this video by Guido van Rossum. He talks about the same question you asked at 12 min 33 secs.

Highlights:

  • lack of Python 3 compatibility
  • lack of extension support
  • not appropriate as glue code
  • speed is not everything

After all, he’s the one to decide…


Answer 3

One reason might be that, according to the PyPy site, it currently runs only on 32- and 64-bit Intel x86 architectures, while CPython runs on other platforms as well. This is probably due to platform-specific speed enhancements in PyPy. While speed is a good thing, people often want language implementations to be as “platform-independent” as possible.


Answer 4

I recommend watching this keynote by David Beazley for more insights. It answers your question by clarifying the nature and intricacies of PyPy.


Answer 5

In addition to everything that’s been said here, PyPy is not nearly as rock solid as CPython in terms of bugs. With SymPy, we’ve found about a dozen bugs in PyPy over the past couple of years, both in released versions and in the nightlies.

On the other hand, we’ve only ever found one bug in CPython, and that was in a prerelease.

Plus, don’t discount the lack of Python 3 support. No one in the core Python community even cares about Python 2 any more. They are working on the next big things in Python 3.4, which will be the fifth major release of Python 3. The PyPy developers still haven’t caught up with even one of those releases, so they’ve got some catching up to do before they can start to be contenders.

Don’t get me wrong. PyPy is awesome. But it’s still far from being better than CPython in a lot of very important ways.

And by the way, if you use SymPy in PyPy, you won’t see a smaller memory footprint (or a speedup either). See https://bitbucket.org/pypy/pypy/issues/1447/.