Why is there no GIL in the Java Virtual Machine? Why does Python need one so badly?

Question: Why is there no GIL in the Java Virtual Machine? Why does Python need one so badly?

I’m hoping someone can provide some insight as to what’s fundamentally different about the Java Virtual Machine that allows it to implement threads nicely without the need for a Global Interpreter Lock (GIL), while Python necessitates such an evil.


Answer 0

Python (the language) doesn’t need a GIL, which is why it can be implemented perfectly well on the JVM (Jython) and .NET (IronPython), and those implementations multithread freely. CPython (the popular implementation) has always used a GIL for ease of coding (especially the coding of the garbage collection mechanisms) and for ease of integrating non-thread-safe C-coded libraries (there used to be a ton of those around ;-).

The Unladen Swallow project, among other ambitious goals, does plan a GIL-free virtual machine for Python. To quote that site: “In addition, we intend to remove the GIL and fix the state of multithreading in Python. We believe this is possible through the implementation of a more sophisticated GC system, something like IBM’s Recycler (Bacon et al, 2001).”


Answer 1

The JVM (at least HotSpot) does have a concept similar to the “GIL”; it is just much finer in its lock granularity. Most of this comes from the GCs in HotSpot, which are more advanced.

In CPython it’s one big lock (probably not entirely true, but good enough for argument’s sake). In the JVM it’s more spread out, with different concepts depending on where it is used.

Take a look at, for example, vm/runtime/safepoint.hpp in the HotSpot code, which is effectively a barrier. Once at a safepoint, the entire VM has stopped with regard to Java code, much like the Python VM stops at the GIL.

In the Java world such VM pausing events are known as “stop-the-world” pauses. At these points only native code that meets certain criteria runs freely; the rest of the VM has been stopped.
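
To make the safepoint idea more concrete, here is a toy sketch in Python. It is not how HotSpot implements safepoints; `SafepointCoordinator`, `poll`, `stop_the_world` and `demo` are hypothetical names made up for this illustration. Worker threads call `poll()` at places where it is safe to pause, and the coordinator waits until every worker is parked before running its action, which is the "barrier" behaviour described above.

```python
import threading
import time


class SafepointCoordinator:
    """Toy stop-the-world mechanism (illustrative only, not HotSpot's design)."""

    def __init__(self, n_workers):
        self.n_workers = n_workers
        self.cond = threading.Condition()
        self.stop_requested = False
        self.parked = 0

    def poll(self):
        """Called by workers at points where pausing is safe."""
        with self.cond:
            if not self.stop_requested:
                return
            self.parked += 1
            self.cond.notify_all()      # tell the coordinator we are parked
            while self.stop_requested:
                self.cond.wait()        # stay parked until the world resumes
            self.parked -= 1

    def stop_the_world(self, action):
        """Pause every worker, run `action`, then let the workers continue."""
        with self.cond:
            self.stop_requested = True
            while self.parked < self.n_workers:
                self.cond.wait()
            try:
                return action()         # all workers are parked here
            finally:
                self.stop_requested = False
                self.cond.notify_all()


def demo(n_workers=3):
    coord = SafepointCoordinator(n_workers)
    counters = [0] * n_workers
    done = threading.Event()

    def worker(i):
        while not done.is_set():
            counters[i] += 1            # the "application work"
            coord.poll()                # safepoint check

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()

    def snapshot_twice():
        before = list(counters)
        time.sleep(0.01)                # workers are parked, so nothing changes
        return before, list(counters)

    before, after = coord.stop_the_world(snapshot_twice)
    done.set()
    for t in threads:
        t.join()
    return before, after
```

Calling `demo()` returns two snapshots of the counters taken while the workers were paused; they are equal because no worker runs between them. Outside the pause the workers run independently, which is the difference from a single coarse lock held during normal execution.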

Also, the lack of a coarse lock in Java makes JNI much more difficult to write, as the JVM makes fewer guarantees about its environment for FFI calls. That is one of the things CPython makes fairly easy (although not as easy as using ctypes).


Answer 2

There is a comment further down in this blog post, http://www.grouplens.org/node/244, that hints at why it was so easy to dispense with a GIL for IronPython or Jython: CPython uses reference counting, whereas the other two VMs have garbage collectors.

I don’t get the exact mechanics of why this is so, but it does sound like a plausible reason.


Answer 3

In this link they have the following explanation:

… “Parts of the Interpreter aren’t threadsafe, though mostly because making them all threadsafe by massive lock usage would slow single-threaded extremely (source). This seems to be related to the CPython garbage collector using reference counting (the JVM and CLR don’t, and therefore don’t need to lock/release a reference count every time). But even if someone thought of an acceptable solution and implemented it, third party libraries would still have the same problems.”
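
As a small illustration of the reference-counting point in that quote, here is a sketch that assumes CPython, where `sys.getrefcount` exposes an object's reference count. `refcount_delta` is a hypothetical helper name. It shows that merely binding another name to an object writes to that object's count, which is the kind of shared update that would need its own locking (or atomic operations) without a GIL.

```python
import sys


def refcount_delta():
    """Return how the count changes when a reference is added, then dropped."""
    obj = object()
    before = sys.getrefcount(obj)
    alias = obj                      # a new reference: the count goes up by one
    during = sys.getrefcount(obj)
    del alias                        # reference dropped: the count goes back down
    after = sys.getrefcount(obj)
    return during - before, after - before
```

Only the differences are reported here, because the absolute numbers include temporary references (such as the argument passed to `sys.getrefcount` itself) and can vary between CPython versions.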


Answer 4

Python lacks JIT/AOT compilation, and multithreaded processors didn’t exist at the time it was written. Alternatively, you could rewrite everything in Julia, which has no GIL, and gain some speed over your Python code. Also, Jython kind of sucks: it’s slower than both CPython and Java. If you want to stick with Python, consider using parallel plugins; you won’t gain an instant speed boost, but you can do parallel programming with the right plugin.
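
The answer does not name a specific plugin, so as a stand-in here is a minimal sketch using the standard library's `concurrent.futures.ProcessPoolExecutor`, which sidesteps the GIL by running work in separate processes. `count_primes` and `parallel_count` are hypothetical names chosen for this example, and the prime-counting workload is only a placeholder for CPU-bound work.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi) by trial division."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_count(limit, workers=4):
    """Split [0, limit) into one chunk per worker process and sum the results."""
    step = limit // workers
    chunks = [
        (i * step, limit if i == workers - 1 else (i + 1) * step)
        for i in range(workers)
    ]
    # Each chunk runs in its own process, so each has its own interpreter and GIL.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, chunks))
```

When using this from a script, call `parallel_count(...)` under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Starting processes and passing data between them has a cost, which fits the answer's caveat that the speed-up is not automatic.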