Thread safety in Python dictionaries

Question: Thread safety in Python dictionaries

I have a class which holds a dictionary

class OrderBook:
    orders = {'Restaurant1': None,
              'Restaurant2': None,
              'Restaurant3': None,
              'Restaurant4': None}

    @staticmethod
    def addOrder(restaurant_name, orders):
        OrderBook.orders[restaurant_name] = orders

I am running 4 threads (one for each restaurant) that call the method OrderBook.addOrder. Here is the function run by each thread:

def addOrders(restaurant_name):
    # creates orders
    ...

    OrderBook.addOrder(restaurant_name, orders)

Is this safe, or do I have to use a lock before calling addOrder?


Answer 0

Python’s built-in structures are thread-safe for single operations, but it can sometimes be hard to see where a statement really becomes multiple operations.
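As a rough illustration of a statement that hides several operations, the sketch below disassembles a hypothetical `increment` function (not part of the question's code). The augmented assignment `counts[key] += 1` compiles to a separate read, add, and write, so another thread could in principle run between them. Exact opcode names differ between CPython versions, so treat the printed list as indicative only.

```python
import dis


def increment(counts, key):
    # Looks like one step, but is a read, an add, and a write back.
    counts[key] += 1


# Collect the bytecode instruction names for the function body.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)
```

By contrast, the plain assignment in the question (`OrderBook.orders[restaurant_name] = orders`) performs no read-modify-write on the dictionary value.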

Your code should be safe. Keep in mind: a lock here will add almost no overhead, and will give you peace of mind.
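If you do want the lock, a minimal sketch could look like the following. The `_lock` attribute and the placeholder order data are assumptions made for this example; the question does not show how orders are created.

```python
import threading


class OrderBook:
    orders = {'Restaurant1': None,
              'Restaurant2': None,
              'Restaurant3': None,
              'Restaurant4': None}
    # Hypothetical class-level lock guarding every write to `orders`.
    _lock = threading.Lock()

    @staticmethod
    def addOrder(restaurant_name, orders):
        # Only one thread at a time can execute the assignment.
        with OrderBook._lock:
            OrderBook.orders[restaurant_name] = orders


def addOrders(restaurant_name):
    # Placeholder order data, invented for illustration only.
    orders = [f"{restaurant_name}-order-{i}" for i in range(3)]
    OrderBook.addOrder(restaurant_name, orders)


# One thread per restaurant, as in the question.
threads = [threading.Thread(target=addOrders, args=(name,))
           for name in list(OrderBook.orders)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Using `with` ensures the lock is released even if the assignment raises.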

http://effbot.org/pyfaq/what-kinds-of-global-value-mutation-are-thread-safe.htm has more details.


Answer 1

Yes, in CPython the built-in types are inherently thread-safe because of the global interpreter lock (GIL): http://docs.python.org/glossary.html#term-global-interpreter-lock

According to the glossary entry, the GIL simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access.


Answer 2

Google’s style guide advises against relying on dict atomicity. This is explained in further detail in the question "Is Python variable assignment atomic?". The guide states:

Do not rely on the atomicity of built-in types.

While Python’s built-in data types such as dictionaries appear to have atomic operations, there are corner cases where they aren’t atomic (e.g. if __hash__ or __eq__ are implemented as Python methods) and their atomicity should not be relied upon. Neither should you rely on atomic variable assignment (since this in turn depends on dictionaries).
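To make the corner case in the passage above concrete, here is a small sketch that is not taken from the style guide. `TracedKey` is a hypothetical key class whose `__hash__` is written in Python; storing it in a dict runs that method, which means arbitrary Python code (and therefore a possible thread switch) sits inside what looks like a single assignment.

```python
class TracedKey:
    """Hypothetical dict key that counts how often it is hashed."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        # Ordinary Python code, executed during the dict operation.
        self.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, TracedKey) and self.name == other.name


key = TracedKey("Restaurant1")
orders = {}
orders[key] = ["soup"]  # the assignment invokes TracedKey.__hash__
print(key.hash_calls)
```

The question's code uses plain `str` keys, whose hashing is implemented in C, so this particular corner case does not apply there.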

Use the Queue module’s Queue data type as the preferred way to communicate data between threads. Otherwise, use the threading module and its locking primitives. Learn about the proper use of condition variables so you can use threading.Condition instead of using lower-level locks.
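Applied to the question, the queue-based approach might be sketched as follows. In Python 3 the module is named `queue`. The function `produce_orders`, the `results` queue, and the placeholder order data are assumptions for this example. Each worker thread only puts its result on the queue, and the main thread is the only one that touches the dictionary.

```python
import queue
import threading

# Thread-safe channel from the worker threads to the main thread.
results = queue.Queue()


def produce_orders(restaurant_name):
    # Placeholder order data, invented for illustration only.
    orders = [f"{restaurant_name}-order-{i}" for i in range(2)]
    results.put((restaurant_name, orders))


names = ['Restaurant1', 'Restaurant2', 'Restaurant3', 'Restaurant4']
threads = [threading.Thread(target=produce_orders, args=(name,))
           for name in names]
for t in threads:
    t.start()

# Only the main thread writes to the dict, so no lock is needed here.
order_book = {}
for _ in names:
    name, orders = results.get()  # blocks until a worker delivers a result
    order_book[name] = orders

for t in threads:
    t.join()
```

Because exactly one result is expected per restaurant, the main thread calls `get()` once per name rather than polling `empty()`.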

And I agree with this one: there is already the GIL in CPython, so the performance hit of using a Lock will be negligible. Much more costly will be the hours spent bug hunting in a complex codebase when those CPython implementation details change one day.