Tag archive: dictionary

Thread Safety in Python’s dictionary

Question: Thread Safety in Python’s dictionary

I have a class which holds a dictionary

class OrderBook:
    orders = {'Restaurant1': None,
              'Restaurant2': None,
              'Restaurant3': None,
              'Restaurant4': None}

    @staticmethod
    def addOrder(restaurant_name, orders):
        OrderBook.orders[restaurant_name] = orders

And I am running 4 threads (one for each restaurant) that call the method OrderBook.addOrder. Here is the function run by each thread:

def addOrders(restaurant_name):

    #creates orders
    ...

    OrderBook.addOrder(restaurant_name, orders)

Is this safe, or do I have to use a lock before calling addOrder?


Answer 0

Python’s built-in structures are thread-safe for single operations, but it can sometimes be hard to see where a statement really becomes multiple operations.

Your code should be safe. Keep in mind: a lock here will add almost no overhead, and will give you peace of mind.

http://effbot.org/pyfaq/what-kinds-of-global-value-mutation-are-thread-safe.htm has more details.
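
To illustrate the difference between a single operation and a statement that is really several operations, here is a minimal sketch. The `counts` dictionary, the `record_order` helper, and the thread and iteration counts are hypothetical names and values introduced for illustration; they are not part of the question's code. A plain assignment such as `orders[name] = value` is one dictionary operation, whereas `counts[name] += 1` is a read, an add, and a write, so the sketch guards it with a lock.

```python
import threading

# Hypothetical shared state, for illustration only.
counts = {"Restaurant1": 0}
counts_lock = threading.Lock()


def record_order(name, times):
    """Increment counts[name] `times` times, holding the lock for each update."""
    for _ in range(times):
        # counts[name] += 1 is a read, an add, and a write; the lock makes
        # the three steps behave as one unit with respect to other threads.
        with counts_lock:
            counts[name] += 1


threads = [
    threading.Thread(target=record_order, args=("Restaurant1", 10000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counts["Restaurant1"])  # 40000, because every increment ran under the lock
```

The question's `addOrder` only performs the single-assignment form, which is why the answer above considers it safe as written.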


Answer 1

Yes, built-in types are inherently thread-safe: http://docs.python.org/glossary.html#term-global-interpreter-lock

This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access.


Answer 2

Google’s style guide advises against relying on dict atomicity

Explained in further detail at: Is Python variable assignment atomic?

Do not rely on the atomicity of built-in types.

While Python’s built-in data types such as dictionaries appear to have atomic operations, there are corner cases where they aren’t atomic (e.g. if __hash__ or __eq__ are implemented as Python methods) and their atomicity should not be relied upon. Neither should you rely on atomic variable assignment (since this in turn depends on dictionaries).

Use the Queue module’s Queue data type as the preferred way to communicate data between threads. Otherwise, use the threading module and its locking primitives. Learn about the proper use of condition variables so you can use threading.Condition instead of using lower-level locks.

And I agree with this one: there is already the GIL in CPython, so the performance hit of using a Lock will be negligible. Much more costly will be the hours spent bug hunting in a complex codebase when those CPython implementation details change one day.
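
As a rough sketch of the queue-based approach the style guide recommends, the following uses Python 3's `queue.Queue` (the module was named `Queue` in Python 2). The `producer` and `collect` helpers and the sample order data are hypothetical. Only one thread ever touches the result dictionary, so no lock is needed around it.

```python
import queue
import threading

order_queue = queue.Queue()


def producer(restaurant_name):
    """Hypothetical worker: build some orders and hand them off via the queue."""
    orders = [restaurant_name + "-order-" + str(i) for i in range(3)]
    order_queue.put((restaurant_name, orders))


def collect(expected):
    """Drain `expected` items from the queue into a dict owned by one thread."""
    book = {}
    for _ in range(expected):
        name, orders = order_queue.get()  # blocks until an item is available
        book[name] = orders
    return book


names = ["Restaurant1", "Restaurant2", "Restaurant3", "Restaurant4"]
workers = [threading.Thread(target=producer, args=(n,)) for n in names]
for w in workers:
    w.start()
for w in workers:
    w.join()

order_book = collect(len(names))
print(sorted(order_book))
```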


Set attributes from dictionary in Python

Question: Set attributes from dictionary in Python

Is it possible to create an object from a dictionary in python in such a way that each key is an attribute of that object?

Something like this:

 d = { 'name': 'Oscar', 'lastName': 'Reyes', 'age':32 }

 e = Employee(d) 
 print e.name # Oscar 
 print e.age + 10 # 42 

I think it would be pretty much the inverse of this question: Python dictionary from an object’s fields


Answer 0

Sure, something like this:

class Employee(object):
    def __init__(self, initial_data):
        for key in initial_data:
            setattr(self, key, initial_data[key])

Update

As Brent Nash suggests, you can make this more flexible by allowing keyword arguments as well:

class Employee(object):
    def __init__(self, *initial_data, **kwargs):
        for dictionary in initial_data:
            for key in dictionary:
                setattr(self, key, dictionary[key])
        for key in kwargs:
            setattr(self, key, kwargs[key])

Then you can call it like this:

e = Employee({"name": "abc", "age": 32})

or like this:

e = Employee(name="abc", age=32)

or even like this:

employee_template = {"role": "minion"}
e = Employee(employee_template, name="abc", age=32)

Answer 1

Setting attributes in this way is almost certainly not the best way to solve a problem. Either:

  1. You know what all the fields should be ahead of time. In that case, you can set all the attributes explicitly. This would look like

    class Employee(object):
        def __init__(self, name, last_name, age):
            self.name = name
            self.last_name = last_name
            self.age = age
    
    d = {'name': 'Oscar', 'last_name': 'Reyes', 'age':32 }
    e = Employee(**d) 
    
    print e.name # Oscar 
    print e.age + 10 # 42 
    

    or

  2. You don’t know what all the fields should be ahead of time. In this case, you should store the data as a dict instead of polluting an object’s namespace. Attributes are for static access. This case would look like

    class Employee(object):
        def __init__(self, data):
            self.data = data
    
    d = {'name': 'Oscar', 'last_name': 'Reyes', 'age':32 }
    e = Employee(d) 
    
    print e.data['name'] # Oscar 
    print e.data['age'] + 10 # 42 
    

Another solution that is basically equivalent to case 1 is to use a collections.namedtuple. See van’s answer for how to implement that.


Answer 2

You can access the attributes of an object with __dict__, and call the update method on it:

>>> class Employee(object):
...     def __init__(self, _dict):
...         self.__dict__.update(_dict)
... 


>>> d = { 'name': 'Oscar', 'lastName': 'Reyes', 'age':32 }

>>> e = Employee(d)

>>> e.name
'Oscar'

>>> e.age
32

Answer 3

Why not just use attribute names as keys to a dictionary?

class StructMyDict(dict):

     def __getattr__(self, name):
         try:
             return self[name]
         except KeyError as e:
             raise AttributeError(e)

     def __setattr__(self, name, value):
         self[name] = value

You can initialize with named arguments, a list of tuples, or a dictionary, or individual attribute assignments, e.g.:

nautical = StructMyDict(left = "Port", right = "Starboard") # named args

nautical2 = StructMyDict({"left":"Port","right":"Starboard"}) # dictionary

nautical3 = StructMyDict([("left","Port"),("right","Starboard")]) # tuples list

nautical4 = StructMyDict()  # fields TBD
nautical4.left = "Port"
nautical4.right = "Starboard"

for x in [nautical, nautical2, nautical3, nautical4]:
    print "%s <--> %s" % (x.left,x.right)

Alternatively, instead of raising the attribute error, you can return None for unknown values. (A trick used in the web2py storage class)
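
The variant that returns None could be sketched roughly as follows. The class name `LenientStruct` is made up for this illustration and is not web2py's actual class; the sketch only mirrors the idea described above.

```python
class LenientStruct(dict):
    """Hypothetical dict subclass whose missing attributes read as None."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so dict methods such as .keys() still work as usual.
        return self.get(name)

    def __setattr__(self, name, value):
        self[name] = value


nautical = LenientStruct(left="Port")
nautical.right = "Starboard"
print(nautical.left, nautical.right, nautical.up)  # Port Starboard None
```

One trade-off of this design: a misspelled attribute name now silently yields None instead of raising AttributeError.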


Answer 4

I think the answers using setattr are the way to go if you really need to support dict.

But if the Employee object is just a structure which you can access with dot syntax (.name) instead of dict syntax (['name']), you can use namedtuple like this:

from collections import namedtuple

Employee = namedtuple('Employee', 'name age')
e = Employee('noname01', 6)
print e
#>> Employee(name='noname01', age=6)

# create Employee from dictionary
d = {'name': 'noname02', 'age': 7}
e = Employee(**d)
print e
#>> Employee(name='noname02', age=7)
print e._asdict()
#>> {'age': 7, 'name': 'noname02'}

You do have the _asdict() method to access all properties as a dictionary, but you cannot add additional attributes later, only during construction.


Answer 5

Say, for example:

class A():
    def __init__(self):
        self.x=7
        self.y=8
        self.z="name"

If you want to set the attributes all at once:

d = {'x':100,'y':300,'z':"blah"}
a = A()
a.__dict__.update(d)

Answer 6

Similar to using a dict, you could just use kwargs like so:

class Person:
   def __init__(self, **kwargs):
       self.properties = kwargs

   def get_property(self, key):
       return self.properties.get(key, None)


# main() lives at module level; nested in the class it would be treated as a method.
def main():
   timmy = Person(color = 'red')
   print(timmy.get_property('color')) #prints 'red'

Inverse dictionary lookup in Python

Question: Inverse dictionary lookup in Python

Is there any straightforward way of finding a key by knowing the value within a dictionary?

All I can think of is this:

key = [key for key, value in dict_obj.items() if value == 'value'][0]

Answer 0

There is none. Don’t forget that the value may be found on any number of keys, including 0 or more than 1.


Answer 1

Your list comprehension goes through all the dict’s items finding all the matches, then just returns the first key. This generator expression will only iterate as far as necessary to return the first match:

key = next(key for key, value in dd.items() if value == 'value')

where dd is the dict. This will raise StopIteration if no match is found, so you might want to catch that and raise a more appropriate exception like ValueError or KeyError.
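
Wrapping the generator expression in a small helper is one way to handle the no-match case. The function name `key_for_value` and the sample dictionary are hypothetical; the sketch converts StopIteration into a KeyError and assumes Python 3.

```python
def key_for_value(dd, target):
    """Return the first key in dd whose value equals target, else raise KeyError."""
    try:
        return next(key for key, value in dd.items() if value == target)
    except StopIteration:
        # "from None" hides the internal StopIteration from the traceback.
        raise KeyError("no key maps to value %r" % (target,)) from None


dd = {"a": 1, "b": 2, "c": 2}
print(key_for_value(dd, 2))  # b (dicts keep insertion order in Python 3.7+)
```

`next` also accepts a default as its second argument, for example `next(gen, None)`, if returning a sentinel is preferable to raising.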


Answer 2

There are cases where a dictionary is a one-to-one mapping.

E.g.,

d = {1: "one", 2: "two" ...}

Your approach is OK if you are only doing a single lookup. However, if you need to do more than one lookup, it will be more efficient to create an inverse dictionary:

ivd = {v: k for k, v in d.items()}

If there is a possibility of multiple keys with the same value, you will need to specify the desired behaviour in this case.
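
One possible behaviour, sketched here as an assumption rather than something the answer prescribes, is to keep every key for a given value. The helper name `invert_multi` and the sample data are hypothetical.

```python
from collections import defaultdict


def invert_multi(d):
    """Map each value in d to the list of keys that carry it."""
    inverse = defaultdict(list)
    for key, value in d.items():
        inverse[value].append(key)
    return dict(inverse)


d = {1: "one", 2: "two", 3: "one"}
print(invert_multi(d))  # {'one': [1, 3], 'two': [2]}
```

By contrast, the plain `{v: k for k, v in d.items()}` form keeps only the last key seen for each duplicated value.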

If your Python is 2.6 or older, you can use

ivd = dict((v, k) for k, v in d.items())

Answer 3

This version is 26% shorter than yours but functions identically, even for redundant/ambiguous values (returns the first match, as yours does). However, it is probably twice as slow as yours, because it creates a list from the dict twice.

key = dict_obj.keys()[dict_obj.values().index(value)]

Or if you prefer brevity over readability you can save one more character with

key = list(dict_obj)[dict_obj.values().index(value)]

And if you prefer efficiency, @PaulMcGuire’s approach is better. If there are lots of keys that share the same value, it’s more efficient not to instantiate that list of keys with a list comprehension and instead use a generator:

key = next(key for key, value in dict_obj.items() if value == 'value')

Answer 4

Since this is still very relevant, is the first Google hit, and I just spent some time figuring this out, I’ll post my (working in Python 3) solution:

testdict = {'one'   : '1',
            'two'   : '2',
            'three' : '3',
            'four'  : '4'
            }

value = '2'

[key for key in testdict.items() if key[1] == value][0][0]

Out[1]: 'two'

It will give you the first key whose value matches.


Answer 5

Maybe a dictionary-like class such as DoubleDict below is what you want? You can use any one of the provided metaclasses in conjunction with DoubleDict, or avoid using a metaclass altogether.

import functools
import threading

################################################################################

class _DDChecker(type):

    def __new__(cls, name, bases, classdict):
        for key, value in classdict.items():
            if key not in {'__new__', '__slots__', '_DoubleDict__dict_view'}:
                classdict[key] = cls._wrap(value)
        return super().__new__(cls, name, bases, classdict)

    @staticmethod
    def _wrap(function):
        @functools.wraps(function)
        def check(self, *args, **kwargs):
            value = function(self, *args, **kwargs)
            if self._DoubleDict__forward != \
               dict(map(reversed, self._DoubleDict__reverse.items())):
                raise RuntimeError('Forward & Reverse are not equivalent!')
            return value
        return check

################################################################################

class _DDAtomic(_DDChecker):

    def __new__(cls, name, bases, classdict):
        if not bases:
            classdict['__slots__'] += ('_DDAtomic__mutex',)
            classdict['__new__'] = cls._atomic_new
        return super().__new__(cls, name, bases, classdict)

    @staticmethod
    def _atomic_new(cls, iterable=(), **pairs):
        instance = object.__new__(cls, iterable, **pairs)
        instance.__mutex = threading.RLock()
        instance.clear()
        return instance

    @staticmethod
    def _wrap(function):
        @functools.wraps(function)
        def atomic(self, *args, **kwargs):
            with self.__mutex:
                return function(self, *args, **kwargs)
        return atomic

################################################################################

class _DDAtomicChecker(_DDAtomic):

    @staticmethod
    def _wrap(function):
        return _DDAtomic._wrap(_DDChecker._wrap(function))

################################################################################

class DoubleDict(metaclass=_DDAtomicChecker):

    __slots__ = '__forward', '__reverse'

    def __new__(cls, iterable=(), **pairs):
        instance = super().__new__(cls, iterable, **pairs)
        instance.clear()
        return instance

    def __init__(self, iterable=(), **pairs):
        self.update(iterable, **pairs)

    ########################################################################

    def __repr__(self):
        return repr(self.__forward)

    def __lt__(self, other):
        return self.__forward < other

    def __le__(self, other):
        return self.__forward <= other

    def __eq__(self, other):
        return self.__forward == other

    def __ne__(self, other):
        return self.__forward != other

    def __gt__(self, other):
        return self.__forward > other

    def __ge__(self, other):
        return self.__forward >= other

    def __len__(self):
        return len(self.__forward)

    def __getitem__(self, key):
        if key in self:
            return self.__forward[key]
        return self.__missing_key(key)

    def __setitem__(self, key, value):
        if self.in_values(value):
            del self[self.get_key(value)]
        self.__set_key_value(key, value)
        return value

    def __delitem__(self, key):
        self.pop(key)

    def __iter__(self):
        return iter(self.__forward)

    def __contains__(self, key):
        return key in self.__forward

    ########################################################################

    def clear(self):
        self.__forward = {}
        self.__reverse = {}

    def copy(self):
        return self.__class__(self.items())

    def del_value(self, value):
        self.pop_key(value)

    def get(self, key, default=None):
        return self[key] if key in self else default

    def get_key(self, value):
        if self.in_values(value):
            return self.__reverse[value]
        return self.__missing_value(value)

    def get_key_default(self, value, default=None):
        return self.get_key(value) if self.in_values(value) else default

    def in_values(self, value):
        return value in self.__reverse

    def items(self):
        return self.__dict_view('items', ((key, self[key]) for key in self))

    def iter_values(self):
        return iter(self.__reverse)

    def keys(self):
        return self.__dict_view('keys', self.__forward)

    def pop(self, key, *default):
        if len(default) > 1:
            raise TypeError('too many arguments')
        if key in self:
            value = self[key]
            self.__del_key_value(key, value)
            return value
        if default:
            return default[0]
        raise KeyError(key)

    def pop_key(self, value, *default):
        if len(default) > 1:
            raise TypeError('too many arguments')
        if self.in_values(value):
            key = self.get_key(value)
            self.__del_key_value(key, value)
            return key
        if default:
            return default[0]
        raise KeyError(value)

    def popitem(self):
        try:
            key = next(iter(self))
        except StopIteration:
            raise KeyError('popitem(): dictionary is empty')
        return key, self.pop(key)

    def set_key(self, value, key):
        if key in self:
            self.del_value(self[key])
        self.__set_key_value(key, value)
        return key

    def setdefault(self, key, default=None):
        if key not in self:
            self[key] = default
        return self[key]

    def setdefault_key(self, value, default=None):
        if not self.in_values(value):
            self.set_key(value, default)
        return self.get_key(value)

    def update(self, iterable=(), **pairs):
        for key, value in (((key, iterable[key]) for key in iterable.keys())
                           if hasattr(iterable, 'keys') else iterable):
            self[key] = value
        for key, value in pairs.items():
            self[key] = value

    def values(self):
        return self.__dict_view('values', self.__reverse)

    ########################################################################

    def __missing_key(self, key):
        if hasattr(self.__class__, '__missing__'):
            return self.__missing__(key)
        if not hasattr(self, 'default_factory') \
           or self.default_factory is None:
            raise KeyError(key)
        return self.__setitem__(key, self.default_factory())

    def __missing_value(self, value):
        if hasattr(self.__class__, '__missing_value__'):
            return self.__missing_value__(value)
        if not hasattr(self, 'default_key_factory') \
           or self.default_key_factory is None:
            raise KeyError(value)
        return self.set_key(value, self.default_key_factory())

    def __set_key_value(self, key, value):
        self.__forward[key] = value
        self.__reverse[value] = key

    def __del_key_value(self, key, value):
        del self.__forward[key]
        del self.__reverse[value]

    ########################################################################

    class __dict_view(frozenset):

        __slots__ = '__name'

        def __new__(cls, name, iterable=()):
            instance = super().__new__(cls, iterable)
            instance.__name = name
            return instance

        def __repr__(self):
            return 'dict_{}({})'.format(self.__name, list(self))

Answer 6

No, you cannot do this efficiently without looking at all the keys and checking all their values, so a single lookup takes O(n) time. If you need to do a lot of such lookups, you can make them efficient by constructing a reversed dictionary (which also takes O(n)) and then searching inside this reversed dictionary (each search takes O(1) on average).

Here is an example of how to construct a reversed dictionary (which will be able to do one to many mapping) from a normal dictionary:

h_reversed = {}
for i in h_normal:
    for j in h_normal[i]:
        if j not in h_reversed:
            h_reversed[j] = set([i])
        else:
            h_reversed[j].add(i)

For example if your

h_normal = {
  1: set([3]), 
  2: set([5, 7]), 
  3: set([]), 
  4: set([7]), 
  5: set([1, 4]), 
  6: set([1, 7]), 
  7: set([1]), 
  8: set([2, 5, 6])
}

your h_reversed will be

{
  1: set([5, 6, 7]),
  2: set([8]), 
  3: set([1]), 
  4: set([5]), 
  5: set([8, 2]), 
  6: set([8]), 
  7: set([2, 4, 6])
}
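
A self-contained sketch of the same construction, runnable as-is. Using collections.defaultdict is my own substitution for the explicit membership test, not part of the original answer:

```python
from collections import defaultdict

h_normal = {
    1: {3},
    2: {5, 7},
    3: set(),
    4: {7},
    5: {1, 4},
    6: {1, 7},
    7: {1},
    8: {2, 5, 6},
}

# Build the one-to-many reverse mapping in a single O(n) pass.
h_reversed = defaultdict(set)
for key, values in h_normal.items():
    for value in values:
        h_reversed[value].add(key)

# Each reverse lookup is now an average O(1) dict access.
print(h_reversed[7])
```

Note that keys whose value set is empty (3 here) leave no trace in the reversed mapping.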

Answer 7

There isn’t one as far as I know. One way to do it, however, is to create a dict for normal lookup by key and another dict for reverse lookup by value.

There’s an example of such an implementation here:

http://code.activestate.com/recipes/415903-two-dict-classes-which-can-lookup-keys-by-value-an/

This does mean that looking up the keys for a value could result in multiple results which can be returned as a simple list.


Answer 8

I know this might be considered ‘wasteful’, but in this scenario I often store the key as an additional column in the value record:

d = {'key1' : ('key1', val, val...), 'key2' : ('key2', val, val...) }

It’s a tradeoff and feels wrong, but it’s simple and works, and of course it depends on the values being tuples rather than simple values.


Answer 9

Make a reverse dictionary

reverse_dictionary = {v:k for k,v in dictionary.items()} 

If you have a lot of reverse lookups to do
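
One thing to keep in mind: this comprehension assumes the values are hashable and unique. If two keys share a value, only one of them survives in the reversed dict. A small illustration with made-up data:

```python
dictionary = {'a': 1, 'b': 2, 'c': 1}

# Values become keys; items sharing a value overwrite one another,
# so the reversed dict ends up with fewer entries than the original.
reverse_dictionary = {v: k for k, v in dictionary.items()}

print(len(dictionary), len(reverse_dictionary))
print(reverse_dictionary[2])
```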


Answer 10

Since values in a dictionary can be objects of any kind, they cannot be hashed or otherwise indexed. So finding a key by its value is unnatural for this collection type. Any query like that can only be executed in O(n) time. So if this is a frequent task, you should look into some indexing of keys like Jon suggested, or maybe even some spatial index (a DB or http://pypi.python.org/pypi/Rtree/ ).


Answer 11

I’m using dictionaries as a sort of “database”, so I need to find a key that I can reuse. For my case, if a key’s value is None, then I can take it and reuse it without having to “allocate” another id. Just figured I’d share it.

db = {0:[], 1:[], ..., 5:None, 11:None, 19:[], ...}

keys_to_reallocate = [None]
keys_to_reallocate.extend(i for i in db if db[i] is None)
free_id = keys_to_reallocate[-1]

I like this one because I don’t have to try and catch any errors such as StopIteration or IndexError. If there’s a key available, then free_id will contain one. If there isn’t, then it will simply be None. Probably not pythonic, but I really didn’t want to use a try here…
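
For comparison, a sketch of a similar lookup using next() with a default value, which also avoids try/except. The contents of db are invented for illustration, and unlike the snippet above this returns the first free key found rather than the last:

```python
db = {0: [], 1: [], 5: None, 11: None, 19: []}

# next() returns the first key whose value is None, or the default (None)
# when the generator is exhausted, so StopIteration is never raised.
free_id = next((key for key, value in db.items() if value is None), None)

print(free_id)
```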


Order of keys in a dictionary

Question: Order of keys in a dictionary

Code:

d = {'a': 0, 'b': 1, 'c': 2}
l = d.keys()

print l

This prints ['a', 'c', 'b']. I’m unsure of how the method keys() determines the order of the keywords within l. However, I’d like to be able to retrieve the keywords in the “proper” order. The proper order of course would create the list ['a', 'b', 'c'].


Answer 0

You could use OrderedDict (requires Python 2.7 or higher).

Also, note that on Python versions where plain dicts do not preserve insertion order (before 3.7, going by the language guarantee), OrderedDict({'a': 1, 'b':2, 'c':3}) won’t work, since the dict you create with {...} has already forgotten the order of the elements. Instead, you want to use OrderedDict([('a', 1), ('b', 2), ('c', 3)]).

As mentioned in the documentation, for versions lower than Python 2.7, you can use this recipe.
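
A minimal sketch of the pair-based construction, using the keys from the question:

```python
from collections import OrderedDict

# A sequence of (key, value) pairs fixes the order explicitly,
# independent of how plain dicts behave on the running interpreter.
d = OrderedDict([('a', 0), ('b', 1), ('c', 2)])

keys = list(d.keys())
print(keys)  # ['a', 'b', 'c']
```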


Answer 1

Python 3.7+

In Python 3.7.0 the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec. Therefore, you can depend on it.

Python 3.6 (CPython)

As of Python 3.6, for the CPython implementation of Python, dictionaries maintain insertion order by default. This is considered an implementation detail though; you should still use collections.OrderedDict if you want insertion ordering that’s guaranteed across other implementations of Python.

Python >=2.7 and <3.6

Use the collections.OrderedDict class when you need a dict that remembers the order of items inserted.
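
A quick way to see what insertion order means in practice on Python 3.7+ (the keys are arbitrary examples): updating an existing key keeps its position, while deleting and re-adding a key moves it to the end.

```python
d = {}
d['b'] = 1
d['a'] = 0
d['c'] = 2
first = list(d)      # keys come back in insertion order, not sorted

d['b'] = 10          # updating a value does not change the key's position
del d['a']
d['a'] = 5           # a re-inserted key goes to the end
second = list(d)

print(first, second)
```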


Answer 2

>>> print sorted(d.keys())
['a', 'b', 'c']

Use the sorted function, which sorts the iterable passed in.

The .keys() method returns the keys in an arbitrary order.


Answer 3

Just sort the list when you want to use it.

l = sorted(d.keys())
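
On Python 3, where print is a function and keys() returns a view instead of a list, the same idea can be sketched as follows (the sample dict is made up):

```python
d = {'c': 2, 'a': 0, 'b': 1}

# Iterating a dict yields its keys, so sorted(d) is enough.
keys = sorted(d)
for key in keys:
    print(key, d[key])

# sorted() also accepts a key function, e.g. to order keys by their value.
by_value_desc = sorted(d, key=d.get, reverse=True)
```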

Answer 4

From http://docs.python.org/tutorial/datastructures.html:

“The keys() method of a dictionary object returns a list of all the keys used in the dictionary, in arbitrary order (if you want it sorted, just apply the sorted() function to it).”


Answer 5

Although the order does not matter, since a dictionary is a hash map, it depends on the order in which the keys are inserted:

s = 'abbc'
a = 'cbab'

def load_dict(s):
    dict_tmp = {}
    for ch in s:
        if ch in dict_tmp.keys():
            dict_tmp[ch]+=1
        else:
            dict_tmp[ch] = 1
    return dict_tmp

dict_a = load_dict(a)
dict_s = load_dict(s)
print('for string %s, the keys are %s'%(s, dict_s.keys()))
print('for string %s, the keys are %s'%(a, dict_a.keys()))

output:
for string abbc, the keys are dict_keys(['a', 'b', 'c'])
for string cbab, the keys are dict_keys(['c', 'b', 'a'])


Python: Check if one dictionary is a subset of another larger dictionary

Question: Python: Check if one dictionary is a subset of another larger dictionary

I’m trying to write a custom filter method that takes an arbitrary number of kwargs and returns a list containing the elements of a database-like list that contain those kwargs.

For example, suppose d1 = {'a':'2', 'b':'3'} and d2 = the same thing. d1 == d2 results in True. But suppose d2 = the same thing plus a bunch of other things. My method needs to be able to tell if d1 in d2, but Python can’t do that with dictionaries.

Context:

I have a Word class, and each object has properties like word, definition, part_of_speech, and so on. I want to be able to call a filter method on the main list of these words, like Word.objects.filter(word='jump', part_of_speech='verb-intransitive'). I can’t figure out how to manage these keys and values at the same time. But this could have larger functionality outside this context for other people.


Answer 0

Convert to item pairs and check for containment.

all(item in superset.items() for item in subset.items())

Optimization is left as an exercise for the reader.
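
One way to approach that exercise, as a sketch: on Python 2, superset.items() builds a list, so each membership test is a linear scan, whereas looking the key up directly keeps every check at average O(1). The names is_subset_dict and _MISSING are hypothetical.

```python
_MISSING = object()  # sentinel, so that stored None values compare correctly


def is_subset_dict(subset, superset):
    """Return True if every (key, value) pair of subset is also in superset."""
    return all(superset.get(key, _MISSING) == value
               for key, value in subset.items())


print(is_subset_dict({'a': '2'}, {'a': '2', 'b': '3'}))
```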


Answer 1

In Python 3, you can use dict.items() to get a set-like view of the dict items. You can then use the <= operator to test if one view is a “subset” of the other:

d1.items() <= d2.items()

In Python 2.7, use the dict.viewitems() to do the same:

d1.viewitems() <= d2.viewitems()

In Python 2.6 and below you will need a different solution, such as using all():

all(key in d2 and d2[key] == d1[key] for key in d1)
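
A short Python 3 illustration of the view comparison, using the dicts from the question plus an invented mismatch case:

```python
d1 = {'a': '2', 'b': '3'}
d2 = {'a': '2', 'b': '3', 'c': '4'}

forward = d1.items() <= d2.items()       # every pair of d1 is in d2
backward = d2.items() <= d1.items()      # d2 has an extra pair
keys_only = d1.keys() <= d2.keys()       # same test, ignoring values
mismatch = {'a': 'x'}.items() <= d2.items()  # key present, value differs

print(forward, backward, keys_only, mismatch)
```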

Answer 2

Note for people that need this for unit testing: there’s also an assertDictContainsSubset() method in Python’s TestCase class.

http://docs.python.org/2/library/unittest.html?highlight=assertdictcontainssubset#unittest.TestCase.assertDictContainsSubset

It is, however, deprecated since 3.2; I’m not sure why, maybe there’s a replacement for it.
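
Because the method is deprecated, a substitute built from assertEqual may be preferable. The following is a sketch of one common idiom (Python 3.5+), not an official replacement; the class name and sample dicts are made up:

```python
import unittest


class SubsetTest(unittest.TestCase):
    def test_subset(self):
        actual = {'a': '2', 'b': '3', 'c': '4'}
        expected = {'a': '2', 'b': '3'}
        # Overlaying expected on actual changes nothing
        # exactly when expected is a subset of actual.
        self.assertEqual(actual, {**actual, **expected})


# Run the test case programmatically so the snippet works when pasted anywhere.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SubsetTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```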


Answer 3

To check both keys and values (this requires the values to be hashable), use: set(d1.items()).issubset(set(d2.items()))

If you need to check only the keys: set(d1).issubset(set(d2))


Answer 4

For completeness, you can also do this:

def is_subdict(small, big):
    return dict(big, **small) == big

However, I make no claims whatsoever concerning speed (or lack thereof) or readability (or lack thereof).
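
One caveat worth knowing: in Python 3, dict(big, **small) passes small as keyword arguments, so it raises TypeError when small has non-string keys. A sketch of a variant using dict unpacking (Python 3.5+), which accepts any hashable keys:

```python
def is_subdict(small, big):
    # Overlaying small on big leaves big unchanged only if every
    # (key, value) pair of small is already present in big.
    return {**big, **small} == big


print(is_subdict({1: 2}, {1: 2, 3: 4}))
print(is_subdict({1: 5}, {1: 2, 3: 4}))
```

Like the original, this builds a full copy of big on every call, so it trades speed for brevity.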


Answer 5

>>> d1 = {'a':'2', 'b':'3'}
>>> d2 = {'a':'2', 'b':'3','c':'4'}
>>> all((k in d2 and d2[k]==v) for k,v in d1.iteritems())
True

context:

>>> d1 = {'a':'2', 'b':'3'}
>>> d2 = {'a':'2', 'b':'3','c':'4'}
>>> list(d1.iteritems())
[('a', '2'), ('b', '3')]
>>> [(k,v) for k,v in d1.iteritems()]
[('a', '2'), ('b', '3')]
>>> k,v = ('a','2')
>>> k
'a'
>>> v
'2'
>>> k in d2
True
>>> d2[k]
'2'
>>> k in d2 and d2[k]==v
True
>>> [(k in d2 and d2[k]==v) for k,v in d1.iteritems()]
[True, True]
>>> ((k in d2 and d2[k]==v) for k,v in d1.iteritems())
<generator object <genexpr> at 0x02A9D2B0>
>>> ((k in d2 and d2[k]==v) for k,v in d1.iteritems()).next()
True
>>> all((k in d2 and d2[k]==v) for k,v in d1.iteritems())
True
>>>

Answer 6

My function for the same purpose, doing this recursively:

def dictMatch(patn, real):
    """does real dict match pattern?"""
    try:
        for pkey, pvalue in patn.iteritems():
            if type(pvalue) is dict:
                result = dictMatch(pvalue, real[pkey])
                assert result
            else:
                assert real[pkey] == pvalue
                result = True
    except (AssertionError, KeyError):
        result = False
    return result

In your example, dictMatch(d1, d2) should return True even if d2 has other stuff in it, plus it applies also to lower levels:

d1 = {'a':'2', 'b':{3: 'iii'}}
d2 = {'a':'2', 'b':{3: 'iii', 4: 'iv'},'c':'4'}

dictMatch(d1, d2)   # True

Notes: There could be an even better solution which avoids the if type(pvalue) is dict clause and applies to an even wider range of cases (like lists of hashes etc.). Also, recursion depth is not limited here, so use at your own risk. ;)
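
Two things to watch in the function above: assert statements are stripped when Python runs with -O, and result is never assigned when patn is empty. A sketch of a Python 3 variant without assertions follows; dict_match is a hypothetical rename, and isinstance (which also accepts dict subclasses) replaces the exact type check:

```python
def dict_match(patn, real):
    """Return True if every key/value in patn is matched by real, recursively."""
    for pkey, pvalue in patn.items():
        if pkey not in real:
            return False
        if isinstance(pvalue, dict):
            # A nested pattern can only match a nested dict.
            if not isinstance(real[pkey], dict) or not dict_match(pvalue, real[pkey]):
                return False
        elif real[pkey] != pvalue:
            return False
    return True  # also covers the empty pattern


d1 = {'a': '2', 'b': {3: 'iii'}}
d2 = {'a': '2', 'b': {3: 'iii', 4: 'iv'}, 'c': '4'}
print(dict_match(d1, d2))
```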


Answer 7

Here is a solution that also properly recurses into lists and sets contained within the dictionary. You can also use this for lists containing dicts etc…

def is_subset(subset, superset):
    if isinstance(subset, dict):
        return all(key in superset and is_subset(val, superset[key]) for key, val in subset.items())

    if isinstance(subset, list) or isinstance(subset, set):
        return all(any(is_subset(subitem, superitem) for superitem in superset) for subitem in subset)

    # assume that subset is a plain value if none of the above match
    return subset == superset

Answer 8

This seemingly straightforward issue cost me a couple of hours of research to find a 100% reliable solution, so I documented what I found in this answer.

  1. “Pythonic-ally” speaking, small_dict <= big_dict would be the most intuitive way, but too bad that it won’t work. {'a': 1} < {'a': 1, 'b': 2} seemingly works in Python 2, but it is not reliable because the official documentation explicitly calls it out. Go search “Outcomes other than equality are resolved consistently, but are not otherwise defined.” in this section. Not to mention, comparing 2 dicts with < or <= in Python 3 results in a TypeError exception.

  2. The second most-intuitive thing is small.viewitems() <= big.viewitems() for Python 2.7 only, and small.items() <= big.items() for Python 3. But there is one caveat: it is potentially buggy. If your program could potentially be used on Python <=2.6, its d1.items() <= d2.items() are actually comparing 2 lists of tuples, without particular order, so the final result will be unreliable and it becomes a nasty bug in your program. I am not keen to write yet another implementation for Python<=2.6, but I still don’t feel comfortable that my code comes with a known bug (even if it is on an unsupported platform). So I abandon this approach.

  3. I settled on @blubberdiblub’s answer (credit goes to him):

    def is_subdict(small, big): return dict(big, **small) == big

    It is worth pointing out that, this answer relies on the == behavior between dicts, which is clearly defined in official document, hence should work in every Python version. Go search:

    • “Dictionaries compare equal if and only if they have the same (key, value) pairs.” is the last sentence in this page
    • “Mappings (instances of dict) compare equal if and only if they have equal (key, value) pairs. Equality comparison of the keys and elements enforces reflexivity.” in this page

Answer 9

Here’s a general recursive solution for the problem given:

import traceback
import unittest

def is_subset(superset, subset):
    for key, value in subset.items():
        if key not in superset:
            return False

        if isinstance(value, dict):
            if not is_subset(superset[key], value):
                return False

        elif isinstance(value, str):
            if value not in superset[key]:
                return False

        elif isinstance(value, list):
            if not set(value) <= set(superset[key]):
                return False
        elif isinstance(value, set):
            if not value <= superset[key]:
                return False

        else:
            if not value == superset[key]:
                return False

    return True


class Foo(unittest.TestCase):

    def setUp(self):
        self.dct = {
            'a': 'hello world',
            'b': 12345,
            'c': 1.2345,
            'd': [1, 2, 3, 4, 5],
            'e': {1, 2, 3, 4, 5},
            'f': {
                'a': 'hello world',
                'b': 12345,
                'c': 1.2345,
                'd': [1, 2, 3, 4, 5],
                'e': {1, 2, 3, 4, 5},
                'g': False,
                'h': None
            },
            'g': False,
            'h': None,
            'question': 'mcve',
            'metadata': {}
        }

    def tearDown(self):
        pass

    def check_true(self, superset, subset):
        return self.assertEqual(is_subset(superset, subset), True)

    def check_false(self, superset, subset):
        return self.assertEqual(is_subset(superset, subset), False)

    def test_simple_cases(self):
        self.check_true(self.dct, {'a': 'hello world'})
        self.check_true(self.dct, {'b': 12345})
        self.check_true(self.dct, {'c': 1.2345})
        self.check_true(self.dct, {'d': [1, 2, 3, 4, 5]})
        self.check_true(self.dct, {'e': {1, 2, 3, 4, 5}})
        self.check_true(self.dct, {'f': {
            'a': 'hello world',
            'b': 12345,
            'c': 1.2345,
            'd': [1, 2, 3, 4, 5],
            'e': {1, 2, 3, 4, 5},
        }})
        self.check_true(self.dct, {'g': False})
        self.check_true(self.dct, {'h': None})

    def test_tricky_cases(self):
        self.check_true(self.dct, {'a': 'hello'})
        self.check_true(self.dct, {'d': [1, 2, 3]})
        self.check_true(self.dct, {'e': {3, 4}})
        self.check_true(self.dct, {'f': {
            'a': 'hello world',
            'h': None
        }})
        self.check_false(
            self.dct, {'question': 'mcve', 'metadata': {'author': 'BPL'}})
        self.check_true(
            self.dct, {'question': 'mcve', 'metadata': {}})
        self.check_false(
            self.dct, {'question1': 'mcve', 'metadata': {}})

if __name__ == "__main__":
    unittest.main()

NOTE: The original code would fail in certain cases; credit for the fix goes to @olivier-melançon.


Answer 10

If you don’t mind using pydash, it has is_match, which does exactly that:

import pydash

a = {1:2, 3:4, 5:{6:7}}
b = {3:4.0, 5:{6:8}}
c = {3:4.0, 5:{6:7}}

pydash.predicates.is_match(a, b) # False
pydash.predicates.is_match(a, c) # True

Answer 11

I know this question is old, but here is my solution for checking if one nested dictionary is a part of another nested dictionary. The solution is recursive.

def compare_dicts(a, b):
    for key, value in a.items():
        if key not in b:
            return False
        if isinstance(value, dict):
            # A nested dict can only match another dict.
            if not isinstance(b[key], dict) or not compare_dicts(value, b[key]):
                return False
        elif value != b[key]:
            return False
    return True

Answer 12

This function works for non-hashable values. I also think that it is clear and easy to read.

def isSubDict(subDict,dictionary):
    for key in subDict.keys():
        if (not key in dictionary) or (not subDict[key] == dictionary[key]):
            return False
    return True

In [126]: isSubDict({1:2},{3:4})
Out[126]: False

In [127]: isSubDict({1:2},{1:2,3:4})
Out[127]: True

In [128]: isSubDict({1:{2:3}},{1:{2:3},3:4})
Out[128]: True

In [129]: isSubDict({1:{2:3}},{1:{2:4},3:4})
Out[129]: False

Answer 13

A short recursive implementation that works for nested dictionaries:

def compare_dicts(a,b):
    if not a: return True
    if isinstance(a, dict):
        key, val = a.popitem()
        return isinstance(b, dict) and key in b and compare_dicts(val, b.pop(key)) and compare_dicts(a, b)
    return a == b

This will consume the a and b dicts. If anyone knows of a good way to avoid that without resorting to partially iterative solutions as in other answers, please tell me. I would need a way to split a dict into head and tail based on a key.

This code is more useful as a programming exercise, and is probably a lot slower than the other solutions here that mix recursion and iteration. @Nutcracker’s solution is pretty good for nested dictionaries.
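
One way to avoid consuming the inputs, sketched under the assumption that a generator expression inside the recursion is acceptable (is_nested_subdict is a hypothetical name). It also compares falsy leaf values such as 0 or '' by equality, which an early "if not a: return True" accepts unconditionally:

```python
def is_nested_subdict(a, b):
    """Return True if a is recursively contained in b; neither dict is modified."""
    if isinstance(a, dict):
        return isinstance(b, dict) and all(
            key in b and is_nested_subdict(val, b[key])
            for key, val in a.items())
    return a == b


a = {'x': {'y': 1}}
b = {'x': {'y': 1, 'z': 2}, 'w': 3}
print(is_nested_subdict(a, b))
```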


Why can’t I use a list as a dict key in Python?

Question: Why can’t I use a list as a dict key in Python?

I’m a bit confused about what can/can’t be used as a key for a python dict.

dicked = {}
dicked[None] = 'foo'     # None ok
dicked[(1,3)] = 'baz'    # tuple ok
import sys
dicked[sys] = 'bar'      # wow, even a module is ok !
dicked[(1,[3])] = 'qux'  # oops, not allowed

So a tuple is an immutable type but if I hide a list inside of it, then it can’t be a key.. couldn’t I just as easily hide a list inside a module?

I had some vague idea that the key has to be “hashable”, but I’m just going to admit my own ignorance about the technical details; I don’t know what’s really going on here. What would go wrong if you tried to use lists as keys, with the hash as, say, their memory location?


Answer 0

There’s a good article on the topic in the Python wiki: Why Lists Can’t Be Dictionary Keys. As explained there:

What would go wrong if you tried to use lists as keys, with the hash as, say, their memory location?

It can be done without really breaking any of the requirements, but it leads to unexpected behavior. Lists are generally treated as if their value was derived from their content’s values, for instance when checking (in-)equality. Many would, understandably, expect that you can use any list [1, 2] to get the same key, whereas you’d have to keep around exactly the same list object. But lookup by value breaks as soon as a list used as a key is modified, and lookup by identity requires you to keep around exactly the same list, which isn’t required for any other common list operation (at least none I can think of).

Other objects such as modules and object make a much bigger deal out of their object identity anyway (when was the last time you had two distinct module objects called sys?), and are compared by that anyway. Therefore, it’s less surprising – or even expected – that they, when used as dict keys, compare by identity in that case as well.
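
A small sketch that makes the distinction concrete: anything for which hash() succeeds can be a dict key. The helper name is_hashable is made up for illustration.

```python
import sys


def is_hashable(obj):
    """Return True if hash(obj) succeeds, i.e. obj can be used as a dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable((1, 3)))    # tuple of hashable items
print(is_hashable([1, 3]))    # list: no usable __hash__
print(is_hashable((1, [3])))  # a tuple's hash is computed from its items
print(is_hashable(sys))       # modules hash by identity
```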


Answer 1

Why can’t I use a list as a dict key in python?

>>> d = {repr([1,2,3]): 'value'}
>>> d
{'[1, 2, 3]': 'value'}

(for anybody who stumbles on this question looking for a way around it)

As explained by others here, indeed you cannot. You can, however, use its string representation instead if you really want to use your list.


Answer 2

Just found you can change List into tuple, then use it as keys.

d = {tuple([1,2,3]): 'value'}
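
Note that tuple() converts only the outer list, so a nested list would still make the key unhashable. A sketch of a recursive conversion; freeze is a hypothetical helper name:

```python
def freeze(obj):
    """Recursively convert lists into tuples so the result is hashable."""
    if isinstance(obj, list):
        return tuple(freeze(item) for item in obj)
    return obj


key = freeze([1, [2, 3]])
d = {key: 'value'}
print(d[(1, (2, 3))])
```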

Answer 3

The issue is that tuples are immutable, and lists are not. Consider the following

d = {}
li = [1,2,3]
d[li] = 5
li.append(4)

What should d[li] return? Is it the same list? How about d[[1,2,3]]? It has the same values, but is a different list?

Ultimately, there is no satisfactory answer. For example, if the only key that works is the original key, then if you have no reference to that key, you can never again access the value. With every other allowed key, you can construct a key without a reference to the original.

If both of my suggestions work, then you have very different keys that return the same value, which is more than a little surprising. If only the original contents work, then your key will quickly go bad, since lists are made to be modified.


Answer 4

Here’s an answer http://wiki.python.org/moin/DictionaryKeys

What would go wrong if you tried to use lists as keys, with the hash as, say, their memory location?

Looking up different lists with the same contents would produce different results, even though comparing lists with the same contents would indicate them as equivalent.

What about Using a list literal in a dictionary lookup?


Answer 5

Your answer can be found here:

Why Lists Can’t Be Dictionary Keys

Newcomers to Python often wonder why, while the language includes both a tuple and a list type, tuples are usable as dictionary keys, while lists are not. This was a deliberate design decision, and can best be explained by first understanding how Python dictionaries work.

Source & more info: http://wiki.python.org/moin/DictionaryKeys


Answer 6

Because lists are mutable, dict keys (and set members) need to be hashable, and hashing mutable objects is a bad idea because hash values should be computed on the basis of instance attributes.

In this answer, I will give some concrete examples, hopefully adding value on top of the existing answers. Every insight applies to the elements of the set datastructure as well.

Example 1: hashing a mutable object where the hash value is based on a mutable characteristic of the object.

>>> class stupidlist(list):
...     def __hash__(self):
...         return len(self)
... 
>>> stupid = stupidlist([1, 2, 3])
>>> d = {stupid: 0}
>>> stupid.append(4)
>>> stupid
[1, 2, 3, 4]
>>> d
{[1, 2, 3, 4]: 0}
>>> stupid in d
False
>>> stupid in d.keys()
False
>>> stupid in list(d.keys())
True

After mutating stupid, it cannot be found in the dict any longer because the hash changed. Only a linear scan over the list of the dict’s keys finds stupid.

Example 2: … but why not just a constant hash value?

>>> class stupidlist2(list):
...     def __hash__(self):
...         return id(self)
... 
>>> stupidA = stupidlist2([1, 2, 3])
>>> stupidB = stupidlist2([1, 2, 3])
>>> 
>>> stupidA == stupidB
True
>>> stupidA in {stupidB: 0}
False

That’s not a good idea either, because equal objects should hash identically so that you can find them in a dict or set.

Example 3: … ok, what about constant hashes across all instances?!

>>> class stupidlist3(list):
...     def __hash__(self):
...         return 1
... 
>>> stupidC = stupidlist3([1, 2, 3])
>>> stupidD = stupidlist3([1, 2, 3])
>>> stupidE = stupidlist3([1, 2, 3, 4])
>>> 
>>> stupidC in {stupidD: 0}
True
>>> stupidC in {stupidE: 0}
False
>>> d = {stupidC: 0}
>>> stupidC.append(5)
>>> stupidC in d
True

Things seem to work as expected, but think about what’s happening: when all instances of your class produce the same hash value, you will have a hash collision whenever there is more than one instance as a key in a dict or present in a set.

Finding the right instance with my_dict[key] or key in my_dict (or item in my_set) needs to perform as many equality checks as there are instances of stupidlist3 in the dict’s keys (in the worst case). At this point, the purpose of the dictionary – O(1) lookup – is completely defeated. This is demonstrated in the following timings (done with IPython).

Some Timings for Example 3

>>> lists_list = [[i]  for i in range(1000)]
>>> stupidlists_set = {stupidlist3([i]) for i in range(1000)}
>>> tuples_set = {(i,) for i in range(1000)}
>>> l = [999]
>>> s = stupidlist3([999])
>>> t = (999,)
>>> 
>>> %timeit l in lists_list
25.5 µs ± 442 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit s in stupidlists_set
38.5 µs ± 61.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit t in tuples_set
77.6 ns ± 1.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

As you can see, the membership test in our stupidlists_set is even slower than a linear scan over the whole lists_list, while you have the expected super fast lookup time (factor 500) in a set without loads of hash collisions.
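
The cost of those collisions can also be made visible without timings. The following is a rough sketch, assuming a hypothetical class `ConstantHash` (not part of the original answer) that counts how often `__eq__` runs during a single failed membership test:

```python
class ConstantHash:
    """Illustrative class whose instances all share one hash value."""
    eq_calls = 0  # counts how often __eq__ runs

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # every instance collides with every other instance

    def __eq__(self, other):
        ConstantHash.eq_calls += 1
        return isinstance(other, ConstantHash) and self.value == other.value


items = {ConstantHash(i) for i in range(100)}

# Ignore the comparisons that were made while building the set.
ConstantHash.eq_calls = 0

# A miss has to be compared with every stored item before giving up.
missing = ConstantHash(-1) in items
checks_for_miss = ConstantHash.eq_calls
```

With 100 colliding items, the failed lookup performs at least 100 equality checks, which is the linear behaviour described above rather than the usual constant-time lookup.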


TL;DR: you can use tuple(yourlist) as dict keys, because tuples are immutable and hashable.


Answer 7

The simple answer to your question is that the class list does not implement the method __hash__, which is required for any object that is to be used as a key in a dictionary. The reason __hash__ is not implemented the same way it is in, say, the tuple class (based on the content of the container) is that a list is mutable: editing the list would require the hash to be recalculated, which may mean the list is now located in the wrong bucket within the underlying hash table. Since you cannot modify a tuple (it is immutable), it doesn’t run into this problem.
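
The difference between the two types can be checked directly. This is a small sketch, with variable names chosen for illustration only:

```python
# list sets __hash__ to None, which is how Python marks a type as unhashable.
list_hash = list.__hash__

# A tuple's hash is derived from its contents, so equal tuples hash equally.
same_hash = hash((1, 2, 3)) == hash((1, 2, 3))

# Calling hash() on a list fails with TypeError.
try:
    hash([1, 2, 3])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
```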

As a side note, the actual implementation of the dictobject lookup is based on Algorithm D from Knuth Vol. 3, Sec. 6.4. If you have that book available, it might be a worthwhile read. In addition, if you’re really, really interested, you may like to take a peek at the developer comments on the actual implementation of dictobject, which go into great detail about exactly how it works. There is also a Python lecture on the implementation of dictionaries that you may be interested in; it goes through the definition of a key and what a hash is in the first few minutes.


Answer 8

According to the Python 2.7.2 documentation:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() or __cmp__() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().

A tuple is immutable in the sense that you cannot add, remove or replace its elements, but the elements themselves may be mutable. A list’s hash value would have to depend on the hash values of its elements, and so it would change whenever you change the elements.

Using id’s for list hashes would imply that all lists compare differently, which would be surprising and inconvenient.
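
A short sketch of the two points quoted above, using a hypothetical class `Plain` that is not part of the original answer: instances of user-defined classes are hashable by default but compare by identity, and a tuple is only hashable if all of its elements are.

```python
class Plain:
    """User-defined class relying on the default hashing and equality."""

    def __init__(self, value):
        self.value = value


a, b = Plain(1), Plain(1)
default_equal = (a == b)          # default equality is identity, so this is False
usable_as_key = {a: 'first'}[a]   # instances are hashable by default

# A tuple holding only immutable elements can be hashed ...
plain_tuple_hash = hash((1, 2, 3))

# ... but a tuple holding a list cannot, because its hash depends on its elements.
try:
    hash((1, [2, 3]))
    nested_hashable = True
except TypeError:
    nested_hashable = False
```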


Answer 9

A dictionary is a hash map: it stores a mapping from the hash of each key to the corresponding value.

Something like (pseudocode):

{key : val}  
hash(key) = val

If you are wondering what can be used as a key for your dictionary:

Anything that is hashable (that is, it can be converted to a hash and holds a static value, i.e. it is immutable, so that the hashed key stays valid as described above) is eligible. List and set objects can change at any time, so hash(key) would also have to change to stay in sync with your list or set.

You can try:

hash(<your key here>)

If it works, the object can be used as a key for your dictionary; otherwise, convert it to something hashable.


In short:

  1. Convert that list to tuple(<your list>).
  2. Convert that list to str(<your list>).
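
The two options above can be sketched as follows. The helper `to_key` is hypothetical and only illustrates the "try hash(), otherwise convert" idea from this answer:

```python
def to_key(obj):
    """Return obj if it is hashable, otherwise a tuple built from it (illustrative helper)."""
    try:
        hash(obj)
        return obj
    except TypeError:
        return tuple(obj)


cache = {}
cache[to_key([1, 2, 3])] = 'from a list'   # stored under the tuple (1, 2, 3)
cache[to_key('abc')] = 'already hashable'  # strings are used unchanged
cache[str([1, 2, 3])] = 'string form'      # stored under the string '[1, 2, 3]'
```

Note that the tuple and string forms are different keys, and that this simple helper does not handle nested lists; a tuple that still contains a list remains unhashable.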

Answer 10

dict keys need to be hashable. Lists are mutable, and they do not provide a valid __hash__ method.


Should I use a class or a dictionary?

Question: Should I use a class or a dictionary?

I have a class that contains only fields and no methods, like this:

class Request(object):

    def __init__(self, environ):
        self.environ = environ
        self.request_method = environ.get('REQUEST_METHOD', None)
        self.url_scheme = environ.get('wsgi.url_scheme', None)
        self.request_uri = wsgiref.util.request_uri(environ)
        self.path = environ.get('PATH_INFO', None)
        # ...

This could easily be translated to a dict. The class is more flexible for future additions and could be fast with __slots__. So would there be a benefit of using a dict instead? Would a dict be faster than a class? And faster than a class with slots?


Answer 0

Why would you make this a dictionary? What’s the advantage? What happens if you later want to add some code? Where would your __init__ code go?

Classes are for bundling related data (and usually code).

Dictionaries are for storing key-value relationships, where usually the keys are all of the same type, and all the values are also of one type. Occasionally they can be useful for bundling data when the key/attribute names are not all known up front, but often this is a sign that something’s wrong with your design.

Keep this a class.


Answer 1

Use a dictionary unless you need the extra mechanism of a class. You could also use a namedtuple for a hybrid approach:

>>> from collections import namedtuple
>>> Request = namedtuple("Request", "environ request_method url_scheme")
>>> Request
<class '__main__.Request'>
>>> request = Request(environ="foo", request_method="GET", url_scheme="http")
>>> request.environ
'foo'

Performance differences here will be minimal, although I would be surprised if the dictionary wasn’t faster.


Answer 2

A class in Python is a dict underneath. You do get some overhead with the class behavior, but you won’t be able to notice it without a profiler. In this case, I believe you benefit from the class because:

  • All your logic lives in a single function
  • It is easy to update and stays encapsulated
  • If you change anything later, you can easily keep the interface the same
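
To illustrate the "dict underneath" remark: for a regular class without __slots__, instance attributes are stored in a per-instance dictionary. A minimal sketch, using a simplified stand-in `Request` class rather than the one from the question:

```python
class Request:
    """Simplified stand-in for the class in the question."""

    def __init__(self, method):
        self.method = method


request = Request('GET')

# vars() returns the instance's underlying dict (the same object as request.__dict__).
attributes = vars(request)

# Writing to that dict creates an attribute on the instance.
attributes['path'] = '/index'
path = request.path
```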

Answer 3

I think that the usage of each one is way too subjective for me to get in on that, so I’ll just stick to numbers.

I compared the time it takes to create and to change a variable in a dict, a new-style class and a new-style class with slots.

Here’s the code I used to test it (it’s a bit messy, but it does the job).

import timeit

class Foo(object):

    def __init__(self):

        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

def create_dict():

    foo_dict = {}
    foo_dict['foo1'] = 'test'
    foo_dict['foo2'] = 'test'
    foo_dict['foo3'] = 'test'

    return foo_dict

class Bar(object):
    __slots__ = ['foo1', 'foo2', 'foo3']

    def __init__(self):

        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

tmit = timeit.timeit

print 'Creating...\n'
print 'Dict: ' + str(tmit('create_dict()', 'from __main__ import create_dict'))
print 'Class: ' + str(tmit('Foo()', 'from __main__ import Foo'))
print 'Class with slots: ' + str(tmit('Bar()', 'from __main__ import Bar'))

print '\nChanging a variable...\n'

print 'Dict: ' + str((tmit('create_dict()[\'foo3\'] = "Changed"', 'from __main__ import create_dict') - tmit('create_dict()', 'from __main__ import create_dict')))
print 'Class: ' + str((tmit('Foo().foo3 = "Changed"', 'from __main__ import Foo') - tmit('Foo()', 'from __main__ import Foo')))
print 'Class with slots: ' + str((tmit('Bar().foo3 = "Changed"', 'from __main__ import Bar') - tmit('Bar()', 'from __main__ import Bar')))

And here is the output…

Creating…

Dict: 0.817466186345
Class: 1.60829183597
Class with slots: 1.28776730003

Changing a variable…

Dict: 0.0735140918748
Class: 0.111714198313
Class with slots: 0.10618612142

So, if you’re just storing variables, you need speed, and it won’t require you to do many calculations, I recommend using a dict (you could always just make a function that looks like a method). But if you really need classes, remember: always use __slots__.

Note:

I tested the ‘Class’ with both new-style and old-style classes. It turns out that old-style classes are faster to create but slower to modify (not by much, but it is significant if you’re creating lots of classes in a tight loop (tip: you’re doing it wrong)).

Also the times for creating and changing variables may differ on your computer since mine is old and slow. Make sure you test it yourself to see the ‘real’ results.
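
The script above uses Python 2 print statements. For rerunning the creation part of the comparison on Python 3, a sketch along the following lines could be used. The helper `benchmark` and the default iteration count are assumptions made for illustration, and the numbers it prints will differ from those quoted above:

```python
import timeit


class Foo:
    """Regular class: attributes live in a per-instance __dict__."""

    def __init__(self):
        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'


class Bar:
    """Same fields, but stored in slots instead of a __dict__."""
    __slots__ = ('foo1', 'foo2', 'foo3')

    def __init__(self):
        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'


def create_dict():
    return {'foo1': 'test', 'foo2': 'test', 'foo3': 'test'}


def benchmark(number=100_000):
    """Time `number` creations of each container and return seconds per candidate."""
    candidates = {'dict': create_dict, 'class': Foo, 'class with slots': Bar}
    return {name: timeit.timeit(factory, number=number)
            for name, factory in candidates.items()}


timings = benchmark()
for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f} s')
```

Passing the callables directly to timeit.timeit avoids the `from __main__ import ...` setup strings of the original script.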

Edit:

I later tested the namedtuple: I can’t modify it, but creating the 10000 samples (or something like that) took 1.4 seconds, so the dictionary is indeed the fastest.

If I change the dict function to include the keys and values in a literal and to return the dict directly, instead of first assigning it to a variable, it gives me 0.65 instead of 0.8 seconds.

class Foo(dict):
    pass

Creating it is about as fast as creating a class with slots, and changing the variable is the slowest (0.17 seconds), so do not use these classes. Go for a dict (speed) or for a class derived from object (‘syntax candy’).


Answer 4

I agree with @adw. I would never represent an “object” (in an OO sense) with a dictionary. Dictionaries aggregate name/value pairs. Classes represent objects. I’ve seen code where the objects are represented with dictionaries and it’s unclear what the actual shape of the thing is. What happens when certain name/values aren’t there? What restricts the client from putting anything at all in, or from trying to get anything at all out? The shape of the thing should always be clearly defined.

When using Python it is important to build with discipline as the language allows many ways for the author to shoot him/herself in the foot.


Answer 5

I would recommend a class, as it is all sorts of information involved with a request. Were one to use a dictionary, I’d expect the data stored to be far more similar in nature. A guideline I tend to follow myself is that if I may want to loop over the entire set of key->value pairs and do something, I use a dictionary. Otherwise, the data apparently has far more structure than a basic key->value mapping, meaning a class would likely be a better alternative.

Hence, stick with the class.


Answer 6

If all that you want to achieve is syntax candy like obj.bla = 5 instead of obj['bla'] = 5, especially if you have to repeat that a lot, you may want to use some plain container class as in martineau’s suggestion. Nevertheless, the code there is quite bloated and unnecessarily slow. You can keep it as simple as this:

class AttrDict(dict):
    """ Syntax candy """
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

Another reason to switch to namedtuples or a class with __slots__ could be memory usage. Dicts require significantly more memory than list types, so this could be a point to think about.
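
The memory argument rests on the fact that instances of a class with __slots__ do not carry a per-instance dictionary. A small sketch, with hypothetical class names `WithDict` and `WithSlots`:

```python
class WithDict:
    """Regular class: every instance carries its own __dict__."""

    def __init__(self):
        self.a = 1


class WithSlots:
    """Slotted class: only the declared attribute names exist."""
    __slots__ = ('a',)

    def __init__(self):
        self.a = 1


regular_has_dict = hasattr(WithDict(), '__dict__')
slotted_has_dict = hasattr(WithSlots(), '__dict__')

# Without a __dict__, undeclared attributes cannot be added.
try:
    WithSlots().b = 2
    extra_allowed = True
except AttributeError:
    extra_allowed = False
```

The trade-off is visible in the last step: the slotted class saves the dictionary but gives up the ability to add attributes that were not declared.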

Anyway, in your specific case, there doesn’t seem to be any motivation to switch away from your current implementation. You don’t seem to maintain millions of these objects, so no list-derived types are required. And it actually contains some functional logic within __init__, so you also shouldn’t go with AttrDict.


Answer 7

It may be possible to have your cake and eat it, too. In other words, you can create something that provides the functionality of both a class and a dictionary instance. See ActiveState’s “Dictionary with attribute-style access” recipe and its comments for ways of doing that.

If you decide to use a regular class rather than a subclass, I’ve found the “The simple but handy ‘collector of a bunch of named stuff’ class” recipe (by Alex Martelli) to be very flexible and useful for the sort of thing it looks like you’re doing (i.e. creating a relatively simple aggregator of information). Since it’s a class, you can easily extend its functionality further by adding methods.

Lastly, it should be noted that the names of class members must be legal Python identifiers, but dictionary keys need not be, so a dictionary would provide greater freedom in that regard because keys can be anything hashable (even something that’s not a string).

Update

A subclass of object named SimpleNamespace was added to the types module in Python 3.3 and is yet another alternative. Unlike a plain object instance, which has no __dict__, a SimpleNamespace instance does have one.
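
A minimal sketch of SimpleNamespace in use; the field names are borrowed from the question and the values are made up for illustration:

```python
from types import SimpleNamespace

# Attributes can be given as keyword arguments ...
request = SimpleNamespace(request_method='GET', url_scheme='http')

# ... and further attributes can be added later, as with a regular instance.
request.path = '/index'

# vars() exposes the underlying __dict__, so converting back to a dict is easy.
as_dict = vars(request)
```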


Answer 8

class ClassWithSlotBase:
    __slots__ = ('a', 'b',)

    def __init__(self):
        self.a: str = "test"
        self.b: float = 0.0


def test_type_hint(_b: float) -> None:
    print(_b)


class_tmp = ClassWithSlotBase()

# A type checker flags this call: 'a' is a str, but a float is expected.
test_type_hint(class_tmp.a)

I recommend a class. If you use a class, you get type hints, as shown above, and editors can offer auto-completion when an instance of the class is passed as a function argument.



Convert a Pandas DataFrame to a dictionary

Question: Convert a Pandas DataFrame to a dictionary

I have a DataFrame with four columns. I want to convert this DataFrame to a Python dictionary. I want the elements of the first column to be the keys and the elements of the other columns in the same row to be the values.

DataFrame:

    ID   A   B   C
0   p    1   3   2
1   q    4   3   2
2   r    4   0   9  

Output should be like this:

Dictionary:

{'p': [1,3,2], 'q': [4,3,2], 'r': [4,0,9]}

Answer 0

The to_dict() method sets the column names as dictionary keys so you’ll need to reshape your DataFrame slightly. Setting the ‘ID’ column as the index and then transposing the DataFrame is one way to achieve this.

to_dict() also accepts an ‘orient’ argument which you’ll need in order to output a list of values for each column. Otherwise, a dictionary of the form {index: value} will be returned for each column.

These steps can be done with the following line:

>>> df.set_index('ID').T.to_dict('list')
{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}
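
To make the reshaping easier to follow, here is a plain-Python sketch of the same result. It is only an illustration of the target structure, not how pandas implements it; `rows_to_dict`, `columns` and `rows` are hypothetical names standing in for the DataFrame from the question:

```python
# Hypothetical stand-in for the DataFrame: a header plus rows as plain lists.
columns = ['ID', 'A', 'B', 'C']
rows = [['p', 1, 3, 2], ['q', 4, 3, 2], ['r', 4, 0, 9]]


def rows_to_dict(columns, rows, key_column):
    """Map each row's key-column value to a list of the row's remaining values."""
    key_pos = columns.index(key_column)
    result = {}
    for row in rows:
        result[row[key_pos]] = [value for pos, value in enumerate(row) if pos != key_pos]
    return result


result = rows_to_dict(columns, rows, 'ID')
```

Here the key column plays the role of the index set by set_index('ID'), and each remaining row becomes one list, which is what transposing followed by to_dict('list') produces.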

In case a different dictionary format is needed, here are examples of the possible orient arguments. Consider the following simple DataFrame:

>>> df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})
>>> df
        a      b
0     red  0.500
1  yellow  0.250
2    blue  0.125

Then the options are as follows.

dict – the default: column names are keys, values are dictionaries of index:data pairs

>>> df.to_dict('dict')
{'a': {0: 'red', 1: 'yellow', 2: 'blue'}, 
 'b': {0: 0.5, 1: 0.25, 2: 0.125}}

list – keys are column names, values are lists of column data

>>> df.to_dict('list')
{'a': ['red', 'yellow', 'blue'], 
 'b': [0.5, 0.25, 0.125]}

series – like ‘list’, but values are Series

>>> df.to_dict('series')
{'a': 0       red
      1    yellow
      2      blue
      Name: a, dtype: object, 

 'b': 0    0.500
      1    0.250
      2    0.125
      Name: b, dtype: float64}

split – splits columns/data/index as keys with values being column names, data values by row and index labels respectively

>>> df.to_dict('split')
{'columns': ['a', 'b'],
 'data': [['red', 0.5], ['yellow', 0.25], ['blue', 0.125]],
 'index': [0, 1, 2]}

records – each row becomes a dictionary where key is column name and value is the data in the cell

>>> df.to_dict('records')
[{'a': 'red', 'b': 0.5}, 
 {'a': 'yellow', 'b': 0.25}, 
 {'a': 'blue', 'b': 0.125}]

index – like ‘records’, but a dictionary of dictionaries with keys as index labels (rather than a list)

>>> df.to_dict('index')
{0: {'a': 'red', 'b': 0.5},
 1: {'a': 'yellow', 'b': 0.25},
 2: {'a': 'blue', 'b': 0.125}}

Answer 1

Try using zip:

import pandas as pd

df = pd.read_csv("file")
d = dict([(i, [a, b, c]) for i, a, b, c in zip(df.ID, df.A, df.B, df.C)])
print(d)

Output:

{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}

Answer 2

Follow these steps:

Suppose your dataframe is as follows:

>>> df
   A  B  C ID
0  1  3  2  p
1  4  3  2  q
2  4  0  9  r

1. Use set_index to set the ID column as the dataframe index.

    df.set_index("ID", drop=True, inplace=True)

2. Use the orient="index" parameter to have the index as dictionary keys.

    dictionary = df.to_dict(orient="index")

The results will be as follows:

    >>> dictionary
    {'q': {'A': 4, 'B': 3, 'C': 2}, 'p': {'A': 1, 'B': 3, 'C': 2}, 'r': {'A': 4, 'B': 0, 'C': 9}}

3. If you need to have each sample as a list, run the following code. Determine the column order first:

column_order= ["A", "B", "C"] #  Determine your preferred order of columns
d = {} #  Initialize the new dictionary as an empty dictionary
for k in dictionary:
    d[k] = [dictionary[k][column_name] for column_name in column_order]

Answer 3

If you don’t mind the dictionary values being tuples, you can use itertuples:

>>> {x[0]: x[1:] for x in df.itertuples(index=False)}
{'p': (1, 3, 2), 'q': (4, 3, 2), 'r': (4, 0, 9)}

Answer 4

Should a dictionary like:

{'red': 0.5, 'yellow': 0.25, 'blue': 0.125}

be required out of a dataframe like:

        a      b
0     red  0.500
1  yellow  0.250
2    blue  0.125

the simplest way would be:

dict(df.values.tolist())

Working snippet below:

import pandas as pd
df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})
dict(df.values.tolist())
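
The reason this works is that dict() accepts any sequence of two-item pairs, and for a two-column frame df.values.tolist() yields exactly such pairs. A pandas-free sketch of that last step, with the pairs written out by hand (`pairs` and `colour_map` are illustrative names):

```python
# What df.values.tolist() yields for the two-column frame above, written out by hand.
pairs = [['red', 0.5], ['yellow', 0.25], ['blue', 0.125]]

# dict() treats each inner list as a (key, value) pair.
colour_map = dict(pairs)
```

This only applies to frames with exactly two columns; with more columns each inner list has more than two items and dict() raises a ValueError.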



Answer 5

For my use (node names with xy positions), I found @user4179775’s answer to be the most helpful / intuitive:

import pandas as pd

df = pd.read_csv('glycolysis_nodes_xy.tsv', sep='\t')

df.head()
    nodes    x    y
0  c00033  146  958
1  c00031  601  195
...

xy_dict_list=dict([(i,[a,b]) for i, a,b in zip(df.nodes, df.x,df.y)])

xy_dict_list
{'c00022': [483, 868],
 'c00024': [146, 868],
 ... }

xy_dict_tuples=dict([(i,(a,b)) for i, a,b in zip(df.nodes, df.x,df.y)])

xy_dict_tuples
{'c00022': (483, 868),
 'c00024': (146, 868),
 ... }

Addendum

I later returned to this issue, for other, but related, work. Here is an approach that more closely mirrors the [excellent] accepted answer.

node_df = pd.read_csv('node_prop-glycolysis_tca-from_pg.tsv', sep='\t')

node_df.head()
   node  kegg_id kegg_cid            name  wt  vis
0  22    22       c00022   pyruvate        1   1
1  24    24       c00024   acetyl-CoA      1   1
...

Convert Pandas dataframe to a [list], {dict}, {dict of {dict}}, …

Per accepted answer:

node_df.set_index('kegg_cid').T.to_dict('list')

{'c00022': [22, 22, 'pyruvate', 1, 1],
 'c00024': [24, 24, 'acetyl-CoA', 1, 1],
 ... }

node_df.set_index('kegg_cid').T.to_dict('dict')

{'c00022': {'kegg_id': 22, 'name': 'pyruvate', 'node': 22, 'vis': 1, 'wt': 1},
 'c00024': {'kegg_id': 24, 'name': 'acetyl-CoA', 'node': 24, 'vis': 1, 'wt': 1},
 ... }

In my case, I wanted to do the same thing but with selected columns from the Pandas dataframe, so I needed to slice the columns. There are two approaches.

  1. Directly:

(see: Convert pandas to dictionary defining the columns used for the key values)

node_df.set_index('kegg_cid')[['name', 'wt', 'vis']].T.to_dict('dict')

{'c00022': {'name': 'pyruvate', 'vis': 1, 'wt': 1},
 'c00024': {'name': 'acetyl-CoA', 'vis': 1, 'wt': 1},
 ... }

  2. Indirectly: first, slice the desired columns/data from the Pandas dataframe (again, two approaches),

node_df_sliced = node_df[['kegg_cid', 'name', 'wt', 'vis']]

or

node_df_sliced2 = node_df.loc[:, ['kegg_cid', 'name', 'wt', 'vis']]

which can then be used to create a dictionary of dictionaries

node_df_sliced.set_index('kegg_cid').T.to_dict('dict')

{'c00022': {'name': 'pyruvate', 'vis': 1, 'wt': 1},
 'c00024': {'name': 'acetyl-CoA', 'vis': 1, 'wt': 1},
 ... }
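
As a possible shortcut for the examples above, to_dict also accepts orient 'index', which keys the outer dictionary by the index labels and so avoids the transpose. The sketch below rebuilds a two-row frame by hand; the values are copied from the head() output shown earlier, and the rest of the TSV file is not reproduced, so treat it as an illustration rather than the original data.

```python
import pandas as pd

# Two rows copied from the node_df.head() output above;
# the full TSV file is not available here, so this frame is a stand-in.
node_df = pd.DataFrame({
    'node': [22, 24],
    'kegg_id': [22, 24],
    'kegg_cid': ['c00022', 'c00024'],
    'name': ['pyruvate', 'acetyl-CoA'],
    'wt': [1, 1],
    'vis': [1, 1],
})

# orient='index' maps each index label to a {column: value} dict,
# so no .T (transpose) is needed before calling to_dict.
nested = node_df.set_index('kegg_cid')[['name', 'wt', 'vis']].to_dict('index')
print(nested)
```

This should produce the same dictionary of dictionaries as the .T.to_dict('dict') calls, keyed by kegg_cid.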

回答 6

DataFrame.to_dict() 将DataFrame转换为字典。

>>> df = pd.DataFrame(
...     {'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['a', 'b'])
>>> df
   col1  col2
a     1  0.50
b     2  0.75
>>> df.to_dict()
{'col1': {'a': 1, 'b': 2}, 'col2': {'a': 0.5, 'b': 0.75}}

有关详细信息,请参见此文档

DataFrame.to_dict() converts DataFrame to dictionary.

Example

>>> df = pd.DataFrame(
...     {'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['a', 'b'])
>>> df
   col1  col2
a     1  0.50
b     2  0.75
>>> df.to_dict()
{'col1': {'a': 1, 'b': 2}, 'col2': {'a': 0.5, 'b': 0.75}}

See the pandas documentation for DataFrame.to_dict for details.


Python中dict.clear()与分配{}之间的区别

问题:Python中dict.clear()与分配{}之间的区别

在python中,调用clear()和分配{}给字典之间有区别吗?如果是,那是什么?例:

d = {"stuff":"things"}
d.clear()   #this way
d = {}      #vs this way

In python, is there a difference between calling clear() and assigning {} to a dictionary? If yes, what is it? Example:

d = {"stuff":"things"}
d.clear()   #this way
d = {}      #vs this way

回答 0

如果您还有另一个变量也引用相同的字典,则有很大的不同:

>>> d = {"stuff": "things"}
>>> d2 = d
>>> d = {}
>>> d2
{'stuff': 'things'}
>>> d = {"stuff": "things"}
>>> d2 = d
>>> d.clear()
>>> d2
{}

这是因为分配d = {}会创建一个新的空字典并将其分配给d变量。这样就d2指向旧字典,里面还有项目。但是,d.clear()清除相同的字典,d并且d2两者都指向。

If you have another variable also referring to the same dictionary, there is a big difference:

>>> d = {"stuff": "things"}
>>> d2 = d
>>> d = {}
>>> d2
{'stuff': 'things'}
>>> d = {"stuff": "things"}
>>> d2 = d
>>> d.clear()
>>> d2
{}

This is because assigning d = {} creates a new, empty dictionary and assigns it to the d variable. This leaves d2 pointing at the old dictionary with items still in it. However, d.clear() clears the same dictionary that d and d2 both point at.


回答 1

d = {}将为创建新实例,d但所有其他引用仍将指向旧内容。 d.clear()将重置内容,但是对同一实例的所有引用仍然正确。

d = {} will create a new instance for d but all other references will still point to the old contents. d.clear() will reset the contents, but all references to the same instance will still be correct.
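
A small sketch of the same point, using the is operator to check whether two names still refer to one object. The name alias is made up for this illustration.

```python
d = {"stuff": "things"}
alias = d                          # second name bound to the same dict object

d.clear()                          # mutates the shared object in place
same_after_clear = alias is d      # both names still share one object
alias_after_clear = dict(alias)    # copy of what alias sees right after clear()

d["stuff"] = "things"              # refill the shared dict
d = {}                             # rebinds d only; alias keeps the old object
same_after_rebind = alias is d

print(same_after_clear, alias_after_clear, same_after_rebind, alias)
```

After clear() both names see the empty dict; after the rebinding only d is empty, and alias still holds the old contents.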


回答 2

除了其他答案中提到的差异外,还存在速度差异。d = {}的速度是原来的两倍:

python -m timeit -s "d = {}" "for i in xrange(500000): d.clear()"
10 loops, best of 3: 127 msec per loop

python -m timeit -s "d = {}" "for i in xrange(500000): d = {}"
10 loops, best of 3: 53.6 msec per loop

In addition to the differences mentioned in other answers, there also is a speed difference. d = {} is over twice as fast:

python -m timeit -s "d = {}" "for i in xrange(500000): d.clear()"
10 loops, best of 3: 127 msec per loop

python -m timeit -s "d = {}" "for i in xrange(500000): d = {}"
10 loops, best of 3: 53.6 msec per loop
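
The commands above use xrange, which exists only in Python 2. A rough Python 3 equivalent using the timeit module might look like the sketch below. The loop count is reduced (an arbitrary choice here) so that it finishes quickly, and no timings are quoted because they depend on the interpreter version and the machine. Note that, as in the original commands, the dict being cleared is already empty.

```python
import timeit

# Python 3 port of the two command-line benchmarks above (xrange -> range).
# LOOPS is smaller than the original 500000 so the sketch runs quickly;
# raise it for steadier numbers.
LOOPS = 50_000

clear_time = timeit.timeit(
    stmt=f"for i in range({LOOPS}): d.clear()",
    setup="d = {}",
    number=10,
)
rebind_time = timeit.timeit(
    stmt=f"for i in range({LOOPS}): d = {{}}",
    setup="d = {}",
    number=10,
)

print(f"d.clear(): {clear_time:.4f} s")
print(f"d = {{}}:    {rebind_time:.4f} s")
```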

回答 3

为了说明前面已经提到的事情:

>>> a = {1:2}
>>> id(a)
3073677212L
>>> a.clear()
>>> id(a)
3073677212L
>>> a = {}
>>> id(a)
3073675716L

As an illustration for the things already mentioned before:

>>> a = {1:2}
>>> id(a)
3073677212L
>>> a.clear()
>>> id(a)
3073677212L
>>> a = {}
>>> id(a)
3073675716L

回答 4

除了@odano的答案外,d.clear()如果您想多次清除字典,使用起来似乎更快。

import timeit

p1 = ''' 
d = {}
for i in xrange(1000):
    d[i] = i * i
for j in xrange(100):
    d = {}
    for i in xrange(1000):
        d[i] = i * i
'''

p2 = ''' 
d = {}
for i in xrange(1000):
    d[i] = i * i
for j in xrange(100):
    d.clear()
    for i in xrange(1000):
        d[i] = i * i
'''

print timeit.timeit(p1, number=1000)
print timeit.timeit(p2, number=1000)

结果是:

20.0367929935
19.6444659233

In addition to @odano’s answer, it seems that using d.clear() is faster if you need to clear the dict many times.

import timeit

p1 = ''' 
d = {}
for i in xrange(1000):
    d[i] = i * i
for j in xrange(100):
    d = {}
    for i in xrange(1000):
        d[i] = i * i
'''

p2 = ''' 
d = {}
for i in xrange(1000):
    d[i] = i * i
for j in xrange(100):
    d.clear()
    for i in xrange(1000):
        d[i] = i * i
'''

print timeit.timeit(p1, number=1000)
print timeit.timeit(p2, number=1000)

The result is:

20.0367929935
19.6444659233

回答 5

如果原始对象不在范围内,则突变方法总是有用的:

def fun(d):
    d.clear()
    d["b"] = 2

d={"a": 2}
fun(d)
d          # {'b': 2}

重新分配字典将创建一个新对象,而不会修改原始对象。

Mutating methods are always useful if the original object is not in scope:

def fun(d):
    d.clear()
    d["b"] = 2

d={"a": 2}
fun(d)
d          # {'b': 2}

Re-assigning the dictionary would create a new object and wouldn’t modify the original one.


回答 6

未提及的一件事是范围界定问题。这不是一个很好的例子,但是在这种情况下,我遇到了问题:

from functools import wraps

def conf_decorator(dec):
    """Enables behavior like this:
        @threaded
        def f(): ...

        or

        @threaded(thread=KThread)
        def f(): ...

        (assuming threaded is wrapped with this function.)
        Sends any accumulated kwargs to threaded.
        """
    c_kwargs = {}
    @wraps(dec)
    def wrapped(f=None, **kwargs):
        if f:
            r = dec(f, **c_kwargs)
            c_kwargs = {}
            return r
        else:
            c_kwargs.update(kwargs) #<- UnboundLocalError: local variable 'c_kwargs' referenced before assignment
            return wrapped
    return wrapped

解决方案是替换c_kwargs = {}c_kwargs.clear()

如果有人想出一个更实际的例子,请随时编辑此帖子。

One thing not mentioned is scoping issues. Not a great example, but here’s the case where I ran into the problem:

from functools import wraps

def conf_decorator(dec):
    """Enables behavior like this:
        @threaded
        def f(): ...

        or

        @threaded(thread=KThread)
        def f(): ...

        (assuming threaded is wrapped with this function.)
        Sends any accumulated kwargs to threaded.
        """
    c_kwargs = {}
    @wraps(dec)
    def wrapped(f=None, **kwargs):
        if f:
            r = dec(f, **c_kwargs)
            c_kwargs = {}
            return r
        else:
            c_kwargs.update(kwargs) #<- UnboundLocalError: local variable 'c_kwargs' referenced before assignment
            return wrapped
    return wrapped

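The solution is to replace c_kwargs = {} with c_kwargs.clear(). The assignment makes c_kwargs a local variable of wrapped for the whole function body, whereas clear() only mutates the dict from the enclosing scope.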
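
The scoping rule behind this error can be reproduced in a few lines. The sketch below uses made-up names (broken, fixed_with_clear, fixed_with_nonlocal, settings) rather than the decorator above. The nonlocal variant is an alternative that only exists in Python 3 and is not part of the original answer.

```python
def broken():
    settings = {"a": 1}

    def inner():
        snapshot = dict(settings)  # raises: settings is local to inner() because of the next line
        settings = {}              # an assignment anywhere in inner() makes the name local
        return snapshot

    return inner


def fixed_with_clear():
    settings = {"a": 1}

    def inner():
        snapshot = dict(settings)  # reads the enclosing variable; nothing is assigned here
        settings.clear()           # mutates the enclosing dict in place
        return snapshot

    return inner


def fixed_with_nonlocal():
    settings = {"a": 1}

    def inner():
        nonlocal settings          # Python 3 only: rebinding now targets the enclosing variable
        snapshot = dict(settings)
        settings = {}
        return snapshot

    return inner


try:
    broken()()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
print(fixed_with_clear()())
print(fixed_with_nonlocal()())
```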

If someone thinks up a more practical example, feel free to edit this post.


回答 7

另外,有时dict实例可能是dict的子类(defaultdict例如)。在这种情况下,clear首选使用using ,因为我们不必记住dict的确切类型,并且还避免重复代码(将清除行与初始化行耦合)。

from collections import defaultdict

x = defaultdict(list)
x[1].append(2)
...
x.clear() # instead of the longer x = defaultdict(list)

In addition, sometimes the dict instance might be a subclass of dict (defaultdict for example). In that case, using clear is preferred, as we don’t have to remember the exact type of the dict, and also avoid duplicate code (coupling the clearing line with the initialization line).

from collections import defaultdict

x = defaultdict(list)
x[1].append(2)
...
x.clear() # instead of the longer x = defaultdict(list)
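
To make the difference concrete, here is a short sketch of what happens to the default factory in each case. The variable y and the comparison with a plain {} are added for illustration only.

```python
from collections import defaultdict

x = defaultdict(list)
x[1].append(2)

x.clear()                      # empties the mapping but keeps the object and its default_factory
kept_factory = x.default_factory is list
x[3].append(4)                 # missing keys still get a fresh list

y = defaultdict(list)
y[1].append(2)
y = {}                         # rebinding to a plain dict silently drops the defaultdict behaviour
try:
    y[3].append(4)
    plain_dict_error = None
except KeyError as exc:
    plain_dict_error = exc

print(kept_factory, dict(x), repr(plain_dict_error))
```

With clear() the defaultdict keeps working as before, while the careless y = {} turns later lookups of missing keys into a KeyError.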

如何在Python中逐行打印字典?

问题:如何在Python中逐行打印字典?

这是字典

cars = {'A':{'speed':70,
        'color':2},
        'B':{'speed':60,
        'color':3}}

使用这个 for loop

for keys,values in cars.items():
    print(keys)
    print(values)

它打印以下内容:

B
{'color': 3, 'speed': 60}
A
{'color': 2, 'speed': 70}

但是我希望程序像这样打印它:

B
color : 3
speed : 60
A
color : 2
speed : 70

我刚刚开始学习字典,所以不确定如何执行此操作。

This is the dictionary

cars = {'A':{'speed':70,
        'color':2},
        'B':{'speed':60,
        'color':3}}

Using this for loop

for keys,values in cars.items():
    print(keys)
    print(values)

It prints the following:

B
{'color': 3, 'speed': 60}
A
{'color': 2, 'speed': 70}

But I want the program to print it like this:

B
color : 3
speed : 60
A
color : 2
speed : 70

I just started learning dictionaries so I’m not sure how to do this.


回答 0

for x in cars:
    print (x)
    for y in cars[x]:
        print (y,':',cars[x][y])

输出:

A
color : 2
speed : 70
B
color : 3
speed : 60

for x in cars:
    print (x)
    for y in cars[x]:
        print (y,':',cars[x][y])

output:

A
color : 2
speed : 70
B
color : 3
speed : 60

回答 1

您可以json为此使用模块。dumps此模块中的函数将JSON对象转换为格式正确的字符串,然后可以打印该字符串。

import json

cars = {'A':{'speed':70, 'color':2},
        'B':{'speed':60, 'color':3}}

print(json.dumps(cars, indent = 4))

输出看起来像

{
    "A": {
        "color": 2,
        "speed": 70
    },
    "B": {
        "color": 3,
        "speed": 60
    }
}

文件还规定了一堆这种方法有用的选项。

You could use the json module for this. The dumps function in this module serializes a Python object (here, a nested dict) into a JSON-formatted string, which you can then print.

import json

cars = {'A':{'speed':70, 'color':2},
        'B':{'speed':60, 'color':3}}

print(json.dumps(cars, indent = 4))

The output looks like

{
    "A": {
        "color": 2,
        "speed": 70
    },
    "B": {
        "color": 3,
        "speed": 60
    }
}

The documentation also specifies a bunch of useful options for this method.
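
One such option is sort_keys, sketched below. The dictionary is deliberately written with B first, which is an assumption made here to show the effect of sorting.

```python
import json

# Same data as above, but with 'B' inserted first to make the sorting visible.
cars = {'B': {'speed': 60, 'color': 3},
        'A': {'speed': 70, 'color': 2}}

# sort_keys=True orders keys alphabetically at every nesting level;
# indent=4 puts each key on its own line.
text = json.dumps(cars, indent=4, sort_keys=True)
print(text)
```

Keep in mind that json.dumps only handles JSON-compatible data: dictionary keys are written as strings, and values such as sets or arbitrary objects raise a TypeError unless a default function is supplied.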


回答 2

处理任意深度嵌套的字典和列表的更通用的解决方案是:

def dumpclean(obj):
    if isinstance(obj, dict):
        for k, v in obj.items():
            if hasattr(v, '__iter__'):
                print k
                dumpclean(v)
            else:
                print '%s : %s' % (k, v)
    elif isinstance(obj, list):
        for v in obj:
            if hasattr(v, '__iter__'):
                dumpclean(v)
            else:
                print v
    else:
        print obj

产生输出:

A
color : 2
speed : 70
B
color : 3
speed : 60

我遇到了类似的需求,并开发了更强大的功能作为自己的练习。我将其包含在此处,以防它可能对另一个有价值。在运行鼻子测试中,我还发现能够在调用中指定输出流很有用,这样可以代替使用sys.stderr。

import sys

def dump(obj, nested_level=0, output=sys.stdout):
    spacing = '   '
    if isinstance(obj, dict):
        print >> output, '%s{' % ((nested_level) * spacing)
        for k, v in obj.items():
            if hasattr(v, '__iter__'):
                print >> output, '%s%s:' % ((nested_level + 1) * spacing, k)
                dump(v, nested_level + 1, output)
            else:
                print >> output, '%s%s: %s' % ((nested_level + 1) * spacing, k, v)
        print >> output, '%s}' % (nested_level * spacing)
    elif isinstance(obj, list):
        print >> output, '%s[' % ((nested_level) * spacing)
        for v in obj:
            if hasattr(v, '__iter__'):
                dump(v, nested_level + 1, output)
            else:
                print >> output, '%s%s' % ((nested_level + 1) * spacing, v)
        print >> output, '%s]' % ((nested_level) * spacing)
    else:
        print >> output, '%s%s' % (nested_level * spacing, obj)

使用此功能,OP的输出如下所示:

{
   A:
   {
      color: 2
      speed: 70
   }
   B:
   {
      color: 3
      speed: 60
   }
}

我个人认为这更有用和更具描述性。

给出以下简单的例子:

{"test": [{1:3}], "test2":[(1,2),(3,4)],"test3": {(1,2):['abc', 'def', 'ghi'],(4,5):'def'}}

OP要求的解决方案将产生以下结果:

test
1 : 3
test3
(1, 2)
abc
def
ghi
(4, 5) : def
test2
(1, 2)
(3, 4)

而“增强型”版本会产生以下结果:

{
   test:
   [
      {
         1: 3
      }
   ]
   test3:
   {
      (1, 2):
      [
         abc
         def
         ghi
      ]
      (4, 5): def
   }
   test2:
   [
      (1, 2)
      (3, 4)
   ]
}

我希望这可以为下一个寻求这种功能的人提供一些价值。

A more generalized solution that handles arbitrarily-deeply nested dicts and lists would be:

def dumpclean(obj):
    if isinstance(obj, dict):
        for k, v in obj.items():
            if hasattr(v, '__iter__'):
                print k
                dumpclean(v)
            else:
                print '%s : %s' % (k, v)
    elif isinstance(obj, list):
        for v in obj:
            if hasattr(v, '__iter__'):
                dumpclean(v)
            else:
                print v
    else:
        print obj

This produces the output:

A
color : 2
speed : 70
B
color : 3
speed : 60

I ran into a similar need and developed a more robust function as an exercise for myself. I’m including it here in case it can be of value to someone else. When running nosetests, I also found it helpful to be able to specify the output stream in the call so that sys.stderr could be used instead.

import sys

def dump(obj, nested_level=0, output=sys.stdout):
    spacing = '   '
    if isinstance(obj, dict):
        print >> output, '%s{' % ((nested_level) * spacing)
        for k, v in obj.items():
            if hasattr(v, '__iter__'):
                print >> output, '%s%s:' % ((nested_level + 1) * spacing, k)
                dump(v, nested_level + 1, output)
            else:
                print >> output, '%s%s: %s' % ((nested_level + 1) * spacing, k, v)
        print >> output, '%s}' % (nested_level * spacing)
    elif isinstance(obj, list):
        print >> output, '%s[' % ((nested_level) * spacing)
        for v in obj:
            if hasattr(v, '__iter__'):
                dump(v, nested_level + 1, output)
            else:
                print >> output, '%s%s' % ((nested_level + 1) * spacing, v)
        print >> output, '%s]' % ((nested_level) * spacing)
    else:
        print >> output, '%s%s' % (nested_level * spacing, obj)

Using this function, the OP’s output looks like this:

{
   A:
   {
      color: 2
      speed: 70
   }
   B:
   {
      color: 3
      speed: 60
   }
}

which I personally found to be more useful and descriptive.

Given the slightly less-trivial example of:

{"test": [{1:3}], "test2":[(1,2),(3,4)],"test3": {(1,2):['abc', 'def', 'ghi'],(4,5):'def'}}

The OP’s requested solution yields this:

test
1 : 3
test3
(1, 2)
abc
def
ghi
(4, 5) : def
test2
(1, 2)
(3, 4)

whereas the ‘enhanced’ version yields this:

{
   test:
   [
      {
         1: 3
      }
   ]
   test3:
   {
      (1, 2):
      [
         abc
         def
         ghi
      ]
      (4, 5): def
   }
   test2:
   [
      (1, 2)
      (3, 4)
   ]
}

I hope this provides some value to the next person looking for this type of functionality.
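
Both functions above are written for Python 2 (print statements and print >> output). A possible Python 3 port of dumpclean is sketched below. Two things differ from the original and are choices made for this sketch: Python 3 strings have an __iter__ attribute, so the hasattr test is replaced by an explicit isinstance check against dict and list, which means a tuple value is printed on the same line as its key. The output parameter is borrowed from the dump function.

```python
import sys


def dumpclean(obj, output=sys.stdout):
    """Python 3 sketch: print nested dicts and lists one item per line."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if isinstance(v, (dict, list)):
                # Nested container: print the key, then recurse into the value.
                print(k, file=output)
                dumpclean(v, output)
            else:
                print('%s : %s' % (k, v), file=output)
    elif isinstance(obj, list):
        for v in obj:
            dumpclean(v, output)
    else:
        # Scalars (and any other non-container value) are printed as-is.
        print(obj, file=output)


cars = {'A': {'speed': 70, 'color': 2},
        'B': {'speed': 60, 'color': 3}}
dumpclean(cars)
```

In Python 3.7 and later, dictionaries keep insertion order, so the keys come out in the order they were written rather than in the order shown in the Python 2 output above.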


回答 3

您具有嵌套结构,因此您也需要格式化嵌套字典:

for key, car in cars.items():
    print(key)
    for attribute, value in car.items():
        print('{} : {}'.format(attribute, value))

打印:

A
color : 2
speed : 70
B
color : 3
speed : 60

You have a nested structure, so you need to format the nested dictionary too:

for key, car in cars.items():
    print(key)
    for attribute, value in car.items():
        print('{} : {}'.format(attribute, value))

This prints:

A
color : 2
speed : 70
B
color : 3
speed : 60

回答 4

pprint.pprint() 是完成这项工作的好工具:

>>> import pprint
>>> cars = {'A':{'speed':70,
...         'color':2},
...         'B':{'speed':60,
...         'color':3}}
>>> pprint.pprint(cars, width=1)
{'A': {'color': 2,
       'speed': 70},
 'B': {'color': 3,
       'speed': 60}}

pprint.pprint() is a good tool for this job:

>>> import pprint
>>> cars = {'A':{'speed':70,
...         'color':2},
...         'B':{'speed':60,
...         'color':3}}
>>> pprint.pprint(cars, width=1)
{'A': {'color': 2,
       'speed': 70},
 'B': {'color': 3,
       'speed': 60}}

回答 5

for car,info in cars.items():
    print(car)
    for key,value in info.items():
        print(key, ":", value)

回答 6

如果您知道树只有两个级别,这将起作用:

for k1 in cars:
    print(k1)
    d = cars[k1]
    for k2 in d:
        print(k2, ':', d[k2])

This will work if you know the tree only has two levels:

for k1 in cars:
    print(k1)
    d = cars[k1]
    for k2 in d:
        print(k2, ':', d[k2])

回答 7

检查以下一线:

print('\n'.join("%s\n%s" % (key1,('\n'.join("%s : %r" % (key2,val2) for (key2,val2) in val1.items()))) for (key1,val1) in cars.items()))

输出:

A
speed : 70
color : 2
B
speed : 60
color : 3

Check the following one-liner:

print('\n'.join("%s\n%s" % (key1,('\n'.join("%s : %r" % (key2,val2) for (key2,val2) in val1.items()))) for (key1,val1) in cars.items()))

Output:

A
speed : 70
color : 2
B
speed : 60
color : 3

回答 8

我更喜欢以下格式yaml

import yaml
print(yaml.dump(cars, default_flow_style=False))

输出:

A:
  color: 2
  speed: 70
B:
  color: 3
  speed: 60

I prefer the clean formatting of yaml:

import yaml
print(yaml.dump(cars, default_flow_style=False))

output:

A:
  color: 2
  speed: 70
B:
  color: 3
  speed: 60

回答 9

###newbie exact answer desired (Python v3):
###=================================
"""
cars = {'A':{'speed':70,
        'color':2},
        'B':{'speed':60,
        'color':3}}
"""

for keys, values in  reversed(sorted(cars.items())):
    print(keys)
    for keys,values in sorted(values.items()):
        print(keys," : ", values)

"""
Output:
B
color  :  3
speed  :  60
A
color  :  2
speed  :  70

##[Finished in 0.073s]
"""

回答 10

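# Declare and initialize the dictionary
# (named mapping rather than map, which would shadow the built-in map())
mapping = {}

mapping["New"] = 1
mapping["to"] = 1
mapping["Python"] = 5
mapping["or"] = 2

# Print one key/value pair per line
for key in mapping:
    print("", key, ":", mapping[key])

#  New : 1
#  to : 1
#  Python : 5
#  or : 2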

回答 11

这是我对问题的解决方案。我认为它的方法类似,但是比其他一些答案要简单一些。它还允许任意数量的子词典,并且似乎适用于任何数据类型(我什至在具有值功能的字典上对其进行了测试):

def pprint(web, level):
    for k,v in web.items():
        if isinstance(v, dict):
            print('\t'*level, f'{k}: ')
            level += 1
            pprint(v, level)
            level -= 1
        else:
            print('\t'*level, k, ": ", v)

Here is my solution to the problem. I think it’s similar in approach, but a little simpler than some of the other answers. It also allows for an arbitrary number of sub-dictionaries and seems to work for any datatype (I even tested it on a dictionary which had functions as values):

def pprint(web, level):
    for k,v in web.items():
        if isinstance(v, dict):
            print('\t'*level, f'{k}: ')
            level += 1
            pprint(v, level)
            level -= 1
        else:
            print('\t'*level, k, ": ", v)

回答 12

修改MrWonderful代码

import sys

def print_dictionary(obj, ident):
    if type(obj) == dict:
        for k, v in obj.items():
            sys.stdout.write(ident)
            if hasattr(v, '__iter__'):
                print k
                print_dictionary(v, ident + '  ')
            else:
                print '%s : %s' % (k, v)
    elif type(obj) == list:
        for v in obj:
            sys.stdout.write(ident)
            if hasattr(v, '__iter__'):
                print_dictionary(v, ident + '  ')
            else:
                print v
    else:
        print obj

A modification of MrWonderful’s code:

import sys

def print_dictionary(obj, ident):
    if type(obj) == dict:
        for k, v in obj.items():
            sys.stdout.write(ident)
            if hasattr(v, '__iter__'):
                print k
                print_dictionary(v, ident + '  ')
            else:
                print '%s : %s' % (k, v)
    elif type(obj) == list:
        for v in obj:
            sys.stdout.write(ident)
            if hasattr(v, '__iter__'):
                print_dictionary(v, ident + '  ')
            else:
                print v
    else:
        print obj