标签归档:dictionary

Python中的“ collection.defaultdict”多个级别

问题:Python中的“ collection.defaultdict”多个级别

感谢SO方面的一些杰出人士,我发现了的可能性collections.defaultdict,尤其是在可读性和速度方面。我让他们成功使用。

现在,我想实现三个级别的字典,两个最大的字典是defaultdict,最低的是int。我找不到执行此操作的适当方法。这是我的尝试:

from collections import defaultdict
d = defaultdict(defaultdict)
a = [("key1", {"a1":22, "a2":33}),
     ("key2", {"a1":32, "a2":55}),
     ("key3", {"a1":43, "a2":44})]
for i in a:
    d[i[0]] = i[1]

现在这可以工作,但是以下是所需的行为,但无效:

d["key4"]["a1"] + 1

我怀疑我应该在某个地方声明第二个级别defaultdict是type int,但是我没有找到在哪里或怎么做。

defaultdict首先使用的原因是避免必须为每个新键初始化字典。

还有更优雅的建议吗?

谢谢pythoneers!

Thanks to some great folks on SO, I discovered the possibilities offered by collections.defaultdict, notably in readability and speed. I have put them to use with success.

Now I would like to implement three levels of dictionaries, the two top ones being defaultdict and the lowest one being int. I don’t find the appropriate way to do this. Here is my attempt:

from collections import defaultdict
d = defaultdict(defaultdict)
a = [("key1", {"a1":22, "a2":33}),
     ("key2", {"a1":32, "a2":55}),
     ("key3", {"a1":43, "a2":44})]
for i in a:
    d[i[0]] = i[1]

Now this works, but the following, which is the desired behavior, doesn’t:

d["key4"]["a1"] + 1

I suspect that I should have declared somewhere that the second level defaultdict is of type int, but I didn’t find where or how to do so.

The reason I am using defaultdict in the first place is to avoid having to initialize the dictionary for each new key.

Any more elegant suggestion?

Thanks pythoneers!


回答 0

用:

from collections import defaultdict
d = defaultdict(lambda: defaultdict(int))

defaultdict(int)只要在中访问新密钥,就会创建一个新密钥d

Use:

from collections import defaultdict
d = defaultdict(lambda: defaultdict(int))

This will create a new defaultdict(int) whenever a new key is accessed in d.


回答 1

使可腌制的嵌套defaultdict的另一种方法是使用部分对象而不是lambda:

from functools import partial
...
d = defaultdict(partial(defaultdict, int))

这将起作用,因为defaultdict类可在模块级别全局访问:

“除非对它包装的函数[或在这种情况下,类]可以在其__name__(在其__module__内)全局访问,否则您不能腌制部分对象” – 酸洗包装的部分函数

Another way to make a pickleable, nested defaultdict is to use a partial object instead of a lambda:

from functools import partial
...
d = defaultdict(partial(defaultdict, int))

This will work because the defaultdict class is globally accessible at the module level:

“You can’t pickle a partial object unless the function [or in this case, class] it wraps is globally accessible … under its __name__ (within its __module__)” — Pickling wrapped partial functions


回答 2

这里查看nosklo的答案以获得更通用的解决方案。

class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

测试:

a = AutoVivification()

a[1][2][3] = 4
a[1][3][3] = 5
a[1][2]['test'] = 6

print a

输出:

{1: {2: {'test': 6, 3: 4}, 3: {3: 5}}}

Look at nosklo’s answer here for a more general solution.

class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

Testing:

a = AutoVivification()

a[1][2][3] = 4
a[1][3][3] = 5
a[1][2]['test'] = 6

print a

Output:

{1: {2: {'test': 6, 3: 4}, 3: {3: 5}}}

回答 3

按照@rschwieb的要求D['key'] += 1,我们可以通过定义方法覆盖加法来扩展前一个__add__方法,以使其表现得更像collections.Counter()

首先__missing__将被调用以创建一个新的空值,该值将传递到中__add__。我们测试该值,以空值为False

有关覆盖的更多信息,请参见模拟数字类型

from numbers import Number


class autovivify(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition for numeric types when self is empty """
        if not self and isinstance(x, Number):
            return x
        raise ValueError

    def __sub__(self, x):
        if not self and isinstance(x, Number):
            return -1 * x
        raise ValueError

例子:

>>> import autovivify
>>> a = autovivify.autovivify()
>>> a
{}
>>> a[2]
{}
>>> a
{2: {}}
>>> a[4] += 1
>>> a[5][3][2] -= 1
>>> a
{2: {}, 4: 1, 5: {3: {2: -1}}}

我们可以只提供默认的0值,然后尝试操作:

class av2(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition when self is empty """
        if not self:
            return 0 + x
        raise ValueError

    def __sub__(self, x):
        """ override subtraction when self is empty """
        if not self:
            return 0 - x
        raise ValueError

As per @rschwieb’s request for D['key'] += 1, we can expand on previous by overriding addition by defining __add__ method, to make this behave more like a collections.Counter()

First __missing__ will be called to create a new empty value, which will be passed into __add__. We test the value, counting on empty values to be False.

See emulating numeric types for more information on overriding.

from numbers import Number


class autovivify(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition for numeric types when self is empty """
        if not self and isinstance(x, Number):
            return x
        raise ValueError

    def __sub__(self, x):
        if not self and isinstance(x, Number):
            return -1 * x
        raise ValueError

Examples:

>>> import autovivify
>>> a = autovivify.autovivify()
>>> a
{}
>>> a[2]
{}
>>> a
{2: {}}
>>> a[4] += 1
>>> a[5][3][2] -= 1
>>> a
{2: {}, 4: 1, 5: {3: {2: -1}}}

Rather than checking argument is a Number (very non-python, amirite!) we could just provide a default 0 value and then attempt the operation:

class av2(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition when self is empty """
        if not self:
            return 0 + x
        raise ValueError

    def __sub__(self, x):
        """ override subtraction when self is empty """
        if not self:
            return 0 - x
        raise ValueError

回答 4

晚会晚了,但是对于任意深度,我只是发现自己在做这样的事情:

from collections import defaultdict

class DeepDict(defaultdict):
    def __call__(self):
        return DeepDict(self.default_factory)

这里的窍门基本上是使DeepDict实例本身成为构造缺失值的有效工厂。现在我们可以做类似的事情

dd = DeepDict(DeepDict(list))
dd[1][2].extend([3,4])
sum(dd[1][2])  # 7

ddd = DeepDict(DeepDict(DeepDict(list)))
ddd[1][2][3].extend([4,5])
sum(ddd[1][2][3])  # 9

Late to the party, but for arbitrary depth I just found myself doing something like this:

from collections import defaultdict

class DeepDict(defaultdict):
    def __call__(self):
        return DeepDict(self.default_factory)

The trick here is basically to make the DeepDict instance itself a valid factory for constructing missing values. Now we can do things like

dd = DeepDict(DeepDict(list))
dd[1][2].extend([3,4])
sum(dd[1][2])  # 7

ddd = DeepDict(DeepDict(DeepDict(list)))
ddd[1][2][3].extend([4,5])
sum(ddd[1][2][3])  # 9

回答 5

def _sub_getitem(self, k):
    try:
        # sub.__class__.__bases__[0]
        real_val = self.__class__.mro()[-2].__getitem__(self, k)
        val = '' if real_val is None else real_val
    except Exception:
        val = ''
        real_val = None
    # isinstance(Avoid,dict)也是true,会一直递归死
    if type(val) in (dict, list, str, tuple):
        val = type('Avoid', (type(val),), {'__getitem__': _sub_getitem, 'pop': _sub_pop})(val)
        # 重新赋值当前字典键为返回值,当对其赋值时可回溯
        if all([real_val is not None, isinstance(self, (dict, list)), type(k) is not slice]):
            self[k] = val
    return val


def _sub_pop(self, k=-1):
    try:
        val = self.__class__.mro()[-2].pop(self, k)
        val = '' if val is None else val
    except Exception:
        val = ''
    if type(val) in (dict, list, str, tuple):
        val = type('Avoid', (type(val),), {'__getitem__': _sub_getitem, 'pop': _sub_pop})(val)
    return val


class DefaultDict(dict):
    def __getitem__(self, k):
        return _sub_getitem(self, k)

    def pop(self, k):
        return _sub_pop(self, k)

In[8]: d=DefaultDict()
In[9]: d['a']['b']['c']['d']
Out[9]: ''
In[10]: d['a']="ggggggg"
In[11]: d['a']
Out[11]: 'ggggggg'
In[12]: d['a']['pp']
Out[12]: ''

再没有错误。无论嵌套多少级。弹出也没有错误

dd = DefaultDict({“ 1”:333333})

def _sub_getitem(self, k):
    try:
        # sub.__class__.__bases__[0]
        real_val = self.__class__.mro()[-2].__getitem__(self, k)
        val = '' if real_val is None else real_val
    except Exception:
        val = ''
        real_val = None
    # isinstance(Avoid,dict)也是true,会一直递归死
    if type(val) in (dict, list, str, tuple):
        val = type('Avoid', (type(val),), {'__getitem__': _sub_getitem, 'pop': _sub_pop})(val)
        # 重新赋值当前字典键为返回值,当对其赋值时可回溯
        if all([real_val is not None, isinstance(self, (dict, list)), type(k) is not slice]):
            self[k] = val
    return val


def _sub_pop(self, k=-1):
    try:
        val = self.__class__.mro()[-2].pop(self, k)
        val = '' if val is None else val
    except Exception:
        val = ''
    if type(val) in (dict, list, str, tuple):
        val = type('Avoid', (type(val),), {'__getitem__': _sub_getitem, 'pop': _sub_pop})(val)
    return val


class DefaultDict(dict):
    def __getitem__(self, k):
        return _sub_getitem(self, k)

    def pop(self, k):
        return _sub_pop(self, k)

In[8]: d=DefaultDict()
In[9]: d['a']['b']['c']['d']
Out[9]: ''
In[10]: d['a']="ggggggg"
In[11]: d['a']
Out[11]: 'ggggggg'
In[12]: d['a']['pp']
Out[12]: ''

No errors again. No matter how many levels nested. pop no error also

dd=DefaultDict({“1”:333333})


计算两个Python字典中包含的键的差异

问题:计算两个Python字典中包含的键的差异

假设我有两个Python字典- dictAdictB。我需要找出是否有任何键存在于中,dictB但没有dictA。最快的方法是什么?

我应该将字典键转换为集合然后继续吗?

有兴趣了解您的想法…


感谢您的回复。

很抱歉未能正确说明我的问题。我的情况是这样的-我有一个dictA与可能相同的dictB密钥,或者可能缺少一些密钥,dictB否则某些密钥的值可能会有所不同,必须将其设置为dictA密钥的值。

问题在于字典没有标准,并且可以具有可以作为dict的值。

dictA={'key1':a, 'key2':b, 'key3':{'key11':cc, 'key12':dd}, 'key4':{'key111':{....}}}
dictB={'key1':a, 'key2:':newb, 'key3':{'key11':cc, 'key12':newdd, 'key13':ee}.......

因此,必须将“ key2”值重置为新值,并在字典内部添加“ key13”。键值没有固定的格式。它可以是一个简单的值或dict或dict的dict。

Suppose I have two Python dictionaries – dictA and dictB. I need to find out if there are any keys which are present in dictB but not in dictA. What is the fastest way to go about it?

Should I convert the dictionary keys into a set and then go about?

Interested in knowing your thoughts…


Thanks for your responses.

Apologies for not stating my question properly. My scenario is like this – I have a dictA which can be the same as dictB or may have some keys missing as compared to dictB or else the value of some keys might be different which has to be set to that of dictA key’s value.

Problem is the dictionary has no standard and can have values which can be dict of dict.

Say

dictA={'key1':a, 'key2':b, 'key3':{'key11':cc, 'key12':dd}, 'key4':{'key111':{....}}}
dictB={'key1':a, 'key2:':newb, 'key3':{'key11':cc, 'key12':newdd, 'key13':ee}.......

So ‘key2’ value has to be reset to the new value and ‘key13’ has to be added inside the dict. The key value does not have a fixed format. It can be a simple value or a dict or a dict of dict.


回答 0

您可以在按键上使用设置操作:

diff = set(dictb.keys()) - set(dicta.keys())

这是一个查找所有可能性的类:添加了什么,删除了什么,哪些键值对相同​​以及哪些键值对已更改。

class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)
    def added(self):
        return self.set_current - self.intersect 
    def removed(self):
        return self.set_past - self.intersect 
    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])
    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])

这是一些示例输出:

>>> a = {'a': 1, 'b': 1, 'c': 0}
>>> b = {'a': 1, 'b': 2, 'd': 0}
>>> d = DictDiffer(b, a)
>>> print "Added:", d.added()
Added: set(['d'])
>>> print "Removed:", d.removed()
Removed: set(['c'])
>>> print "Changed:", d.changed()
Changed: set(['b'])
>>> print "Unchanged:", d.unchanged()
Unchanged: set(['a'])

可以作为github存储库使用:https : //github.com/hughdbrown/dictdiffer

You can use set operations on the keys:

diff = set(dictb.keys()) - set(dicta.keys())

Here is a class to find all the possibilities: what was added, what was removed, which key-value pairs are the same, and which key-value pairs are changed.

class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)
    def added(self):
        return self.set_current - self.intersect 
    def removed(self):
        return self.set_past - self.intersect 
    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])
    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])

Here is some sample output:

>>> a = {'a': 1, 'b': 1, 'c': 0}
>>> b = {'a': 1, 'b': 2, 'd': 0}
>>> d = DictDiffer(b, a)
>>> print "Added:", d.added()
Added: set(['d'])
>>> print "Removed:", d.removed()
Removed: set(['c'])
>>> print "Changed:", d.changed()
Changed: set(['b'])
>>> print "Unchanged:", d.unchanged()
Unchanged: set(['a'])

Available as a github repo: https://github.com/hughdbrown/dictdiffer


回答 1

如果您需要递归的区别,我已经为python编写了一个软件包:https : //github.com/seperman/deepdiff

安装

从PyPi安装:

pip install deepdiff

用法示例

输入

>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2

同一对象返回空

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}

项目类型已更改

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}

项目的价值已更改

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

添加和/或删除项目

>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

弦差异

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}

弦差异2

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}

>>> 
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
--- 
+++ 
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End

类型变更

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}

清单差异

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}

清单差异2:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}

列出差异忽略顺序或重复项:(具有与上述相同的字典)

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}

包含字典的列表:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}

套装:

>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}

命名元组:

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}

自定义对象:

>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
... 
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>> 
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

添加对象属性:

>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

In case you want the difference recursively, I have written a package for python: https://github.com/seperman/deepdiff

Installation

Install from PyPi:

pip install deepdiff

Example usage

Importing

>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2

Same object returns empty

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}

Type of an item has changed

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}

Value of an item has changed

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

Item added and/or removed

>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

String difference

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}

String difference 2

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}

>>> 
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
--- 
+++ 
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End

Type change

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}

List difference

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}

List difference 2:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}

List difference ignoring order or duplicates: (with the same dictionaries as above)

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}

List that contains dictionary:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}

Sets:

>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}

Named Tuples:

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}

Custom objects:

>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
... 
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>> 
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

Object attribute added:

>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

回答 2

不知道它是否“快速”,但是通常情况下,可以做到这一点

dicta = {"a":1,"b":2,"c":3,"d":4}
dictb = {"a":1,"d":2}
for key in dicta.keys():
    if not key in dictb:
        print key

not sure whether its “fast” or not, but normally, one can do this

dicta = {"a":1,"b":2,"c":3,"d":4}
dictb = {"a":1,"d":2}
for key in dicta.keys():
    if not key in dictb:
        print key

回答 3

就像Alex Martelli所写的那样,如果您只想检查B中的任何键是否不在A中,那any(True for k in dictB if k not in dictA)将是您的最佳选择。

要查找缺少的密钥:

diff = set(dictB)-set(dictA) #sets

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA =    
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=set(dictB)-set(dictA)"
10000 loops, best of 3: 107 usec per loop

diff = [ k for k in dictB if k not in dictA ] #lc

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = 
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=[ k for k in dictB if
k not in dictA ]"
10000 loops, best of 3: 95.9 usec per loop

因此,这两种解决方案的速度几乎相同。

As Alex Martelli wrote, if you simply want to check if any key in B is not in A, any(True for k in dictB if k not in dictA) would be the way to go.

To find the keys that are missing:

diff = set(dictB)-set(dictA) #sets

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA =    
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=set(dictB)-set(dictA)"
10000 loops, best of 3: 107 usec per loop

diff = [ k for k in dictB if k not in dictA ] #lc

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = 
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=[ k for k in dictB if
k not in dictA ]"
10000 loops, best of 3: 95.9 usec per loop

So those two solutions are pretty much the same speed.


回答 4

如果您确实要说的是真的(您只需要找出B中而不是A中“有任何键”的情况,那么ONES可能没有),最快的方法应该是:

if any(True for k in dictB if k not in dictA): ...

如果您实际上需要找出哪个键(如果有)在B中而不是在A中,而不仅仅是“ IF”,那么有这样的键,那么现有的答案就很合适了(但是我建议在以后的问题中更精确一些,如果那是确实是您的意思;-)。

If you really mean exactly what you say (that you only need to find out IF “there are any keys” in B and not in A, not WHICH ONES might those be if any), the fastest way should be:

if any(True for k in dictB if k not in dictA): ...

If you actually need to find out WHICH KEYS, if any, are in B and not in A, and not just “IF” there are such keys, then existing answers are quite appropriate (but I do suggest more precision in future questions if that’s indeed what you mean;-).


回答 5

用途set()

set(dictA.keys()).intersection(dictB.keys())

Use set():

set(dictA.keys()).intersection(dictB.keys())

回答 6

hughdbrown的最高答案是建议使用集差异,这绝对是最好的方法:

diff = set(dictb.keys()) - set(dicta.keys())

这段代码的问题在于,它仅创建两个列表就创建了两个列表,因此浪费了4N的时间和2N的空间。它也比需要的要复杂一些。

通常,这没什么大不了的,但是如果是这样的话:

diff = dictb.keys() - dicta

Python 2

在Python 2中,keys()返回键列表,而不是KeysView。因此,您必须viewkeys()直接提出要求。

diff = dictb.viewkeys() - dicta

对于双版本2.7 / 3.x代码,希望使用six或类似的代码,因此可以使用six.viewkeys(dictb)

diff = six.viewkeys(dictb) - dicta

在2.4-2.6中,没有KeysView。但是,您可以直接从迭代器中构建左集合,而不是先构建列表,至少可以将成本从4N削减到N:

diff = set(dictb) - dicta

物品

我有一个dictA可以与dictB相同,或者与dictB相比可能缺少一些键,否则某些键的值可能不同

因此,您实际上不需要比较键,而是需要比较项。ItemsViewSet当值是可哈希值(例如字符串)时,an 才是a 。如果是这样,这很容易:

diff = dictb.items() - dicta.items()

递归差异

尽管问题不是直接要求递归差异,但某些示例值是dict,并且看来预期的输出确实递归地对它们进行差异。这里已经有多个答案显示了如何执行此操作。

The top answer by hughdbrown suggests using set difference, which is definitely the best approach:

diff = set(dictb.keys()) - set(dicta.keys())

The problem with this code is that it builds two lists just to create two sets, so it’s wasting 4N time and 2N space. It’s also a bit more complicated than it needs to be.

Usually, this is not a big deal, but if it is:

diff = dictb.keys() - dicta

Python 2

In Python 2, keys() returns a list of the keys, not a KeysView. So you have to ask for viewkeys() directly.

diff = dictb.viewkeys() - dicta

For dual-version 2.7/3.x code, you’re hopefully using six or something similar, so you can use six.viewkeys(dictb):

diff = six.viewkeys(dictb) - dicta

In 2.4-2.6, there is no KeysView. But you can at least cut the cost from 4N to N by building your left set directly out of an iterator, instead of building a list first:

diff = set(dictb) - dicta

Items

I have a dictA which can be the same as dictB or may have some keys missing as compared to dictB or else the value of some keys might be different

So you really don’t need to compare the keys, but the items. An ItemsView is only a Set if the values are hashable, like strings. If they are, it’s easy:

diff = dictb.items() - dicta.items()

Recursive diff

Although the question isn’t directly asking for a recursive diff, some of the example values are dicts, and it appears the expected output does recursively diff them. There are already multiple answers here showing how to do that.


回答 7

关于此参数,stackoverflow中还有另一个问题,我不得不承认有一个简单的解决方案:python 的datadiff库有助于打印两个字典之间的差异。

There is an other question in stackoverflow about this argument and i have to admit that there is a simple solution explained: the datadiff library of python helps printing the difference between two dictionaries.


回答 8

这是一种可行的方法,允许将键的值计算为False,并且在可能的情况下仍使用生成器表达式尽早退出。虽然不是特别漂亮。

any(map(lambda x: True, (k for k in b if k not in a)))

编辑:

THC4k发表了对我对另一个答案的评论的回复。这是一种更好,更漂亮的方法来执行上述操作:

any(True for k in b if k not in a)

不知道那怎么没想到…

Here’s a way that will work, allows for keys that evaluate to False, and still uses a generator expression to fall out early if possible. It’s not exceptionally pretty though.

any(map(lambda x: True, (k for k in b if k not in a)))

EDIT:

THC4k posted a reply to my comment on another answer. Here’s a better, prettier way to do the above:

any(True for k in b if k not in a)

Not sure how that never crossed my mind…


回答 9

这是一个古老的问题,要求的问题比我需要的要少,因此,此答案实际上比该问题所要求的要多。这个问题的答案帮助我解决了以下问题:

  1. (要求)记录两个词典之间的差异
  2. 将#1的差异合并到基础词典中
  3. (要求)合并两个字典之间的差异(将第2个字典视为差异字典)
  4. 尝试检测物品的移动和变化
  5. (要求)递归执行所有这些操作

所有这些与JSON相结合,提供了非常强大的配置存储支持。

解决方案(也在github上):

from collections import OrderedDict
from pprint import pprint


class izipDestinationMatching(object):
    __slots__ = ("attr", "value", "index")

    def __init__(self, attr, value, index):
        self.attr, self.value, self.index = attr, value, index

    def __repr__(self):
        return "izip_destination_matching: found match by '%s' = '%s' @ %d" % (self.attr, self.value, self.index)


def izip_destination(a, b, attrs, addMarker=True):
    """
    Returns zipped lists, but final size is equal to b with (if shorter) a padded with nulls
    Additionally also tries to find item reallocations by searching child dicts (if they are dicts) for attribute, listed in attrs)
    When addMarker == False (patching), final size will be the longer of a, b
    """
    for idx, item in enumerate(b):
        try:
            attr = next((x for x in attrs if x in item), None)  # See if the item has any of the ID attributes
            match, matchIdx = next(((orgItm, idx) for idx, orgItm in enumerate(a) if attr in orgItm and orgItm[attr] == item[attr]), (None, None)) if attr else (None, None)
            if match and matchIdx != idx and addMarker: item[izipDestinationMatching] = izipDestinationMatching(attr, item[attr], matchIdx)
        except:
            match = None
        yield (match if match else a[idx] if len(a) > idx else None), item
    if not addMarker and len(a) > len(b):
        for item in a[len(b) - len(a):]:
            yield item, item


def dictdiff(a, b, searchAttrs=[]):
    """
    returns a dictionary which represents difference from a to b
    the return dict is as short as possible:
      equal items are removed
      added / changed items are listed
      removed items are listed with value=None
    Also processes list values where the resulting list size will match that of b.
    It can also search said list items (that are dicts) for identity values to detect changed positions.
      In case such identity value is found, it is kept so that it can be re-found during the merge phase
    @param a: original dict
    @param b: new dict
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictdiff(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs)]
        return b
    res = OrderedDict()
    if izipDestinationMatching in b:
        keepKey = b[izipDestinationMatching].attr
        del b[izipDestinationMatching]
    else:
        keepKey = izipDestinationMatching
    for key in sorted(set(a.keys() + b.keys())):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        if keepKey == key or v1 != v2: res[key] = dictdiff(v1, v2, searchAttrs)
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res


def dictmerge(a, b, searchAttrs=[]):
    """
    Returns a dictionary which merges differences recorded in b to base dictionary a
    Also processes list values where the resulting list size will match that of a
    It can also search said list items (that are dicts) for identity values to detect changed positions
    @param a: original dict
    @param b: diff dict to patch into a
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictmerge(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs, False)]
        return b
    res = OrderedDict()
    for key in sorted(set(a.keys() + b.keys())):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        #print "processing", key, v1, v2, key not in b, dictmerge(v1, v2)
        if v2 is not None: res[key] = dictmerge(v1, v2, searchAttrs)
        elif key not in b: res[key] = v1
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res

This is an old question and asks a little bit less than what I needed so this answer actually solves more than this question asks. The answers in this question helped me solve the following:

  1. (asked) Record differences between two dictionaries
  2. Merge differences from #1 into base dictionary
  3. (asked) Merge differences between two dictionaries (treat dictionary #2 as if it were a diff dictionary)
  4. Try to detect item movements as well as changes
  5. (asked) Do all of this recursively

All this combined with JSON makes for a pretty powerful configuration storage support.

The solution (also on github):

from collections import OrderedDict
from pprint import pprint


class izipDestinationMatching(object):
    __slots__ = ("attr", "value", "index")

    def __init__(self, attr, value, index):
        self.attr, self.value, self.index = attr, value, index

    def __repr__(self):
        return "izip_destination_matching: found match by '%s' = '%s' @ %d" % (self.attr, self.value, self.index)


def izip_destination(a, b, attrs, addMarker=True):
    """
    Returns zipped lists, but final size is equal to b with (if shorter) a padded with nulls
    Additionally also tries to find item reallocations by searching child dicts (if they are dicts) for attribute, listed in attrs)
    When addMarker == False (patching), final size will be the longer of a, b
    """
    for idx, item in enumerate(b):
        try:
            attr = next((x for x in attrs if x in item), None)  # See if the item has any of the ID attributes
            match, matchIdx = next(((orgItm, idx) for idx, orgItm in enumerate(a) if attr in orgItm and orgItm[attr] == item[attr]), (None, None)) if attr else (None, None)
            if match and matchIdx != idx and addMarker: item[izipDestinationMatching] = izipDestinationMatching(attr, item[attr], matchIdx)
        except:
            match = None
        yield (match if match else a[idx] if len(a) > idx else None), item
    if not addMarker and len(a) > len(b):
        for item in a[len(b) - len(a):]:
            yield item, item


def dictdiff(a, b, searchAttrs=[]):
    """
    returns a dictionary which represents difference from a to b
    the return dict is as short as possible:
      equal items are removed
      added / changed items are listed
      removed items are listed with value=None
    Also processes list values where the resulting list size will match that of b.
    It can also search said list items (that are dicts) for identity values to detect changed positions.
      In case such identity value is found, it is kept so that it can be re-found during the merge phase
    @param a: original dict
    @param b: new dict
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictdiff(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs)]
        return b
    res = OrderedDict()
    if izipDestinationMatching in b:
        keepKey = b[izipDestinationMatching].attr
        del b[izipDestinationMatching]
    else:
        keepKey = izipDestinationMatching
    for key in sorted(set(a.keys() + b.keys())):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        if keepKey == key or v1 != v2: res[key] = dictdiff(v1, v2, searchAttrs)
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res


def dictmerge(a, b, searchAttrs=[]):
    """
    Returns a dictionary which merges differences recorded in b to base dictionary a
    Also processes list values where the resulting list size will match that of a
    It can also search said list items (that are dicts) for identity values to detect changed positions
    @param a: original dict
    @param b: diff dict to patch into a
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictmerge(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs, False)]
        return b
    res = OrderedDict()
    for key in sorted(set(a.keys() + b.keys())):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        #print "processing", key, v1, v2, key not in b, dictmerge(v1, v2)
        if v2 is not None: res[key] = dictmerge(v1, v2, searchAttrs)
        elif key not in b: res[key] = v1
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res

回答 10

怎么样标准(比较完整对象)

PyDev->新的PyDev模块->模块:单元测试

import unittest


class Test(unittest.TestCase):


    def testName(self):
        obj1 = {1:1, 2:2}
        obj2 = {1:1, 2:2}
        self.maxDiff = None # sometimes is usefull
        self.assertDictEqual(d1, d2)

if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']

    unittest.main()

what about standart (compare FULL Object)

PyDev->new PyDev Module->Module: unittest

import unittest


class Test(unittest.TestCase):


    def testName(self):
        obj1 = {1:1, 2:2}
        obj2 = {1:1, 2:2}
        self.maxDiff = None # sometimes is usefull
        self.assertDictEqual(d1, d2)

if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']

    unittest.main()

回答 11

如果在Python≥2.7上:

# update different values in dictB
# I would assume only dictA should be updated,
# but the question specifies otherwise

for k in dictA.viewkeys() & dictB.viewkeys():
    if dictA[k] != dictB[k]:
        dictB[k]= dictA[k]

# add missing keys to dictA

dictA.update( (k,dictB[k]) for k in dictB.viewkeys() - dictA.viewkeys() )

If on Python ≥ 2.7:

# update different values in dictB
# I would assume only dictA should be updated,
# but the question specifies otherwise

for k in dictA.viewkeys() & dictB.viewkeys():
    if dictA[k] != dictB[k]:
        dictB[k]= dictA[k]

# add missing keys to dictA

dictA.update( (k,dictB[k]) for k in dictB.viewkeys() - dictA.viewkeys() )

回答 12

这是深度比较两个字典键的解决方案:

def compareDictKeys(dict1, dict2):
  if type(dict1) != dict or type(dict2) != dict:
      return False

  keys1, keys2 = dict1.keys(), dict2.keys()
  diff = set(keys1) - set(keys2) or set(keys2) - set(keys1)

  if not diff:
      for key in keys1:
          if (type(dict1[key]) == dict or type(dict2[key]) == dict) and not compareDictKeys(dict1[key], dict2[key]):
              diff = True
              break

  return not diff

Here is a solution for deep comparing 2 dictionaries keys:

def compareDictKeys(dict1, dict2):
  if type(dict1) != dict or type(dict2) != dict:
      return False

  keys1, keys2 = dict1.keys(), dict2.keys()
  diff = set(keys1) - set(keys2) or set(keys2) - set(keys1)

  if not diff:
      for key in keys1:
          if (type(dict1[key]) == dict or type(dict2[key]) == dict) and not compareDictKeys(dict1[key], dict2[key]):
              diff = True
              break

  return not diff

回答 13

这是一个可以比较两个以上命令的解决方案:

def diff_dict(dicts, default=None):
    diff_dict = {}
    # add 'list()' around 'd.keys()' for python 3 compatibility
    for k in set(sum([d.keys() for d in dicts], [])):
        # we can just use "values = [d.get(k, default) ..." below if 
        # we don't care that d1[k]=default and d2[k]=missing will
        # be treated as equal
        if any(k not in d for d in dicts):
            diff_dict[k] = [d.get(k, default) for d in dicts]
        else:
            values = [d[k] for d in dicts]
            if any(v != values[0] for v in values):
                diff_dict[k] = values
    return diff_dict

用法示例:

import matplotlib.pyplot as plt
diff_dict([plt.rcParams, plt.rcParamsDefault, plt.matplotlib.rcParamsOrig])

here’s a solution that can compare more than two dicts:

def diff_dict(dicts, default=None):
    diff_dict = {}
    # add 'list()' around 'd.keys()' for python 3 compatibility
    for k in set(sum([d.keys() for d in dicts], [])):
        # we can just use "values = [d.get(k, default) ..." below if 
        # we don't care that d1[k]=default and d2[k]=missing will
        # be treated as equal
        if any(k not in d for d in dicts):
            diff_dict[k] = [d.get(k, default) for d in dicts]
        else:
            values = [d[k] for d in dicts]
            if any(v != values[0] for v in values):
                diff_dict[k] = values
    return diff_dict

usage example:

import matplotlib.pyplot as plt
diff_dict([plt.rcParams, plt.rcParamsDefault, plt.matplotlib.rcParamsOrig])

回答 14

我的两个字典之间的对称差异的配方:

def find_dict_diffs(dict1, dict2):
    unequal_keys = []
    unequal_keys.extend(set(dict1.keys()).symmetric_difference(set(dict2.keys())))
    for k in dict1.keys():
        if dict1.get(k, 'N\A') != dict2.get(k, 'N\A'):
            unequal_keys.append(k)
    if unequal_keys:
        print 'param', 'dict1\t', 'dict2'
        for k in set(unequal_keys):
            print str(k)+'\t'+dict1.get(k, 'N\A')+'\t '+dict2.get(k, 'N\A')
    else:
        print 'Dicts are equal'

dict1 = {1:'a', 2:'b', 3:'c', 4:'d', 5:'e'}
dict2 = {1:'b', 2:'a', 3:'c', 4:'d', 6:'f'}

find_dict_diffs(dict1, dict2)

结果是:

param   dict1   dict2
1       a       b
2       b       a
5       e       N\A
6       N\A     f

My recipe of symmetric difference between two dictionaries:

def find_dict_diffs(dict1, dict2):
    unequal_keys = []
    unequal_keys.extend(set(dict1.keys()).symmetric_difference(set(dict2.keys())))
    for k in dict1.keys():
        if dict1.get(k, 'N\A') != dict2.get(k, 'N\A'):
            unequal_keys.append(k)
    if unequal_keys:
        print 'param', 'dict1\t', 'dict2'
        for k in set(unequal_keys):
            print str(k)+'\t'+dict1.get(k, 'N\A')+'\t '+dict2.get(k, 'N\A')
    else:
        print 'Dicts are equal'

dict1 = {1:'a', 2:'b', 3:'c', 4:'d', 5:'e'}
dict2 = {1:'b', 2:'a', 3:'c', 4:'d', 6:'f'}

find_dict_diffs(dict1, dict2)

And result is:

param   dict1   dict2
1       a       b
2       b       a
5       e       N\A
6       N\A     f

回答 15

正如其他答案中提到的那样,unittest可以生成一些不错的输出来比较dict,但是在此示例中,我们不需要先构建整个测试。

废弃unittest源代码,看起来您可以通过以下方式获得公平的解决方案:

import difflib
import pprint

def diff_dicts(a, b):
    if a == b:
        return ''
    return '\n'.join(
        difflib.ndiff(pprint.pformat(a, width=30).splitlines(),
                      pprint.pformat(b, width=30).splitlines())
    )

所以

dictA = dict(zip(range(7), map(ord, 'python')))
dictB = {0: 112, 1: 'spam', 2: [1,2,3], 3: 104, 4: 111}
print diff_dicts(dictA, dictB)

结果是:

{0: 112,
-  1: 121,
-  2: 116,
+  1: 'spam',
+  2: [1, 2, 3],
   3: 104,
-  4: 111,
?        ^

+  4: 111}
?        ^

-  5: 110}

哪里:

  • ‘-‘表示第一/第二个字典中的键/值
  • “ +”表示第二个而不是第一个字典中的键/值

像在单元测试中一样,唯一的警告是由于尾随逗号/括号,最终映射可以被认为是差异。

As mentioned in other answers, unittest produces some nice output for comparing dicts, but in this example we don’t want to have to build a whole test first.

Scraping the unittest source, it looks like you can get a fair solution with just this:

import difflib
import pprint

def diff_dicts(a, b):
    if a == b:
        return ''
    return '\n'.join(
        difflib.ndiff(pprint.pformat(a, width=30).splitlines(),
                      pprint.pformat(b, width=30).splitlines())
    )

so

dictA = dict(zip(range(7), map(ord, 'python')))
dictB = {0: 112, 1: 'spam', 2: [1,2,3], 3: 104, 4: 111}
print diff_dicts(dictA, dictB)

Results in:

{0: 112,
-  1: 121,
-  2: 116,
+  1: 'spam',
+  2: [1, 2, 3],
   3: 104,
-  4: 111,
?        ^

+  4: 111}
?        ^

-  5: 110}

Where:

  • ‘-‘ indicates key/values in the first but not second dict
  • ‘+’ indicates key/values in the second but not the first dict

Like in unittest, the only caveat is that the final mapping can be thought to be a diff, due to the trailing comma/bracket.


回答 16

@Maxx有一个很好的答案,请使用unittestPython提供的工具:

import unittest


class Test(unittest.TestCase):
    def runTest(self):
        pass

    def testDict(self, d1, d2, maxDiff=None):
        self.maxDiff = maxDiff
        self.assertDictEqual(d1, d2)

然后,您可以在代码中的任何位置调用:

try:
    Test().testDict(dict1, dict2)
except Exception, e:
    print e

结果输出看起来像来自的输出diff,用不同的行+或在-每行之前添加漂亮的字典。

@Maxx has an excellent answer, use the unittest tools provided by Python:

import unittest


class Test(unittest.TestCase):
    def runTest(self):
        pass

    def testDict(self, d1, d2, maxDiff=None):
        self.maxDiff = maxDiff
        self.assertDictEqual(d1, d2)

Then, anywhere in your code you can call:

try:
    Test().testDict(dict1, dict2)
except Exception, e:
    print e

The resulting output looks like the output from diff, pretty-printing the dictionaries with + or - prepending each line that is different.


回答 17

不知道它是否仍然有用,但是我遇到了这个问题,我的情况是我只需要返回所有嵌套字典等的变化的字典。找不到合适的解决方案,但是我最终写了一个简单的函数要做到这一点。希望这可以帮助,

Not sure if it is still relevant but I came across this problem, my situation i just needed to return a dictionary of the changes for all nested dictionaries etc etc. Could not find a good solution out there but I did end up writing a simple function to do this. Hope this helps,


回答 18

如果您想使用内置解决方案与任意dict结构进行全面比较,@ Maxx的答案就是一个好的开始。

import unittest

test = unittest.TestCase()
test.assertEqual(dictA, dictB)

If you want a built-in solution for a full comparison with arbitrary dict structures, @Maxx’s answer is a good start.

import unittest

test = unittest.TestCase()
test.assertEqual(dictA, dictB)

回答 19

根据ghostdog74的回答,

dicta = {"a":1,"d":2}
dictb = {"a":5,"d":2}

for value in dicta.values():
    if not value in dictb.values():
        print value

将打印不同的dicta值

Based on ghostdog74’s answer,

dicta = {"a":1,"d":2}
dictb = {"a":5,"d":2}

for value in dicta.values():
    if not value in dictb.values():
        print value

will print differ value of dicta


回答 20

尝试此操作以找到de交集,即两个字典中的键,如果要在第二个字典中找不到键,只需使用not in

intersect = filter(lambda x, dictB=dictB.keys(): x in dictB, dictA.keys())

Try this to find de intersection, the keys that is in both dictionarie, if you want the keys not found on second dictionarie, just use the not in

intersect = filter(lambda x, dictB=dictB.keys(): x in dictB, dictA.keys())

如何在字典理解中使用if / else?

问题:如何在字典理解中使用if / else?

Python 2.7+中是否存在一种类似于以下内容的方法?

{ something_if_true if condition else something_if_false for key, value in dict_.items() }

我知道您只要使用’if’就可以做任何事情:

{ something_if_true for key, value in dict_.items() if condition}

Does there exist a way in Python 2.7+ to make something like the following?

{ something_if_true if condition else something_if_false for key, value in dict_.items() }

I know you can make anything with just ‘if’:

{ something_if_true for key, value in dict_.items() if condition}

回答 0

您已经知道了:A if test else B是有效的Python表达式。所示的dict理解的唯一问题是dict理解中表达式的位置必须有两个表达式,并用冒号分隔:

{ (some_key if condition else default_key):(something_if_true if condition
          else something_if_false) for key, value in dict_.items() }

final if子句充当过滤器,这与具有条件表达式不同。

You’ve already got it: A if test else B is a valid Python expression. The only problem with your dict comprehension as shown is that the place for an expression in a dict comprehension must have two expressions, separated by a colon:

{ (some_key if condition else default_key):(something_if_true if condition
          else something_if_false) for key, value in dict_.items() }

The final if clause acts as a filter, which is different from having the conditional expression.


回答 1

@Marcin的答案涵盖了所有内容,但是如果有人想看一个实际的示例,我在下面添加两个:

假设您有以下集合的字典

d = {'key1': {'a', 'b', 'c'}, 'key2': {'foo', 'bar'}, 'key3': {'so', 'sad'}}

并且您想创建一个新字典,其字典中的键指示值中是否包含字符串,'a'则可以使用

dout = {"a_in_values_of_{}".format(k) if 'a' in v else "a_not_in_values_of_{}".format(k): v for k, v in d.items()}

产生

{'a_in_values_of_key1': {'a', 'b', 'c'},
 'a_not_in_values_of_key2': {'bar', 'foo'},
 'a_not_in_values_of_key3': {'sad', 'so'}}

现在假设您有两个这样的字典

d1 = {'bad_key1': {'a', 'b', 'c'}, 'bad_key2': {'foo', 'bar'}, 'bad_key3': {'so', 'sad'}}
d2 = {'good_key1': {'foo', 'bar', 'xyz'}, 'good_key2': {'a', 'b', 'c'}}

如果您要替换的键中d1d2值相同,则可以

# here we assume that the values in d2 are unique
# Python 2
dout2 = {d2.keys()[d2.values().index(v1)] if v1 in d2.values() else k1: v1 for k1, v1 in d1.items()}

# Python 3
dout2 = {list(d2.keys())[list(d2.values()).index(v1)] if v1 in d2.values() else k1: v1 for k1, v1 in d1.items()}

这使

{'bad_key2': {'bar', 'foo'},
 'bad_key3': {'sad', 'so'},
 'good_key2': {'a', 'b', 'c'}}

@Marcin’s answer covers it all, but just in case someone wants to see an actual example, I add two below:

Let’s say you have the following dictionary of sets

d = {'key1': {'a', 'b', 'c'}, 'key2': {'foo', 'bar'}, 'key3': {'so', 'sad'}}

and you want to create a new dictionary whose keys indicate whether the string 'a' is contained in the values or not, you can use

dout = {"a_in_values_of_{}".format(k) if 'a' in v else "a_not_in_values_of_{}".format(k): v for k, v in d.items()}

which yields

{'a_in_values_of_key1': {'a', 'b', 'c'},
 'a_not_in_values_of_key2': {'bar', 'foo'},
 'a_not_in_values_of_key3': {'sad', 'so'}}

Now let’s suppose you have two dictionaries like this

d1 = {'bad_key1': {'a', 'b', 'c'}, 'bad_key2': {'foo', 'bar'}, 'bad_key3': {'so', 'sad'}}
d2 = {'good_key1': {'foo', 'bar', 'xyz'}, 'good_key2': {'a', 'b', 'c'}}

and you want to replace the keys in d1 by the keys of d2 if there respective values are identical, you could do

# here we assume that the values in d2 are unique
# Python 2
dout2 = {d2.keys()[d2.values().index(v1)] if v1 in d2.values() else k1: v1 for k1, v1 in d1.items()}

# Python 3
dout2 = {list(d2.keys())[list(d2.values()).index(v1)] if v1 in d2.values() else k1: v1 for k1, v1 in d1.items()}

which gives

{'bad_key2': {'bar', 'foo'},
 'bad_key3': {'sad', 'so'},
 'good_key2': {'a', 'b', 'c'}}

回答 2

如果您有不同的条件来评估键和值,@ Marcin的答案就是解决方法。

如果键和值的条件相同,则最好在生成器表达式中构建(键,值)元组并馈入dict()

dict((modify_k(k), modify_v(v)) if condition else (k, v) for k, v in dct.items())

它更易于阅读,并且每个键值仅对条件进行一次评估。

借用@Cleb的集合字典的示例:

d = {'key1': {'a', 'b', 'c'}, 'key2': {'foo', 'bar'}, 'key3': {'so', 'sad'}}

假设您只想在其后缀keys,并且在这种情况下,您希望将其替换为集合的长度。否则,键值对应保持不变。avaluevalue

dict((f"{k}_a", len(v)) if "a" in v else (k, v) for k, v in d.items())
# {'key1_a': 3, 'key2': {'bar', 'foo'}, 'key3': {'sad', 'so'}}

In case you have different conditions to evaluate for keys and values, @Marcin’s answer is the way to go.

If you have the same condition for keys and values, you’re better off with building (key, value)-tuples in a generator-expression feeding into dict():

dict((modify_k(k), modify_v(v)) if condition else (k, v) for k, v in dct.items())

It’s easier to read and the condition is only evaluated once per key, value.

Example with borrowing @Cleb’s dictionary of sets:

d = {'key1': {'a', 'b', 'c'}, 'key2': {'foo', 'bar'}, 'key3': {'so', 'sad'}}

Assume you want to suffix only keys with a in its value and you want the value replaced with the length of the set in such a case. Otherwise, the key-value pair should stay unchanged.

dict((f"{k}_a", len(v)) if "a" in v else (k, v) for k, v in d.items())
# {'key1_a': 3, 'key2': {'bar', 'foo'}, 'key3': {'sad', 'so'}}

回答 3

在字典理解中使用if / else的另一个示例

我正在为自己的办公室工作在数据输入桌面应用程序上,这种数据输入应用程序通常会从输入小部件中获取所有条目并将其转储到字典中,以进行进一步的处理,例如验证或编辑,我们必须将其返回选择的数据从文件返回到条目小部件等

第一轮使用传统编码(8行):

entries = {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}

a_dic, b_dic = {}, {}

for field, value in entries.items():
    if field == 'ther':
        for k,v in value.items():
            b_dic[k] = v
        a_dic[field] = b_dic
    else:
        a_dic[field] = value
    
print(a_dic)
 {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}”

第二轮我尝试使用字典理解,但是循环仍然存在(6行):

entries = {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}

for field, value in entries.items():
    if field == 'ther':
        b_dic = {k:v for k,v in value.items()}
        a_dic[field] = b_dic
    else:
        a_dic[field] = value
    
print(a_dic)
 {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}”

最后,使用单行字典理解语句(1行):

entries = {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}

a_dic = {field:{k:v for k,v in value.items()} if field == 'ther' 
        else value for field, value in entries.items()}
    
print(a_dic)
 {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}”

我使用python 3.8.3

Another example in using if/else in dictionary comprehension

I am working on data-entry desktop application for my own office work, and it is common for such data-entry application to get all entries from input widget and dump it into a dictionary for further processing like validation, or editing which we must return selected data from file back to entry widgets, etc.

The first round using traditional coding (8 lines):

entries = {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}

a_dic, b_dic = {}, {}

for field, value in entries.items():
    if field == 'ther':
        for k,v in value.items():
            b_dic[k] = v
        a_dic[field] = b_dic
    else:
        a_dic[field] = value
    
print(a_dic)
“ {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}”

Second round I tried to use dictionary comprehension but the loop still there (6 lines):

entries = {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}

for field, value in entries.items():
    if field == 'ther':
        b_dic = {k:v for k,v in value.items()}
        a_dic[field] = b_dic
    else:
        a_dic[field] = value
    
print(a_dic)
“ {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}”

Finally, with a one-line dictionary comprehension statement (1 line):

entries = {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}

a_dic = {field:{k:v for k,v in value.items()} if field == 'ther' 
        else value for field, value in entries.items()}
    
print(a_dic)
“ {'name': 'Material Name', 'maxt': 'Max Working Temperature', 'ther': {100: 1.1, 200: 1.2}}”

I use python 3.8.3


创建两个熊猫数据框列的字典的最有效方法是什么?

问题:创建两个熊猫数据框列的字典的最有效方法是什么?

组织以下熊猫数据框的最有效方法是什么:

数据=

Position    Letter
1           a
2           b
3           c
4           d
5           e

变成字典一样alphabet[1 : 'a', 2 : 'b', 3 : 'c', 4 : 'd', 5 : 'e']

What is the most efficient way to organise the following pandas Dataframe:

data =

Position    Letter
1           a
2           b
3           c
4           d
5           e

into a dictionary like alphabet[1 : 'a', 2 : 'b', 3 : 'c', 4 : 'd', 5 : 'e']?


回答 0

In [9]: pd.Series(df.Letter.values,index=df.Position).to_dict()
Out[9]: {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}

速度比较(使用Wouter方法)

In [6]: df = pd.DataFrame(randint(0,10,10000).reshape(5000,2),columns=list('AB'))

In [7]: %timeit dict(zip(df.A,df.B))
1000 loops, best of 3: 1.27 ms per loop

In [8]: %timeit pd.Series(df.A.values,index=df.B).to_dict()
1000 loops, best of 3: 987 us per loop
In [9]: pd.Series(df.Letter.values,index=df.Position).to_dict()
Out[9]: {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}

Speed comparion (using Wouter’s method)

In [6]: df = pd.DataFrame(randint(0,10,10000).reshape(5000,2),columns=list('AB'))

In [7]: %timeit dict(zip(df.A,df.B))
1000 loops, best of 3: 1.27 ms per loop

In [8]: %timeit pd.Series(df.A.values,index=df.B).to_dict()
1000 loops, best of 3: 987 us per loop

回答 1

我找到了解决问题的更快方法,至少在使用以下方法的大型数据集上: df.set_index(KEY).to_dict()[VALUE]

50,000行的证明:

df = pd.DataFrame(np.random.randint(32, 120, 100000).reshape(50000,2),columns=list('AB'))
df['A'] = df['A'].apply(chr)

%timeit dict(zip(df.A,df.B))
%timeit pd.Series(df.A.values,index=df.B).to_dict()
%timeit df.set_index('A').to_dict()['B']

输出:

100 loops, best of 3: 7.04 ms per loop  # WouterOvermeire
100 loops, best of 3: 9.83 ms per loop  # Jeff
100 loops, best of 3: 4.28 ms per loop  # Kikohs (me)

I found a faster way to solve the problem, at least on realistically large datasets using: df.set_index(KEY).to_dict()[VALUE]

Proof on 50,000 rows:

df = pd.DataFrame(np.random.randint(32, 120, 100000).reshape(50000,2),columns=list('AB'))
df['A'] = df['A'].apply(chr)

%timeit dict(zip(df.A,df.B))
%timeit pd.Series(df.A.values,index=df.B).to_dict()
%timeit df.set_index('A').to_dict()['B']

Output:

100 loops, best of 3: 7.04 ms per loop  # WouterOvermeire
100 loops, best of 3: 9.83 ms per loop  # Jeff
100 loops, best of 3: 4.28 ms per loop  # Kikohs (me)

回答 2

在Python 3.6中,最快的方法仍然是WouterOvermeire。Kikohs的提议比其他两个方案要慢。

import timeit

setup = '''
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(32, 120, 100000).reshape(50000,2),columns=list('AB'))
df['A'] = df['A'].apply(chr)
'''

timeit.Timer('dict(zip(df.A,df.B))', setup=setup).repeat(7,500)
timeit.Timer('pd.Series(df.A.values,index=df.B).to_dict()', setup=setup).repeat(7,500)
timeit.Timer('df.set_index("A").to_dict()["B"]', setup=setup).repeat(7,500)

结果:

1.1214002349999777 s  # WouterOvermeire
1.1922008498571748 s  # Jeff
1.7034366211428602 s  # Kikohs

In Python 3.6 the fastest way is still the WouterOvermeire one. Kikohs’ proposal is slower than the other two options.

import timeit

setup = '''
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(32, 120, 100000).reshape(50000,2),columns=list('AB'))
df['A'] = df['A'].apply(chr)
'''

timeit.Timer('dict(zip(df.A,df.B))', setup=setup).repeat(7,500)
timeit.Timer('pd.Series(df.A.values,index=df.B).to_dict()', setup=setup).repeat(7,500)
timeit.Timer('df.set_index("A").to_dict()["B"]', setup=setup).repeat(7,500)

Results:

1.1214002349999777 s  # WouterOvermeire
1.1922008498571748 s  # Jeff
1.7034366211428602 s  # Kikohs

回答 3

TL; DR

>>> import pandas as pd
>>> df = pd.DataFrame({'Position':[1,2,3,4,5], 'Letter':['a', 'b', 'c', 'd', 'e']})
>>> dict(sorted(df.values.tolist())) # Sort of sorted... 
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
>>> from collections import OrderedDict
>>> OrderedDict(df.values.tolist())
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)])

在长

解决方案说明: dict(sorted(df.values.tolist()))

鉴于:

df = pd.DataFrame({'Position':[1,2,3,4,5], 'Letter':['a', 'b', 'c', 'd', 'e']})

[出]:

 Letter Position
0   a   1
1   b   2
2   c   3
3   d   4
4   e   5

尝试:

# Get the values out to a 2-D numpy array, 
df.values

[出]:

array([['a', 1],
       ['b', 2],
       ['c', 3],
       ['d', 4],
       ['e', 5]], dtype=object)

然后(可选):

# Dump it into a list so that you can sort it using `sorted()`
sorted(df.values.tolist()) # Sort by key

要么:

# Sort by value:
from operator import itemgetter
sorted(df.values.tolist(), key=itemgetter(1))

[出]:

[['a', 1], ['b', 2], ['c', 3], ['d', 4], ['e', 5]]

最后,将2个元素的列表转换成字典。

dict(sorted(df.values.tolist())) 

[出]:

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

有关

回答@sbradbio评论:

如果一个特定的键有多个值,而您想保留所有值,那么这不是最有效,但最直观的方法是:

from collections import defaultdict
import pandas as pd

multivalue_dict = defaultdict(list)

df = pd.DataFrame({'Position':[1,2,4,4,4], 'Letter':['a', 'b', 'd', 'e', 'f']})

for idx,row in df.iterrows():
    multivalue_dict[row['Position']].append(row['Letter'])

[出]:

>>> print(multivalue_dict)
defaultdict(list, {1: ['a'], 2: ['b'], 4: ['d', 'e', 'f']})

TL;DR

>>> import pandas as pd
>>> df = pd.DataFrame({'Position':[1,2,3,4,5], 'Letter':['a', 'b', 'c', 'd', 'e']})
>>> dict(sorted(df.values.tolist())) # Sort of sorted... 
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
>>> from collections import OrderedDict
>>> OrderedDict(df.values.tolist())
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)])

In Long

Explaining solution: dict(sorted(df.values.tolist()))

Given:

df = pd.DataFrame({'Position':[1,2,3,4,5], 'Letter':['a', 'b', 'c', 'd', 'e']})

[out]:

 Letter Position
0   a   1
1   b   2
2   c   3
3   d   4
4   e   5

Try:

# Get the values out to a 2-D numpy array, 
df.values

[out]:

array([['a', 1],
       ['b', 2],
       ['c', 3],
       ['d', 4],
       ['e', 5]], dtype=object)

Then optionally:

# Dump it into a list so that you can sort it using `sorted()`
sorted(df.values.tolist()) # Sort by key

Or:

# Sort by value:
from operator import itemgetter
sorted(df.values.tolist(), key=itemgetter(1))

[out]:

[['a', 1], ['b', 2], ['c', 3], ['d', 4], ['e', 5]]

Lastly, cast the list of list of 2 elements into a dict.

dict(sorted(df.values.tolist())) 

[out]:

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

Related

Answering @sbradbio comment:

If there are multiple values for a specific key and you would like to keep all of them, it’s the not the most efficient but the most intuitive way is:

from collections import defaultdict
import pandas as pd

multivalue_dict = defaultdict(list)

df = pd.DataFrame({'Position':[1,2,4,4,4], 'Letter':['a', 'b', 'd', 'e', 'f']})

for idx,row in df.iterrows():
    multivalue_dict[row['Position']].append(row['Letter'])

[out]:

>>> print(multivalue_dict)
defaultdict(list, {1: ['a'], 2: ['b'], 4: ['d', 'e', 'f']})

迭代对应于Python中列表的字典键值

问题:迭代对应于Python中列表的字典键值

使用Python 2.7。我有一本字典,其中以球队名称为关键,对每支球队得分并允许的奔跑次数作为值列表:

NL_East = {'Phillies': [645, 469], 'Braves': [599, 548], 'Mets': [653, 672]}

我希望能够将字典提供给函数并遍历每个团队(键)。

这是我正在使用的代码。现在,我只能逐队参加。我将如何遍历每个团队并为每个团队打印预期的win_percentage?

def Pythag(league):
    runs_scored = float(league['Phillies'][0])
    runs_allowed = float(league['Phillies'][1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print win_percentage

谢谢你的帮助。

Working in Python 2.7. I have a dictionary with team names as the keys and the amount of runs scored and allowed for each team as the value list:

NL_East = {'Phillies': [645, 469], 'Braves': [599, 548], 'Mets': [653, 672]}

I would like to be able to feed the dictionary into a function and iterate over each team (the keys).

Here’s the code I’m using. Right now, I can only go team by team. How would I iterate over each team and print the expected win_percentage for each team?

def Pythag(league):
    runs_scored = float(league['Phillies'][0])
    runs_allowed = float(league['Phillies'][1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print win_percentage

Thanks for any help.


回答 0

您有几种选择可以遍历字典。

如果迭代字典本身(for team in league),则将迭代字典的键。当使用for循环进行循环时,无论您是在dict(league)本身上循环还是在以下情况下,行为都是相同的league.keys()

for team in league.keys():
    runs_scored, runs_allowed = map(float, league[team])

您还可以通过迭代遍历键和值一次league.items()

for team, runs in league.items():
    runs_scored, runs_allowed = map(float, runs)

您甚至可以在迭代时执行元组拆包:

for team, (runs_scored, runs_allowed) in league.items():
    runs_scored = float(runs_scored)
    runs_allowed = float(runs_allowed)

You have several options for iterating over a dictionary.

If you iterate over the dictionary itself (for team in league), you will be iterating over the keys of the dictionary. When looping with a for loop, the behavior will be the same whether you loop over the dict (league) itself, or league.keys():

for team in league.keys():
    runs_scored, runs_allowed = map(float, league[team])

You can also iterate over both the keys and the values at once by iterating over league.items():

for team, runs in league.items():
    runs_scored, runs_allowed = map(float, runs)

You can even perform your tuple unpacking while iterating:

for team, (runs_scored, runs_allowed) in league.items():
    runs_scored = float(runs_scored)
    runs_allowed = float(runs_allowed)

回答 1

您也可以很容易地遍历字典:

for team, scores in NL_East.iteritems():
    runs_scored = float(scores[0])
    runs_allowed = float(scores[1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print '%s: %.1f%%' % (team, win_percentage)

You can very easily iterate over dictionaries, too:

for team, scores in NL_East.iteritems():
    runs_scored = float(scores[0])
    runs_allowed = float(scores[1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print '%s: %.1f%%' % (team, win_percentage)

回答 2

字典具有一个称为的内置函数iterkeys()

尝试:

for team in league.iterkeys():
    runs_scored = float(league[team][0])
    runs_allowed = float(league[team][1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print win_percentage

Dictionaries have a built in function called iterkeys().

Try:

for team in league.iterkeys():
    runs_scored = float(league[team][0])
    runs_allowed = float(league[team][1])
    win_percentage = round((runs_scored**2)/((runs_scored**2)+(runs_allowed**2))*1000)
    print win_percentage

回答 3

字典对象允许您迭代其项目。此外,通过模式匹配和__future__可以使事情稍微简化。

最后,您可以将逻辑从打印中分离出来,以使事情在以后的重构/调试中更加容易。

from __future__ import division

def Pythag(league):
    def win_percentages():
        for team, (runs_scored, runs_allowed) in league.iteritems():
            win_percentage = round((runs_scored**2) / ((runs_scored**2)+(runs_allowed**2))*1000)
            yield win_percentage

    for win_percentage in win_percentages():
        print win_percentage

Dictionary objects allow you to iterate over their items. Also, with pattern matching and the division from __future__ you can do simplify things a bit.

Finally, you can separate your logic from your printing to make things a bit easier to refactor/debug later.

from __future__ import division

def Pythag(league):
    def win_percentages():
        for team, (runs_scored, runs_allowed) in league.iteritems():
            win_percentage = round((runs_scored**2) / ((runs_scored**2)+(runs_allowed**2))*1000)
            yield win_percentage

    for win_percentage in win_percentages():
        print win_percentage

回答 4

列表理解可以缩短内容…

win_percentages = [m**2.0 / (m**2.0 + n**2.0) * 100 for m, n in [a[i] for i in NL_East]]

List comprehension can shorten things…

win_percentages = [m**2.0 / (m**2.0 + n**2.0) * 100 for m, n in [a[i] for i in NL_East]]

在Python中遍历字典时,为什么必须调用.items()?

问题:在Python中遍历字典时,为什么必须调用.items()?

为什么需要调用items()以遍历字典中的键,值对?即。

dic = {'one': '1', 'two': '2'}
for k, v in dic.items():
    print(k, v)

为什么不是在字典上进行迭代的默认行为

for k, v in dic:
    print(k, v)

Why do you have to call items() to iterate over key, value pairs in a dictionary? ie.

dic = {'one': '1', 'two': '2'}
for k, v in dic.items():
    print(k, v)

Why isn’t that the default behavior of iterating over a dictionary

for k, v in dic:
    print(k, v)

回答 0

对于每个python容器C,期望是

for item in C:
    assert item in C

会顺利通过- 如果一种感觉(循环子句)与另一种感觉(存在检查)完全不同,会不会感到惊讶in?我一定会的!它自然适用于列表,集合,元组,…

因此,当C是一个字典时,如果infor循环生成键/值元组,那么,根据最小惊讶的原理,in还必须将其元组作为其包含检查中的左操作数。

那会有用吗?好看不中用的确,基本上做if (key, value) in C的代名词if C.get(key) == value-这是一张支票,我相信我可能已经执行,或要执行,100倍以上的很少比if k in C实际手段,检查钥匙的存在,完全无视值。

另一方面,只在键上循环很常见,例如:

for k in thedict:
    thedict[k] += 1

拥有价值也无济于事:

for k, v in thedict.items():
    thedict[k] = v + 1

实际上有点不太清晰和简洁。(请注意,这items是用于获取键/值对的“正确”方法的原始拼写:不幸的是,这是在此类访问器返回整个列表的时代,因此,为了支持“公正迭代”,必须引入替代拼写,并且iteritems 在-Python 3中,与以前的Python版本的向后兼容性约束被大大削弱,后来items又变成了)。

For every python container C, the expectation is that

for item in C:
    assert item in C

will pass just fine — wouldn’t you find it astonishing if one sense of in (the loop clause) had a completely different meaning from the other (the presence check)? I sure would! It naturally works that way for lists, sets, tuples, …

So, when C is a dictionary, if in were to yield key/value tuples in a for loop, then, by the principle of least astonishment, in would also have to take such a tuple as its left-hand operand in the containment check.

How useful would that be? Pretty useless indeed, basically making if (key, value) in C a synonym for if C.get(key) == value — which is a check I believe I may have performed, or wanted to perform, 100 times more rarely than what if k in C actually means, checking the presence of the key only and completely ignoring the value.

On the other hand, wanting to loop just on keys is quite common, e.g.:

for k in thedict:
    thedict[k] += 1

having the value as well would not help particularly:

for k, v in thedict.items():
    thedict[k] = v + 1

actually somewhat less clear and less concise. (Note that items was the original spelling of the “proper” methods to use to get key/value pairs: unfortunately that was back in the days when such accessors returned whole lists, so to support “just iterating” an alternative spelling had to be introduced, and iteritems it was — in Python 3, where backwards compatibility constraints with previous Python versions were much weakened, it became items again).


回答 1

我的猜测:使用完整的元组进行循环会更直观,但使用进行成员资格测试可能会更不那么直观in

if key in counts:
    counts[key] += 1
else:
    counts[key] = 1

如果您必须同时指定key和value,那么该代码将无法正常工作in。我很难想象用例,您将检查键和值是否都在字典中。仅测试密钥更为自然。

# When would you ever write a condition like this?
if (key, value) in dict:

现在,in操作员和for ... in操作员不必对相同的项目进行操作。在实现方面,它们是不同的操作(__contains__vs. __iter__)。但是,这种微小的不一致会造成一些混乱,而且不一致。

My guess: Using the full tuple would be more intuitive for looping, but perhaps less so for testing for membership using in.

if key in counts:
    counts[key] += 1
else:
    counts[key] = 1

That code wouldn’t really work if you had to specify both key and value for in. I am having a hard time imagining use case where you’d check if both the key AND value are in the dictionary. It is far more natural to only test the keys.

# When would you ever write a condition like this?
if (key, value) in dict:

Now it’s not necessary that the in operator and for ... in operate over the same items. Implementation-wise they are different operations (__contains__ vs. __iter__). But that little inconsistency would be somewhat confusing and, well, inconsistent.


在Python3中按索引访问dict_keys元素

问题:在Python3中按索引访问dict_keys元素

我正在尝试通过其索引访问dict_key的元素:

test = {'foo': 'bar', 'hello': 'world'}
keys = test.keys()  # dict_keys object

keys.index(0)
AttributeError: 'dict_keys' object has no attribute 'index'

我想得到foo

与:

keys[0]
TypeError: 'dict_keys' object does not support indexing

我怎样才能做到这一点?

I’m trying to access a dict_key’s element by its index:

test = {'foo': 'bar', 'hello': 'world'}
keys = test.keys()  # dict_keys object

keys.index(0)
AttributeError: 'dict_keys' object has no attribute 'index'

I want to get foo.

same with:

keys[0]
TypeError: 'dict_keys' object does not support indexing

How can I do this?


回答 0

list()而是调用字典:

keys = list(test)

在Python 3中,该dict.keys()方法返回一个字典视图对象,它作为一个集合。直接迭代字典也会产生键,因此将字典转换为列表会得到所有键的列表:

>>> test = {'foo': 'bar', 'hello': 'world'}
>>> list(test)
['foo', 'hello']
>>> list(test)[0]
'foo'

Call list() on the dictionary instead:

keys = list(test)

In Python 3, the dict.keys() method returns a dictionary view object, which acts as a set. Iterating over the dictionary directly also yields keys, so turning a dictionary into a list results in a list of all the keys:

>>> test = {'foo': 'bar', 'hello': 'world'}
>>> list(test)
['foo', 'hello']
>>> list(test)[0]
'foo'

回答 1

不是完整的答案,但可能是有用的提示。如果它确实是您想要的第一项*,那么

next(iter(q))

比快得多

list(q)[0]

对于大词典,因为不必将整个内容存储在内存中。

对于10.000.000物品,我发现它快了将近40.000倍。

*如果dict只是Python 3.6之前的伪随机项目,则第一项(此后它在标准实现中已订购,尽管不建议依赖它)。

Not a full answer but perhaps a useful hint. If it is really the first item you want*, then

next(iter(q))

is much faster than

list(q)[0]

for large dicts, since the whole thing doesn’t have to be stored in memory.

For 10.000.000 items I found it to be almost 40.000 times faster.

*The first item in case of a dict being just a pseudo-random item before Python 3.6 (after that it’s ordered in the standard implementation, although it’s not advised to rely on it).


回答 2

我想要第一个字典项的“键”和“值”对。我用下面的代码。

 key, val = next(iter(my_dict.items()))

I wanted “key” & “value” pair of a first dictionary item. I used the following code.

 key, val = next(iter(my_dict.items()))

回答 3

test = {'foo': 'bar', 'hello': 'world'}
ls = []
for key in test.keys():
    ls.append(key)
print(ls[0])

将键附加到静态定义的列表然后对其进行索引的常规方式

test = {'foo': 'bar', 'hello': 'world'}
ls = []
for key in test.keys():
    ls.append(key)
print(ls[0])

Conventional way of appending the keys to a statically defined list and then indexing it for same


回答 4

在许多情况下,这可能是XY问题。为什么要按位置索引字典键?您真的需要吗?直到最近,字典甚至还没有在Python中排序,因此访问第一个元素是任意的。

我刚刚将一些Python 2代码翻译为Python 3:

keys = d.keys()
for (i, res) in enumerate(some_list):
    k = keys[i]
    # ...

这不是很漂亮,但也不是很糟糕。起初,我正要用可怕的东西代替它

    k = next(itertools.islice(iter(keys), i, None))

在我意识到这一切写成更好之前

for (k, res) in zip(d.keys(), some_list):

效果很好。

我相信在许多其他情况下,可以避免按位置索引字典关键字。尽管字典在Python 3.7中是有序的,但是依靠它并不是很漂亮。上面的代码仅起作用,因为的内容some_list是最近从的内容中产生的d

如果您确实需要disk_keys按索引访问元素,请仔细看一下代码。也许您不需要。

In many cases, this may be an XY Problem. Why are you indexing your dictionary keys by position? Do you really need to? Until recently, dictionaries were not even ordered in Python, so accessing the first element was arbitrary.

I just translated some Python 2 code to Python 3:

keys = d.keys()
for (i, res) in enumerate(some_list):
    k = keys[i]
    # ...

which is not pretty, but not very bad either. At first, I was about to replace it by the monstrous

    k = next(itertools.islice(iter(keys), i, None))

before I realised this is all much better written as

for (k, res) in zip(d.keys(), some_list):

which works just fine.

I believe that in many other cases, indexing dictionary keys by position can be avoided. Although dictionaries are ordered in Python 3.7, relying on that is not pretty. The code above only works because the contents of some_list had been recently produced from the contents of d.

Have a hard look at your code if you really need to access a disk_keys element by index. Perhaps you don’t need to.


回答 5

试试这个

keys = [next(iter(x.keys())) for x in test]
print(list(keys))

结果看起来像这样。[‘foo’,’hello’]

您可以在此处找到更多可能的解决方案。

Try this

keys = [next(iter(x.keys())) for x in test]
print(list(keys))

The result looks like this. [‘foo’, ‘hello’]

You can find more possible solutions here.


如何合并字典的字典?

问题:如何合并字典的字典?

我需要合并多个词典,例如:

dict1 = {1:{"a":{A}}, 2:{"b":{B}}}

dict2 = {2:{"c":{C}}, 3:{"d":{D}}

随着A B CD作为树的叶子像{"info1":"value", "info2":"value2"}

词典的级别(深度)未知,可能是 {2:{"c":{"z":{"y":{C}}}}}

在我的情况下,它表示目录/文件结构,其中节点为docs,而节点为文件。

我想将它们合并以获得:

 dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}

我不确定如何使用Python轻松做到这一点。

I need to merge multiple dictionaries, here’s what I have for instance:

dict1 = {1:{"a":{A}}, 2:{"b":{B}}}

dict2 = {2:{"c":{C}}, 3:{"d":{D}}

With A B C and D being leaves of the tree, like {"info1":"value", "info2":"value2"}

There is an unknown level(depth) of dictionaries, it could be {2:{"c":{"z":{"y":{C}}}}}

In my case it represents a directory/files structure with nodes being docs and leaves being files.

I want to merge them to obtain:

 dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}

I’m not sure how I could do that easily with Python.


回答 0

这实际上是非常棘手的-特别是如果您希望在事物不一致时收到有用的错误消息,同时正确地接受重复但一致的条目(这里没有其他答案了……)。

假设您没有大量条目,那么递归函数最简单:

def merge(a, b, path=None):
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

# works
print(merge({1:{"a":"A"},2:{"b":"B"}}, {2:{"c":"C"},3:{"d":"D"}}))
# has conflict
merge({1:{"a":"A"},2:{"b":"B"}}, {1:{"a":"A"},2:{"b":"C"}})

请注意,这会发生变化a-将的内容b添加到a(也会返回)。如果您想保留a,可以这样称呼merge(dict(a), b)

agf指出(如下),您可能有两个以上的命令,在这种情况下,您可以使用:

reduce(merge, [dict1, dict2, dict3...])

将所有内容添加到dict1中。

[注意-我编辑了最初的答案以使第一个参数发生变化;使“减少”更易于解释]

python 3中的ps,您还需要 from functools import reduce

this is actually quite tricky – particularly if you want a useful error message when things are inconsistent, while correctly accepting duplicate but consistent entries (something no other answer here does….)

assuming you don’t have huge numbers of entries a recursive function is easiest:

def merge(a, b, path=None):
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

# works
print(merge({1:{"a":"A"},2:{"b":"B"}}, {2:{"c":"C"},3:{"d":"D"}}))
# has conflict
merge({1:{"a":"A"},2:{"b":"B"}}, {1:{"a":"A"},2:{"b":"C"}})

note that this mutates a – the contents of b are added to a (which is also returned). if you want to keep a you could call it like merge(dict(a), b).

agf pointed out (below) that you may have more than two dicts, in which case you can use:

reduce(merge, [dict1, dict2, dict3...])

where everything will be added to dict1.

[note – i edited my initial answer to mutate the first argument; that makes the “reduce” easier to explain]

ps in python 3, you will also need from functools import reduce


回答 1

这是使用生成器实现的一种简单方法:

def mergedicts(dict1, dict2):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
                yield (k, dict(mergedicts(dict1[k], dict2[k])))
            else:
                # If one of the values is not a dict, you can't continue merging it.
                # Value from second dict overrides one in first and we move on.
                yield (k, dict2[k])
                # Alternatively, replace this with exception raiser to alert you of value conflicts
        elif k in dict1:
            yield (k, dict1[k])
        else:
            yield (k, dict2[k])

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}

print dict(mergedicts(dict1,dict2))

打印:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

Here’s an easy way to do it using generators:

def mergedicts(dict1, dict2):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
                yield (k, dict(mergedicts(dict1[k], dict2[k])))
            else:
                # If one of the values is not a dict, you can't continue merging it.
                # Value from second dict overrides one in first and we move on.
                yield (k, dict2[k])
                # Alternatively, replace this with exception raiser to alert you of value conflicts
        elif k in dict1:
            yield (k, dict1[k])
        else:
            yield (k, dict2[k])

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}

print dict(mergedicts(dict1,dict2))

This prints:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

回答 2

这个问题的一个问题是,字典的值可以是任意复杂的数据。基于这些和其他答案,我想到了以下代码:

class YamlReaderError(Exception):
    pass

def data_merge(a, b):
    """merges b into a and return merged result

    NOTE: tuples and arbitrary objects are not handled as it is totally ambiguous what should happen"""
    key = None
    # ## debug output
    # sys.stderr.write("DEBUG: %s to %s\n" %(b,a))
    try:
        if a is None or isinstance(a, str) or isinstance(a, unicode) or isinstance(a, int) or isinstance(a, long) or isinstance(a, float):
            # border case for first run or if a is a primitive
            a = b
        elif isinstance(a, list):
            # lists can be only appended
            if isinstance(b, list):
                # merge lists
                a.extend(b)
            else:
                # append to list
                a.append(b)
        elif isinstance(a, dict):
            # dicts must be merged
            if isinstance(b, dict):
                for key in b:
                    if key in a:
                        a[key] = data_merge(a[key], b[key])
                    else:
                        a[key] = b[key]
            else:
                raise YamlReaderError('Cannot merge non-dict "%s" into dict "%s"' % (b, a))
        else:
            raise YamlReaderError('NOT IMPLEMENTED "%s" into "%s"' % (b, a))
    except TypeError, e:
        raise YamlReaderError('TypeError "%s" in key "%s" when merging "%s" into "%s"' % (e, key, b, a))
    return a

我的用例是合并YAML文件,我只需要处理可能的数据类型的子集。因此,我可以忽略元组和其他对象。对我而言,明智的合并逻辑意味着

  • 更换标量
  • 追加清单
  • 通过添加缺少的键并更新现有键来合并字典

其他所有和不可预见的结果都会导致错误。

One issue with this question is that the values of the dict can be arbitrarily complex pieces of data. Based upon these and other answers I came up with this code:

class YamlReaderError(Exception):
    pass

def data_merge(a, b):
    """merges b into a and return merged result

    NOTE: tuples and arbitrary objects are not handled as it is totally ambiguous what should happen"""
    key = None
    # ## debug output
    # sys.stderr.write("DEBUG: %s to %s\n" %(b,a))
    try:
        if a is None or isinstance(a, str) or isinstance(a, unicode) or isinstance(a, int) or isinstance(a, long) or isinstance(a, float):
            # border case for first run or if a is a primitive
            a = b
        elif isinstance(a, list):
            # lists can be only appended
            if isinstance(b, list):
                # merge lists
                a.extend(b)
            else:
                # append to list
                a.append(b)
        elif isinstance(a, dict):
            # dicts must be merged
            if isinstance(b, dict):
                for key in b:
                    if key in a:
                        a[key] = data_merge(a[key], b[key])
                    else:
                        a[key] = b[key]
            else:
                raise YamlReaderError('Cannot merge non-dict "%s" into dict "%s"' % (b, a))
        else:
            raise YamlReaderError('NOT IMPLEMENTED "%s" into "%s"' % (b, a))
    except TypeError, e:
        raise YamlReaderError('TypeError "%s" in key "%s" when merging "%s" into "%s"' % (e, key, b, a))
    return a

My use case is merging YAML files where I only have to deal with a subset of possible data types. Hence I can ignore tuples and other objects. For me a sensible merge logic means

  • replace scalars
  • append lists
  • merge dicts by adding missing keys and updating existing keys

Everything else and the unforeseens results in an error.


回答 3

字典词典合并

由于这是规范的问题(尽管有某些非一般性的规定),所以我提供了规范的Python方法来解决此问题。

最简单的情况:“叶是嵌套的字典,以空字典结尾”:

d1 = {'a': {1: {'foo': {}}, 2: {}}}
d2 = {'a': {1: {}, 2: {'bar': {}}}}
d3 = {'b': {3: {'baz': {}}}}
d4 = {'a': {1: {'quux': {}}}}

这是最简单的递归情况,我建议两种朴素的方法:

def rec_merge1(d1, d2):
    '''return new merged dict of dicts'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge1(v, d2[k])
    d3 = d1.copy()
    d3.update(d2)
    return d3

def rec_merge2(d1, d2):
    '''update first dict with second recursively'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge2(v, d2[k])
    d1.update(d2)
    return d1

我相信我更喜欢第二个,但要记住,第一个的原始状态必须从其原始位置重建。这是用法:

>>> from functools import reduce # only required for Python 3.
>>> reduce(rec_merge1, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}
>>> reduce(rec_merge2, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}

复杂的情况:“叶子是任何其他类型:”

因此,如果它们以字典结尾,这是合并末端空字典的简单情况。如果没有,那不是那么简单。如果是字符串,如何合并它们?可以类似地更新集合,因此我们可以进行这种处理,但是会丢失它们合并的顺序。那么顺序重要吗?

因此,代替更多信息,最简单的方法是在两个值都不都是dict的情况下为他们提供标准的更新处理:即第二个dict的值将覆盖第一个dict的值,即使第二个dict的值为None且第一个的值为a有很多信息的字典。

d1 = {'a': {1: 'foo', 2: None}}
d2 = {'a': {1: None, 2: 'bar'}}
d3 = {'b': {3: 'baz'}}
d4 = {'a': {1: 'quux'}}

from collections import MutableMapping

def rec_merge(d1, d2):
    '''
    Update two dicts of dicts recursively, 
    if either mapping has leaves that are non-dicts, 
    the second's leaf overwrites the first's.
    '''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            # this next check is the only difference!
            if all(isinstance(e, MutableMapping) for e in (v, d2[k])):
                d2[k] = rec_merge(v, d2[k])
            # we could further check types and merge as appropriate here.
    d3 = d1.copy()
    d3.update(d2)
    return d3

现在

from functools import reduce
reduce(rec_merge, (d1, d2, d3, d4))

退货

{'a': {1: 'quux', 2: 'bar'}, 'b': {3: 'baz'}}

适用于原始问题:

我必须删除字母周围的花括号并将其放在单引号中,以使其成为合法的Python(否则,它们将在Python 2.7+中设置为原义)并附加缺少的花括号:

dict1 = {1:{"a":'A'}, 2:{"b":'B'}}
dict2 = {2:{"c":'C'}, 3:{"d":'D'}}

rec_merge(dict1, dict2)现在返回:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

匹配原始问题的期望结果(例如,更改{A}为后)'A'

Dictionaries of dictionaries merge

As this is the canonical question (in spite of certain non-generalities) I’m providing the canonical Pythonic approach to solving this issue.

Simplest Case: “leaves are nested dicts that end in empty dicts”:

d1 = {'a': {1: {'foo': {}}, 2: {}}}
d2 = {'a': {1: {}, 2: {'bar': {}}}}
d3 = {'b': {3: {'baz': {}}}}
d4 = {'a': {1: {'quux': {}}}}

This is the simplest case for recursion, and I would recommend two naive approaches:

def rec_merge1(d1, d2):
    '''return new merged dict of dicts'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge1(v, d2[k])
    d3 = d1.copy()
    d3.update(d2)
    return d3

def rec_merge2(d1, d2):
    '''update first dict with second recursively'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge2(v, d2[k])
    d1.update(d2)
    return d1

I believe I would prefer the second to the first, but keep in mind that the original state of the first would have to be rebuilt from its origin. Here’s the usage:

>>> from functools import reduce # only required for Python 3.
>>> reduce(rec_merge1, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}
>>> reduce(rec_merge2, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}

Complex Case: “leaves are of any other type:”

So if they end in dicts, it’s a simple case of merging the end empty dicts. If not, it’s not so trivial. If strings, how do you merge them? Sets can be updated similarly, so we could give that treatment, but we lose the order in which they were merged. So does order matter?

So in lieu of more information, the simplest approach will be to give them the standard update treatment if both values are not dicts: i.e. the second dict’s value will overwrite the first, even if the second dict’s value is None and the first’s value is a dict with a lot of info.

d1 = {'a': {1: 'foo', 2: None}}
d2 = {'a': {1: None, 2: 'bar'}}
d3 = {'b': {3: 'baz'}}
d4 = {'a': {1: 'quux'}}

from collections.abc import MutableMapping

def rec_merge(d1, d2):
    '''
    Update two dicts of dicts recursively, 
    if either mapping has leaves that are non-dicts, 
    the second's leaf overwrites the first's.
    '''
    for k, v in d1.items():
        if k in d2:
            # this next check is the only difference!
            if all(isinstance(e, MutableMapping) for e in (v, d2[k])):
                d2[k] = rec_merge(v, d2[k])
            # we could further check types and merge as appropriate here.
    d3 = d1.copy()
    d3.update(d2)
    return d3

And now

from functools import reduce
reduce(rec_merge, (d1, d2, d3, d4))

returns

{'a': {1: 'quux', 2: 'bar'}, 'b': {3: 'baz'}}

Application to the original question:

I’ve had to remove the curly braces around the letters and put them in single quotes for this to be legit Python (else they would be set literals in Python 2.7+) as well as append a missing brace:

dict1 = {1:{"a":'A'}, 2:{"b":'B'}}
dict2 = {2:{"c":'C'}, 3:{"d":'D'}}

and rec_merge(dict1, dict2) now returns:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

Which matches the desired outcome of the original question (after changing, e.g. the {A} to 'A'.)


回答 4

基于@andrew cooke。此版本处理字典的嵌套列表,还允许选择更新值

def merge(a, b, path=None, update=True):
    "http://stackoverflow.com/questions/7204805/python-dictionaries-of-dictionaries-merge"
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            elif isinstance(a[key], list) and isinstance(b[key], list):
                for idx, val in enumerate(b[key]):
                    a[key][idx] = merge(a[key][idx], b[key][idx], path + [str(key), str(idx)], update=update)
            elif update:
                a[key] = b[key]
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

Based on @andrew cooke. This version handles nested lists of dicts and also allows the option to update the values

def merge(a, b, path=None, update=True):
    "http://stackoverflow.com/questions/7204805/python-dictionaries-of-dictionaries-merge"
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            elif isinstance(a[key], list) and isinstance(b[key], list):
                for idx, val in enumerate(b[key]):
                    a[key][idx] = merge(a[key][idx], b[key][idx], path + [str(key), str(idx)], update=update)
            elif update:
                a[key] = b[key]
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

回答 5

这个简单的递归过程将一个字典合并到另一个字典中,同时覆盖冲突的键:

#!/usr/bin/env python2.7

def merge_dicts(dict1, dict2):
    """ Recursively merges dict2 into dict1 """
    if not isinstance(dict1, dict) or not isinstance(dict2, dict):
        return dict2
    for k in dict2:
        if k in dict1:
            dict1[k] = merge_dicts(dict1[k], dict2[k])
        else:
            dict1[k] = dict2[k]
    return dict1

print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {2:{"c":"C"}, 3:{"d":"D"}}))
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {1:{"a":"A"}, 2:{"b":"C"}}))

输出:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
{1: {'a': 'A'}, 2: {'b': 'C'}}

This simple recursive procedure will merge one dictionary into another while overriding conflicting keys:

#!/usr/bin/env python2.7

def merge_dicts(dict1, dict2):
    """ Recursively merges dict2 into dict1 """
    if not isinstance(dict1, dict) or not isinstance(dict2, dict):
        return dict2
    for k in dict2:
        if k in dict1:
            dict1[k] = merge_dicts(dict1[k], dict2[k])
        else:
            dict1[k] = dict2[k]
    return dict1

print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {2:{"c":"C"}, 3:{"d":"D"}}))
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {1:{"a":"A"}, 2:{"b":"C"}}))

Output:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
{1: {'a': 'A'}, 2: {'b': 'C'}}

回答 6

基于@andrew cooke的答案。它以更好的方式处理嵌套列表。

def deep_merge_lists(original, incoming):
    """
    Deep merge two lists. Modifies original.
    Recursively call deep merge on each correlated element of list. 
    If item type in both elements are
     a. dict: Call deep_merge_dicts on both values.
     b. list: Recursively call deep_merge_lists on both values.
     c. any other type: Value is overridden.
     d. conflicting types: Value is overridden.

    If length of incoming list is more that of original then extra values are appended.
    """
    common_length = min(len(original), len(incoming))
    for idx in range(common_length):
        if isinstance(original[idx], dict) and isinstance(incoming[idx], dict):
            deep_merge_dicts(original[idx], incoming[idx])

        elif isinstance(original[idx], list) and isinstance(incoming[idx], list):
            deep_merge_lists(original[idx], incoming[idx])

        else:
            original[idx] = incoming[idx]

    for idx in range(common_length, len(incoming)):
        original.append(incoming[idx])


def deep_merge_dicts(original, incoming):
    """
    Deep merge two dictionaries. Modifies original.
    For key conflicts if both values are:
     a. dict: Recursively call deep_merge_dicts on both values.
     b. list: Call deep_merge_lists on both values.
     c. any other type: Value is overridden.
     d. conflicting types: Value is overridden.

    """
    for key in incoming:
        if key in original:
            if isinstance(original[key], dict) and isinstance(incoming[key], dict):
                deep_merge_dicts(original[key], incoming[key])

            elif isinstance(original[key], list) and isinstance(incoming[key], list):
                deep_merge_lists(original[key], incoming[key])

            else:
                original[key] = incoming[key]
        else:
            original[key] = incoming[key]

Based on answers from @andrew cooke. It takes care of nested lists in a better way.

def deep_merge_lists(original, incoming):
    """
    Deep merge two lists. Modifies original.
    Recursively call deep merge on each correlated element of list. 
    If item type in both elements are
     a. dict: Call deep_merge_dicts on both values.
     b. list: Recursively call deep_merge_lists on both values.
     c. any other type: Value is overridden.
     d. conflicting types: Value is overridden.

    If length of incoming list is more that of original then extra values are appended.
    """
    common_length = min(len(original), len(incoming))
    for idx in range(common_length):
        if isinstance(original[idx], dict) and isinstance(incoming[idx], dict):
            deep_merge_dicts(original[idx], incoming[idx])

        elif isinstance(original[idx], list) and isinstance(incoming[idx], list):
            deep_merge_lists(original[idx], incoming[idx])

        else:
            original[idx] = incoming[idx]

    for idx in range(common_length, len(incoming)):
        original.append(incoming[idx])


def deep_merge_dicts(original, incoming):
    """
    Deep merge two dictionaries. Modifies original.
    For key conflicts if both values are:
     a. dict: Recursively call deep_merge_dicts on both values.
     b. list: Call deep_merge_lists on both values.
     c. any other type: Value is overridden.
     d. conflicting types: Value is overridden.

    """
    for key in incoming:
        if key in original:
            if isinstance(original[key], dict) and isinstance(incoming[key], dict):
                deep_merge_dicts(original[key], incoming[key])

            elif isinstance(original[key], list) and isinstance(incoming[key], list):
                deep_merge_lists(original[key], incoming[key])

            else:
                original[key] = incoming[key]
        else:
            original[key] = incoming[key]

回答 7

如果您的词典级别未知,那么我建议使用递归函数:

def combineDicts(dictionary1, dictionary2):
    output = {}
    for item, value in dictionary1.iteritems():
        if dictionary2.has_key(item):
            if isinstance(dictionary2[item], dict):
                output[item] = combineDicts(value, dictionary2.pop(item))
        else:
            output[item] = value
    for item, value in dictionary2.iteritems():
         output[item] = value
    return output

If you have an unknown level of dictionaries, then I would suggest a recursive function:

def combineDicts(dictionary1, dictionary2):
    output = {}
    for item, value in dictionary1.iteritems():
        if dictionary2.has_key(item):
            if isinstance(dictionary2[item], dict):
                output[item] = combineDicts(value, dictionary2.pop(item))
        else:
            output[item] = value
    for item, value in dictionary2.iteritems():
         output[item] = value
    return output

回答 8

总览

以下方法将dict的深度合并问题细分为:

  1. 参数化的浅表合并函数merge(f)(a,b),该函数使用一个函数f合并两个字典ab

  2. 递归合并函数f将与merge


实作

可以通过多种方式来编写用于合并两个(非嵌套)字典的函数。我个人喜欢

def merge(f):
    def merge(a,b): 
        keys = a.keys() | b.keys()
        return {key:f(a.get(key), b.get(key)) for key in keys}
    return merge

定义适当的递归合并函数的一种好方法f是使用多调度,它允许定义根据参数类型沿不同路径求值的函数。

from multipledispatch import dispatch

#for anything that is not a dict return
@dispatch(object, object)
def f(a, b):
    return b if b is not None else a

#for dicts recurse 
@dispatch(dict, dict)
def f(a,b):
    return merge(f)(a,b)

要合并两个嵌套的字典,只需使用merge(f)例如:

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}
merge(f)(dict1, dict2)
#returns {1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}} 

笔记:

这种方法的优点是:

  • 该函数是由较小的函数构建的,每个较小的函数都做一件事情,这使代码更易于推理和测试。

  • 该行为不是硬编码的,但是可以根据需要进行更改和扩展,从而提高了代码重用性(请参见下面的示例)。


客制化

一些答案还考虑了包含例如其他(可能嵌套的)字典列表的字典。在这种情况下,可能需要映射列表并根据位置合并它们。这可以通过向合并功能添加另一个定义来完成f

import itertools
@dispatch(list, list)
def f(a,b):
    return [merge(f)(*arg) for arg in itertools.zip_longest(a, b)]

Overview

The following approach subdivides the problem of a deep merge of dicts into:

  1. A parameterized shallow merge function merge(f)(a,b) that uses a function f to merge two dicts a and b

  2. A recursive merger function f to be used together with merge


Implementation

A function for merging two (non nested) dicts can be written in a lot of ways. I personally like

def merge(f):
    def merge(a,b): 
        keys = a.keys() | b.keys()
        return {key:f(a.get(key), b.get(key)) for key in keys}
    return merge

A nice way of defining an appropriate recursive merger function f is using multipledispatch which allows to define functions that evaluate along different paths depending on the type of their arguments.

from multipledispatch import dispatch

#for anything that is not a dict return
@dispatch(object, object)
def f(a, b):
    return b if b is not None else a

#for dicts recurse 
@dispatch(dict, dict)
def f(a,b):
    return merge(f)(a,b)

Example

To merge two nested dicts simply use merge(f) e.g.:

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}
merge(f)(dict1, dict2)
#returns {1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}} 

Notes:

The advantages of this approach are:

  • The function is build from smaller functions that each do a single thing which makes the code simpler to reason about and test

  • The behaviour is not hard-coded but can be changed and extended as needed which improves code reuse (see example below).


Customization

Some answers also considered dicts that contain lists e.g. of other (potentially nested) dicts. In this case one might want map over the lists and merge them based on position. This can be done by adding another definition to the merger function f:

import itertools
@dispatch(list, list)
def f(a,b):
    return [merge(f)(*arg) for arg in itertools.zip_longest(a, b)]

回答 9

如果有人想要解决这个问题,这是我的解决方案。

美德:简短,说明性且具有风格上的功能(递归,无突变)。

潜在缺点:这可能不是您要查找的合并。有关语义,请查阅文档字符串。

def deep_merge(a, b):
    """
    Merge two values, with `b` taking precedence over `a`.

    Semantics:
    - If either `a` or `b` is not a dictionary, `a` will be returned only if
      `b` is `None`. Otherwise `b` will be returned.
    - If both values are dictionaries, they are merged as follows:
        * Each key that is found only in `a` or only in `b` will be included in
          the output collection with its value intact.
        * For any key in common between `a` and `b`, the corresponding values
          will be merged with the same semantics.
    """
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if b is None else b
    else:
        # If we're here, both a and b must be dictionaries or subtypes thereof.

        # Compute set of all keys in both dictionaries.
        keys = set(a.keys()) | set(b.keys())

        # Build output dictionary, merging recursively values with common keys,
        # where `None` is used to mean the absence of a value.
        return {
            key: deep_merge(a.get(key), b.get(key))
            for key in keys
        }

In case someone wants yet another approach to this problem, here’s my solution.

Virtues: short, declarative, and functional in style (recursive, does no mutation).

Potential Drawback: This might not be the merge you’re looking for. Consult the docstring for semantics.

def deep_merge(a, b):
    """
    Merge two values, with `b` taking precedence over `a`.

    Semantics:
    - If either `a` or `b` is not a dictionary, `a` will be returned only if
      `b` is `None`. Otherwise `b` will be returned.
    - If both values are dictionaries, they are merged as follows:
        * Each key that is found only in `a` or only in `b` will be included in
          the output collection with its value intact.
        * For any key in common between `a` and `b`, the corresponding values
          will be merged with the same semantics.
    """
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if b is None else b
    else:
        # If we're here, both a and b must be dictionaries or subtypes thereof.

        # Compute set of all keys in both dictionaries.
        keys = set(a.keys()) | set(b.keys())

        # Build output dictionary, merging recursively values with common keys,
        # where `None` is used to mean the absence of a value.
        return {
            key: deep_merge(a.get(key), b.get(key))
            for key in keys
        }

回答 10

您可以尝试mergedeep


安装

$ pip3 install mergedeep

用法

from mergedeep import merge

a = {"keyA": 1}
b = {"keyB": {"sub1": 10}}
c = {"keyB": {"sub2": 20}}

merge(a, b, c) 

print(a)
# {"keyA": 1, "keyB": {"sub1": 10, "sub2": 20}}

有关选项的完整列表,请查看文档

You could try mergedeep.


Installation

$ pip3 install mergedeep

Usage

from mergedeep import merge

a = {"keyA": 1}
b = {"keyB": {"sub1": 10}}
c = {"keyB": {"sub2": 20}}

merge(a, b, c) 

print(a)
# {"keyA": 1, "keyB": {"sub1": 10, "sub2": 20}}

For a full list of options, check out the docs!


回答 11

安德鲁·库克斯答案有一个小问题:在某些情况下,b当您修改返回的字典时,它会修改第二个参数。特别是因为这一行:

if key in a:
    ...
else:
    a[key] = b[key]

如果b[key]为a dict,则将其简单地分配给a,这意味着对它的任何后续修改dict都会影响ab

a={}
b={'1':{'2':'b'}}
c={'1':{'3':'c'}}
merge(merge(a,b), c) # {'1': {'3': 'c', '2': 'b'}}
a # {'1': {'3': 'c', '2': 'b'}} (as expected)
b # {'1': {'3': 'c', '2': 'b'}} <----
c # {'1': {'3': 'c'}} (unmodified)

为了解决这个问题,该行必须替换为:

if isinstance(b[key], dict):
    a[key] = clone_dict(b[key])
else:
    a[key] = b[key]

在哪里clone_dict

def clone_dict(obj):
    clone = {}
    for key, value in obj.iteritems():
        if isinstance(value, dict):
            clone[key] = clone_dict(value)
        else:
            clone[key] = value
    return

仍然。这显然不占listset和其他的东西,但我希望尝试合并时,它说明了陷阱dicts

为了完整起见,这是我的版本,您可以在其中多次传递它dicts

def merge_dicts(*args):
    def clone_dict(obj):
        clone = {}
        for key, value in obj.iteritems():
            if isinstance(value, dict):
                clone[key] = clone_dict(value)
            else:
                clone[key] = value
        return

    def merge(a, b, path=[]):
        for key in b:
            if key in a:
                if isinstance(a[key], dict) and isinstance(b[key], dict):
                    merge(a[key], b[key], path + [str(key)])
                elif a[key] == b[key]:
                    pass
                else:
                    raise Exception('Conflict at `{path}\''.format(path='.'.join(path + [str(key)])))
            else:
                if isinstance(b[key], dict):
                    a[key] = clone_dict(b[key])
                else:
                    a[key] = b[key]
        return a
    return reduce(merge, args, {})

There’s a slight problem with andrew cookes answer: In some cases it modifies the second argument b when you modify the returned dict. Specifically it’s because of this line:

if key in a:
    ...
else:
    a[key] = b[key]

If b[key] is a dict, it will simply be assigned to a, meaning any subsequent modifications to that dict will affect both a and b.

a={}
b={'1':{'2':'b'}}
c={'1':{'3':'c'}}
merge(merge(a,b), c) # {'1': {'3': 'c', '2': 'b'}}
a # {'1': {'3': 'c', '2': 'b'}} (as expected)
b # {'1': {'3': 'c', '2': 'b'}} <----
c # {'1': {'3': 'c'}} (unmodified)

To fix this, the line would have to be substituted with this:

if isinstance(b[key], dict):
    a[key] = clone_dict(b[key])
else:
    a[key] = b[key]

Where clone_dict is:

def clone_dict(obj):
    clone = {}
    for key, value in obj.iteritems():
        if isinstance(value, dict):
            clone[key] = clone_dict(value)
        else:
            clone[key] = value
    return

Still. This obviously doesn’t account for list, set and other stuff, but I hope it illustrates the pitfalls when trying to merge dicts.

And for completeness sake, here is my version, where you can pass it multiple dicts:

def merge_dicts(*args):
    def clone_dict(obj):
        clone = {}
        for key, value in obj.iteritems():
            if isinstance(value, dict):
                clone[key] = clone_dict(value)
            else:
                clone[key] = value
        return

    def merge(a, b, path=[]):
        for key in b:
            if key in a:
                if isinstance(a[key], dict) and isinstance(b[key], dict):
                    merge(a[key], b[key], path + [str(key)])
                elif a[key] == b[key]:
                    pass
                else:
                    raise Exception('Conflict at `{path}\''.format(path='.'.join(path + [str(key)])))
            else:
                if isinstance(b[key], dict):
                    a[key] = clone_dict(b[key])
                else:
                    a[key] = b[key]
        return a
    return reduce(merge, args, {})

回答 12

此版本的函数将占N个字典,仅占字典-无法传递不正确的参数,否则将引发TypeError。合并本身解决了键冲突,而不是覆盖合并链下游的词典中的数据,它创建了一组值并将其追加到该值之后;没有数据丢失。

它在页面上可能不是最有效的,但是却是最彻底的,并且在合并2到N个字典时,您不会丢失任何信息。

def merge_dicts(*dicts):
    if not reduce(lambda x, y: isinstance(y, dict) and x, dicts, True):
        raise TypeError, "Object in *dicts not of type dict"
    if len(dicts) < 2:
        raise ValueError, "Requires 2 or more dict objects"


    def merge(a, b):
        for d in set(a.keys()).union(b.keys()):
            if d in a and d in b:
                if type(a[d]) == type(b[d]):
                    if not isinstance(a[d], dict):
                        ret = list({a[d], b[d]})
                        if len(ret) == 1: ret = ret[0]
                        yield (d, sorted(ret))
                    else:
                        yield (d, dict(merge(a[d], b[d])))
                else:
                    raise TypeError, "Conflicting key:value type assignment"
            elif d in a:
                yield (d, a[d])
            elif d in b:
                yield (d, b[d])
            else:
                raise KeyError

    return reduce(lambda x, y: dict(merge(x, y)), dicts[1:], dicts[0])

print merge_dicts({1:1,2:{1:2}},{1:2,2:{3:1}},{4:4})

输出:{1:[1,2],2:{1:2,3:3:1},4:4}

This version of the function will account for N number of dictionaries, and only dictionaries — no improper parameters can be passed, or it will raise a TypeError. The merge itself accounts for key conflicts, and instead of overwriting data from a dictionary further down the merge chain, it creates a set of values and appends to that; no data is lost.

It might not be the most effecient on the page, but it’s the most thorough and you’re not going to lose any information when you merge your 2 to N dicts.

def merge_dicts(*dicts):
    if not reduce(lambda x, y: isinstance(y, dict) and x, dicts, True):
        raise TypeError, "Object in *dicts not of type dict"
    if len(dicts) < 2:
        raise ValueError, "Requires 2 or more dict objects"


    def merge(a, b):
        for d in set(a.keys()).union(b.keys()):
            if d in a and d in b:
                if type(a[d]) == type(b[d]):
                    if not isinstance(a[d], dict):
                        ret = list({a[d], b[d]})
                        if len(ret) == 1: ret = ret[0]
                        yield (d, sorted(ret))
                    else:
                        yield (d, dict(merge(a[d], b[d])))
                else:
                    raise TypeError, "Conflicting key:value type assignment"
            elif d in a:
                yield (d, a[d])
            elif d in b:
                yield (d, b[d])
            else:
                raise KeyError

    return reduce(lambda x, y: dict(merge(x, y)), dicts[1:], dicts[0])

print merge_dicts({1:1,2:{1:2}},{1:2,2:{3:1}},{4:4})

output: {1: [1, 2], 2: {1: 2, 3: 1}, 4: 4}


回答 13

由于dictviews支持集合操作,因此我能够大大简化jterrace的答案。

def merge(dict1, dict2):
    for k in dict1.keys() - dict2.keys():
        yield (k, dict1[k])

    for k in dict2.keys() - dict1.keys():
        yield (k, dict2[k])

    for k in dict1.keys() & dict2.keys():
        yield (k, dict(merge(dict1[k], dict2[k])))

任何尝试将dict与非dict组合在一起的尝试(从技术上讲,是指具有“ keys”方法的对象,而没有“ keys”方法的对象)将引发AttributeError。这包括对函数的初始调用和递归调用。这正是我想要的,所以我离开了。您可以轻松地捕获递归调用引发的AttributeErrors,然后产生您想要的任何值。

Since dictviews support set operations, I was able to greatly simplify jterrace’s answer.

def merge(dict1, dict2):
    for k in dict1.keys() - dict2.keys():
        yield (k, dict1[k])

    for k in dict2.keys() - dict1.keys():
        yield (k, dict2[k])

    for k in dict1.keys() & dict2.keys():
        yield (k, dict(merge(dict1[k], dict2[k])))

Any attempt to combine a dict with a non dict (technically, an object with a ‘keys’ method and an object without a ‘keys’ method) will raise an AttributeError. This includes both the initial call to the function and recursive calls. This is exactly what I wanted so I left it. You could easily catch an AttributeErrors thrown by the recursive call and then yield any value you please.


回答 14

短n甜:

from collections.abc import MutableMapping as Map

def nested_update(d, v):
"""
Nested update of dict-like 'd' with dict-like 'v'.
"""

for key in v:
    if key in d and isinstance(d[key], Map) and isinstance(v[key], Map):
        nested_update(d[key], v[key])
    else:
        d[key] = v[key]

它的工作方式类似于Python的dict.update方法(并建立在该方法的基础上)。它会返回Nonereturn d如果愿意,可以随时添加),因为它会d原位更新dict 。中的键v将覆盖其中的任何现有键d(它不会尝试解释字典的内容)。

它也适用于其他(“类似字典”的)映射。

Short-n-sweet:

from collections.abc import MutableMapping as Map

def nested_update(d, v):
"""
Nested update of dict-like 'd' with dict-like 'v'.
"""

for key in v:
    if key in d and isinstance(d[key], Map) and isinstance(v[key], Map):
        nested_update(d[key], v[key])
    else:
        d[key] = v[key]

This works like (and is build on) Python’s dict.update method. It returns None (you can always add return d if you prefer) as it updates dict d in-place. Keys in v will overwrite any existing keys in d (it does not try to interpret the dict’s contents).

It will also work for other (“dict-like”) mappings.


回答 15

当然,代码将取决于您解决合并冲突的规则。这是一个可以使用任意数量的参数并将其递归合并到任意深度的版本,而无需使用任何对象突变。它使用以下规则解决合并冲突:

  • 字典优先于非字典值({"foo": {...}}优先于{"foo": "bar"}
  • 随后的参数优先于前面的参数(如果合并{"a": 1}{"a", 2}以及{"a": 3}为了,其结果将是{"a": 3}
try:
    from collections import Mapping
except ImportError:
    Mapping = dict

def merge_dicts(*dicts):                                                            
    """                                                                             
    Return a new dictionary that is the result of merging the arguments together.   
    In case of conflicts, later arguments take precedence over earlier arguments.   
    """                                                                             
    updated = {}                                                                    
    # grab all keys                                                                 
    keys = set()                                                                    
    for d in dicts:                                                                 
        keys = keys.union(set(d))                                                   

    for key in keys:                                                                
        values = [d[key] for d in dicts if key in d]                                
        # which ones are mapping types? (aka dict)                                  
        maps = [value for value in values if isinstance(value, Mapping)]            
        if maps:                                                                    
            # if we have any mapping types, call recursively to merge them          
            updated[key] = merge_dicts(*maps)                                       
        else:                                                                       
            # otherwise, just grab the last value we have, since later arguments    
            # take precedence over earlier arguments                                
            updated[key] = values[-1]                                               
    return updated  

The code will depend on your rules for resolving merge conflicts, of course. Here’s a version which can take an arbitrary number of arguments and merges them recursively to an arbitrary depth, without using any object mutation. It uses the following rules to resolve merge conflicts:

  • dictionaries take precedence over non-dict values ({"foo": {...}} takes precedence over {"foo": "bar"})
  • later arguments take precedence over earlier arguments (if you merge {"a": 1}, {"a", 2}, and {"a": 3} in order, the result will be {"a": 3})
try:
    from collections import Mapping
except ImportError:
    Mapping = dict

def merge_dicts(*dicts):                                                            
    """                                                                             
    Return a new dictionary that is the result of merging the arguments together.   
    In case of conflicts, later arguments take precedence over earlier arguments.   
    """                                                                             
    updated = {}                                                                    
    # grab all keys                                                                 
    keys = set()                                                                    
    for d in dicts:                                                                 
        keys = keys.union(set(d))                                                   

    for key in keys:                                                                
        values = [d[key] for d in dicts if key in d]                                
        # which ones are mapping types? (aka dict)                                  
        maps = [value for value in values if isinstance(value, Mapping)]            
        if maps:                                                                    
            # if we have any mapping types, call recursively to merge them          
            updated[key] = merge_dicts(*maps)                                       
        else:                                                                       
            # otherwise, just grab the last value we have, since later arguments    
            # take precedence over earlier arguments                                
            updated[key] = values[-1]                                               
    return updated  

回答 16

我有两个字典(ab),每个字典可以包含任意数量的嵌套字典。我想以b优先于递归合并它们a

将嵌套字典视为树,我想要的是:

  • 更新a以使到每个叶子的每条路径都b表示为a
  • 要覆盖的子树a,如果叶在相应路径中找到b
    • 保持所有b叶节点仍为​​叶的不变性。

对于我的口味而言,现有的答案有些复杂,并且在架子上留下了一些细节。我一起整理了以下内容,这些内容通过了我的数据集的单元测试。

  def merge_map(a, b):
    if not isinstance(a, dict) or not isinstance(b, dict):
      return b

    for key in b.keys():
      a[key] = merge_map(a[key], b[key]) if key in a else b[key]
    return a

示例(为清晰起见而格式化):

 a = {
    1 : {'a': 'red', 
         'b': {'blue': 'fish', 'yellow': 'bear' },
         'c': { 'orange': 'dog'},
    },
    2 : {'d': 'green'},
    3: 'e'
  }

  b = {
    1 : {'b': 'white'},
    2 : {'d': 'black'},
    3: 'e'
  }


  >>> merge_map(a, b)
  {1: {'a': 'red', 
       'b': 'white',
       'c': {'orange': 'dog'},},
   2: {'d': 'black'},
   3: 'e'}

b需要维护的路径是:

  • 1 -> 'b' -> 'white'
  • 2 -> 'd' -> 'black'
  • 3 -> 'e'

a 具有以下独特且无冲突的路径:

  • 1 -> 'a' -> 'red'
  • 1 -> 'c' -> 'orange' -> 'dog'

因此它们仍显示在合并地图中。

I had two dictionaries (a and b) which could each contain any number of nested dictionaries. I wanted to recursively merge them, with b taking precedence over a.

Considering the nested dictionaries as trees, what I wanted was:

  • To update a so that every path to every leaf in b would be represented in a
  • To overwrite subtrees of a if a leaf is found in the corresponding path in b
    • Maintain the invariant that all b leaf nodes remain leafs.

The existing answers were a little complicated for my taste and left some details on the shelf. I hacked together the following, which passes unit tests for my data set.

  def merge_map(a, b):
    if not isinstance(a, dict) or not isinstance(b, dict):
      return b

    for key in b.keys():
      a[key] = merge_map(a[key], b[key]) if key in a else b[key]
    return a

Example (formatted for clarity):

 a = {
    1 : {'a': 'red', 
         'b': {'blue': 'fish', 'yellow': 'bear' },
         'c': { 'orange': 'dog'},
    },
    2 : {'d': 'green'},
    3: 'e'
  }

  b = {
    1 : {'b': 'white'},
    2 : {'d': 'black'},
    3: 'e'
  }


  >>> merge_map(a, b)
  {1: {'a': 'red', 
       'b': 'white',
       'c': {'orange': 'dog'},},
   2: {'d': 'black'},
   3: 'e'}

The paths in b that needed to be maintained were:

  • 1 -> 'b' -> 'white'
  • 2 -> 'd' -> 'black'
  • 3 -> 'e'.

a had the unique and non-conflicting paths of:

  • 1 -> 'a' -> 'red'
  • 1 -> 'c' -> 'orange' -> 'dog'

so they are still represented in the merged map.


回答 17

我有一个迭代的解决方案-大字典和很多字典(例如json等)的效果要好得多:

import collections


def merge_dict_with_subdicts(dict1: dict, dict2: dict) -> dict:
    """
    similar behaviour to builtin dict.update - but knows how to handle nested dicts
    """
    q = collections.deque([(dict1, dict2)])
    while len(q) > 0:
        d1, d2 = q.pop()
        for k, v in d2.items():
            if k in d1 and isinstance(d1[k], dict) and isinstance(v, dict):
                q.append((d1[k], v))
            else:
                d1[k] = v

    return dict1

请注意,如果它们不是两个字典,则将使用d2中的值覆盖d1。(与python相同dict.update()

一些测试:

def test_deep_update():
    d = dict()
    merge_dict_with_subdicts(d, {"a": 4})
    assert d == {"a": 4}

    new_dict = {
        "b": {
            "c": {
                "d": 6
            }
        }
    }
    merge_dict_with_subdicts(d, new_dict)
    assert d == {
        "a": 4,
        "b": {
            "c": {
                "d": 6
            }
        }
    }

    new_dict = {
        "a": 3,
        "b": {
            "f": 7
        }
    }
    merge_dict_with_subdicts(d, new_dict)
    assert d == {
        "a": 3,
        "b": {
            "c": {
                "d": 6
            },
            "f": 7
        }
    }

    # test a case where one of the dicts has dict as value and the other has something else
    new_dict = {
        'a': {
            'b': 4
        }
    }
    merge_dict_with_subdicts(d, new_dict)
    assert d['a']['b'] == 4

我已经测试了约1200格-该方法花费了0.4秒,而递归解决方案花费了约2.5秒。

I have an iterative solution – works much much better with big dicts & a lot of them (for example jsons etc):

import collections


def merge_dict_with_subdicts(dict1: dict, dict2: dict) -> dict:
    """
    similar behaviour to builtin dict.update - but knows how to handle nested dicts
    """
    q = collections.deque([(dict1, dict2)])
    while len(q) > 0:
        d1, d2 = q.pop()
        for k, v in d2.items():
            if k in d1 and isinstance(d1[k], dict) and isinstance(v, dict):
                q.append((d1[k], v))
            else:
                d1[k] = v

    return dict1

note that this will use the value in d2 to override d1, in case they are not both dicts. (same as python’s dict.update())

some tests:

def test_deep_update():
    d = dict()
    merge_dict_with_subdicts(d, {"a": 4})
    assert d == {"a": 4}

    new_dict = {
        "b": {
            "c": {
                "d": 6
            }
        }
    }
    merge_dict_with_subdicts(d, new_dict)
    assert d == {
        "a": 4,
        "b": {
            "c": {
                "d": 6
            }
        }
    }

    new_dict = {
        "a": 3,
        "b": {
            "f": 7
        }
    }
    merge_dict_with_subdicts(d, new_dict)
    assert d == {
        "a": 3,
        "b": {
            "c": {
                "d": 6
            },
            "f": 7
        }
    }

    # test a case where one of the dicts has dict as value and the other has something else
    new_dict = {
        'a': {
            'b': 4
        }
    }
    merge_dict_with_subdicts(d, new_dict)
    assert d['a']['b'] == 4

I’ve tested with around ~1200 dicts – this method took 0.4 seconds, while the recursive solution took ~2.5 seconds.


回答 18

这应该有助于将所有项目合并dict2dict1

for item in dict2:
    if item in dict1:
        for leaf in dict2[item]:
            dict1[item][leaf] = dict2[item][leaf]
    else:
        dict1[item] = dict2[item]

请测试一下,然后告诉我们这是否是您想要的。

编辑:

上述解决方案仅合并了一个级别,但正确地解决了OP给出的示例。要合并多个级别,应使用递归。

This should help in merging all items from dict2 into dict1:

for item in dict2:
    if item in dict1:
        for leaf in dict2[item]:
            dict1[item][leaf] = dict2[item][leaf]
    else:
        dict1[item] = dict2[item]

Please test it and tell us whether this is what you wanted.

EDIT:

The above mentioned solution merges only one level, but correctly solves the example given by OP. To merge multiple levels, the recursion should be used.


回答 19

我一直在测试您的解决方案,并决定在我的项目中使用此解决方案:

def mergedicts(dict1, dict2, conflict, no_conflict):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            yield (k, conflict(dict1[k], dict2[k]))
        elif k in dict1:
            yield (k, no_conflict(dict1[k]))
        else:
            yield (k, no_conflict(dict2[k]))

dict1 = {1:{"a":"A"}, 2:{"b":"B"}}
dict2 = {2:{"c":"C"}, 3:{"d":"D"}}

#this helper function allows for recursion and the use of reduce
def f2(x, y):
    return dict(mergedicts(x, y, f2, lambda x: x))

print dict(mergedicts(dict1, dict2, f2, lambda x: x))
print dict(reduce(f2, [dict1, dict2]))

将函数作为参数传递是扩展jterrace解决方案以使其与所有其他递归解决方案一样起作用的关键。

I’ve been testing your solutions and decided to use this one in my project:

def mergedicts(dict1, dict2, conflict, no_conflict):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            yield (k, conflict(dict1[k], dict2[k]))
        elif k in dict1:
            yield (k, no_conflict(dict1[k]))
        else:
            yield (k, no_conflict(dict2[k]))

dict1 = {1:{"a":"A"}, 2:{"b":"B"}}
dict2 = {2:{"c":"C"}, 3:{"d":"D"}}

#this helper function allows for recursion and the use of reduce
def f2(x, y):
    return dict(mergedicts(x, y, f2, lambda x: x))

print dict(mergedicts(dict1, dict2, f2, lambda x: x))
print dict(reduce(f2, [dict1, dict2]))

Passing functions as parameteres is key to extend jterrace solution to behave as all the other recursive solutions.


回答 20

我能想到的最简单的方法是:

#!/usr/bin/python

from copy import deepcopy
def dict_merge(a, b):
    if not isinstance(b, dict):
        return b
    result = deepcopy(a)
    for k, v in b.iteritems():
        if k in result and isinstance(result[k], dict):
                result[k] = dict_merge(result[k], v)
        else:
            result[k] = deepcopy(v)
    return result

a = {1:{"a":'A'}, 2:{"b":'B'}}
b = {2:{"c":'C'}, 3:{"d":'D'}}

print dict_merge(a,b)

输出:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

Easiest way i can think of is :

#!/usr/bin/python

from copy import deepcopy
def dict_merge(a, b):
    if not isinstance(b, dict):
        return b
    result = deepcopy(a)
    for k, v in b.iteritems():
        if k in result and isinstance(result[k], dict):
                result[k] = dict_merge(result[k], v)
        else:
            result[k] = deepcopy(v)
    return result

a = {1:{"a":'A'}, 2:{"b":'B'}}
b = {2:{"c":'C'}, 3:{"d":'D'}}

print dict_merge(a,b)

Output:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

回答 21

我在这里还有另一个略有不同的解决方案:

def deepMerge(d1, d2, inconflict = lambda v1,v2 : v2) :
''' merge d2 into d1. using inconflict function to resolve the leaf conflicts '''
    for k in d2:
        if k in d1 : 
            if isinstance(d1[k], dict) and isinstance(d2[k], dict) :
                deepMerge(d1[k], d2[k], inconflict)
            elif d1[k] != d2[k] :
                d1[k] = inconflict(d1[k], d2[k])
        else :
            d1[k] = d2[k]
    return d1

默认情况下,它可以解决冲突,而有利于第二个dict的值,但是您可以轻松地覆盖它,通过做些麻烦,您甚至可以抛出异常。:)。

I have another slightly different solution here:

def deepMerge(d1, d2, inconflict = lambda v1,v2 : v2) :
''' merge d2 into d1. using inconflict function to resolve the leaf conflicts '''
    for k in d2:
        if k in d1 : 
            if isinstance(d1[k], dict) and isinstance(d2[k], dict) :
                deepMerge(d1[k], d2[k], inconflict)
            elif d1[k] != d2[k] :
                d1[k] = inconflict(d1[k], d2[k])
        else :
            d1[k] = d2[k]
    return d1

By default it resolves conflicts in favor of values from the second dict, but you can easily override this, with some witchery you may be able to even throw exceptions out of it. :).


回答 22

class Utils(object):

    """

    >>> a = { 'first' : { 'all_rows' : { 'pass' : 'dog', 'number' : '1' } } }
    >>> b = { 'first' : { 'all_rows' : { 'fail' : 'cat', 'number' : '5' } } }
    >>> Utils.merge_dict(b, a) == { 'first' : { 'all_rows' : { 'pass' : 'dog', 'fail' : 'cat', 'number' : '5' } } }
    True

    >>> main = {'a': {'b': {'test': 'bug'}, 'c': 'C'}}
    >>> suply = {'a': {'b': 2, 'd': 'D', 'c': {'test': 'bug2'}}}
    >>> Utils.merge_dict(main, suply) == {'a': {'b': {'test': 'bug'}, 'c': 'C', 'd': 'D'}}
    True

    """

    @staticmethod
    def merge_dict(main, suply):
        """
        获取融合的字典,以main为主,suply补充,冲突时以main为准
        :return:
        """
        for key, value in suply.items():
            if key in main:
                if isinstance(main[key], dict):
                    if isinstance(value, dict):
                        Utils.merge_dict(main[key], value)
                    else:
                        pass
                else:
                    pass
            else:
                main[key] = value
        return main

if __name__ == '__main__':
    import doctest
    doctest.testmod()
class Utils(object):

    """

    >>> a = { 'first' : { 'all_rows' : { 'pass' : 'dog', 'number' : '1' } } }
    >>> b = { 'first' : { 'all_rows' : { 'fail' : 'cat', 'number' : '5' } } }
    >>> Utils.merge_dict(b, a) == { 'first' : { 'all_rows' : { 'pass' : 'dog', 'fail' : 'cat', 'number' : '5' } } }
    True

    >>> main = {'a': {'b': {'test': 'bug'}, 'c': 'C'}}
    >>> suply = {'a': {'b': 2, 'd': 'D', 'c': {'test': 'bug2'}}}
    >>> Utils.merge_dict(main, suply) == {'a': {'b': {'test': 'bug'}, 'c': 'C', 'd': 'D'}}
    True

    """

    @staticmethod
    def merge_dict(main, suply):
        """
        获取融合的字典,以main为主,suply补充,冲突时以main为准
        :return:
        """
        for key, value in suply.items():
            if key in main:
                if isinstance(main[key], dict):
                    if isinstance(value, dict):
                        Utils.merge_dict(main[key], value)
                    else:
                        pass
                else:
                    pass
            else:
                main[key] = value
        return main

if __name__ == '__main__':
    import doctest
    doctest.testmod()

回答 23

嘿,我也有同样的问题,但是我有一个解决方案,我会把它发布在这里,以防它对其他人也有用,基本上是合并嵌套的字典并添加值,对我来说,我需要计算一些概率,因此一个很好的工作:

#used to copy a nested dict to a nested dict
def deepupdate(target, src):
    for k, v in src.items():
        if k in target:
            for k2, v2 in src[k].items():
                if k2 in target[k]:
                    target[k][k2]+=v2
                else:
                    target[k][k2] = v2
        else:
            target[k] = copy.deepcopy(v)

通过使用以上方法,我们可以合并:

target = {‘6,6’:{‘6,63’:1},’63,4’:{‘4,4’:1},’4,4’:{‘4,3’:1} ,’6,63’:{’63,4’:1}}

src = {‘5,4’:{‘4,4’:1},’5,5’:{‘5,4’:1},’4,4’:{‘4,3’:1} }

它将变成:{‘5,5’:{‘5,4’:1},’5,4’:{‘4,4’:1},’6,6’:{‘6,63’ :1},’63,4’:{‘4,4’:1},’4,4’:{‘4,3’:2},’6,63’:{’63,4’:1 }}

还请注意此处的更改:

target = {‘6,6’:{‘6,63’:1},’6,63’:{’63,4’:1},‘4,4’:{‘4,3’:1},’63,4’:{‘4,4’:1}}

src = {‘5,4’:{‘4,4’:1},’4,3’:{‘3,4’:1},‘4,4’:{‘4,9’:1},’3,4’:{‘4,4’:1},’5,5’:{‘5,4’:1}}

合并= {‘5,4’:{‘4,4’:1},’4,3’:{‘3,4’:1},’6,63’:{’63,4’:1} ,’5,5’:{‘5,4’:1},’6,6’:{‘6,63’:1},’3,4’:{‘4,4’:1},’ 63,4’:{‘4,4’:1},‘4,4’:{‘4,3’:1,’4,9’:1} }}

别忘了还要添加导入副本:

import copy

hey there I also had the same problem but I though of a solution and I will post it here, in case it is also useful for others, basically merging nested dictionaries and also adding the values, for me I needed to calculate some probabilities so this one worked great:

#used to copy a nested dict to a nested dict
def deepupdate(target, src):
    for k, v in src.items():
        if k in target:
            for k2, v2 in src[k].items():
                if k2 in target[k]:
                    target[k][k2]+=v2
                else:
                    target[k][k2] = v2
        else:
            target[k] = copy.deepcopy(v)

by using the above method we can merge:

target = {‘6,6’: {‘6,63’: 1}, ‘63,4’: {‘4,4’: 1}, ‘4,4’: {‘4,3’: 1}, ‘6,63’: {‘63,4’: 1}}

src = {‘5,4’: {‘4,4’: 1}, ‘5,5’: {‘5,4’: 1}, ‘4,4’: {‘4,3’: 1}}

and this will become: {‘5,5’: {‘5,4’: 1}, ‘5,4’: {‘4,4’: 1}, ‘6,6’: {‘6,63’: 1}, ‘63,4’: {‘4,4’: 1}, ‘4,4’: {‘4,3’: 2}, ‘6,63’: {‘63,4’: 1}}

also notice the changes here:

target = {‘6,6’: {‘6,63’: 1}, ‘6,63’: {‘63,4’: 1}, ‘4,4’: {‘4,3’: 1}, ‘63,4’: {‘4,4’: 1}}

src = {‘5,4’: {‘4,4’: 1}, ‘4,3’: {‘3,4’: 1}, ‘4,4’: {‘4,9’: 1}, ‘3,4’: {‘4,4’: 1}, ‘5,5’: {‘5,4’: 1}}

merge = {‘5,4’: {‘4,4’: 1}, ‘4,3’: {‘3,4’: 1}, ‘6,63’: {‘63,4’: 1}, ‘5,5’: {‘5,4’: 1}, ‘6,6’: {‘6,63’: 1}, ‘3,4’: {‘4,4’: 1}, ‘63,4’: {‘4,4’: 1}, ‘4,4’: {‘4,3’: 1, ‘4,9’: 1}}

dont forget to also add the import for copy:

import copy

回答 24

from collections import defaultdict
from itertools import chain

class DictHelper:

@staticmethod
def merge_dictionaries(*dictionaries, override=True):
    merged_dict = defaultdict(set)
    all_unique_keys = set(chain(*[list(dictionary.keys()) for dictionary in dictionaries]))  # Build a set using all dict keys
    for key in all_unique_keys:
        keys_value_type = list(set(filter(lambda obj_type: obj_type != type(None), [type(dictionary.get(key, None)) for dictionary in dictionaries])))
        # Establish the object type for each key, return None if key is not present in dict and remove None from final result
        if len(keys_value_type) != 1:
            raise Exception("Different objects type for same key: {keys_value_type}".format(keys_value_type=keys_value_type))

        if keys_value_type[0] == list:
            values = list(chain(*[dictionary.get(key, []) for dictionary in dictionaries]))  # Extract the value for each key
            merged_dict[key].update(values)

        elif keys_value_type[0] == dict:
            # Extract all dictionaries by key and enter in recursion
            dicts_to_merge = list(filter(lambda obj: obj != None, [dictionary.get(key, None) for dictionary in dictionaries]))
            merged_dict[key] = DictHelper.merge_dictionaries(*dicts_to_merge)

        else:
            # if override => get value from last dictionary else make a list of all values
            values = list(filter(lambda obj: obj != None, [dictionary.get(key, None) for dictionary in dictionaries]))
            merged_dict[key] = values[-1] if override else values

    return dict(merged_dict)



if __name__ == '__main__':
  d1 = {'aaaaaaaaa': ['to short', 'to long'], 'bbbbb': ['to short', 'to long'], "cccccc": ["the is a test"]}
  d2 = {'aaaaaaaaa': ['field is not a bool'], 'bbbbb': ['field is not a bool']}
  d3 = {'aaaaaaaaa': ['filed is not a string', "to short"], 'bbbbb': ['field is not an integer']}
  print(DictHelper.merge_dictionaries(d1, d2, d3))

  d4 = {"a": {"x": 1, "y": 2, "z": 3, "d": {"x1": 10}}}
  d5 = {"a": {"x": 10, "y": 20, "d": {"x2": 20}}}
  print(DictHelper.merge_dictionaries(d4, d5))

输出:

{'bbbbb': {'to long', 'field is not an integer', 'to short', 'field is not a bool'}, 
'aaaaaaaaa': {'to long', 'to short', 'filed is not a string', 'field is not a bool'}, 
'cccccc': {'the is a test'}}

{'a': {'y': 20, 'd': {'x1': 10, 'x2': 20}, 'z': 3, 'x': 10}}
from collections import defaultdict
from itertools import chain

class DictHelper:

@staticmethod
def merge_dictionaries(*dictionaries, override=True):
    merged_dict = defaultdict(set)
    all_unique_keys = set(chain(*[list(dictionary.keys()) for dictionary in dictionaries]))  # Build a set using all dict keys
    for key in all_unique_keys:
        keys_value_type = list(set(filter(lambda obj_type: obj_type != type(None), [type(dictionary.get(key, None)) for dictionary in dictionaries])))
        # Establish the object type for each key, return None if key is not present in dict and remove None from final result
        if len(keys_value_type) != 1:
            raise Exception("Different objects type for same key: {keys_value_type}".format(keys_value_type=keys_value_type))

        if keys_value_type[0] == list:
            values = list(chain(*[dictionary.get(key, []) for dictionary in dictionaries]))  # Extract the value for each key
            merged_dict[key].update(values)

        elif keys_value_type[0] == dict:
            # Extract all dictionaries by key and enter in recursion
            dicts_to_merge = list(filter(lambda obj: obj != None, [dictionary.get(key, None) for dictionary in dictionaries]))
            merged_dict[key] = DictHelper.merge_dictionaries(*dicts_to_merge)

        else:
            # if override => get value from last dictionary else make a list of all values
            values = list(filter(lambda obj: obj != None, [dictionary.get(key, None) for dictionary in dictionaries]))
            merged_dict[key] = values[-1] if override else values

    return dict(merged_dict)



if __name__ == '__main__':
  d1 = {'aaaaaaaaa': ['to short', 'to long'], 'bbbbb': ['to short', 'to long'], "cccccc": ["the is a test"]}
  d2 = {'aaaaaaaaa': ['field is not a bool'], 'bbbbb': ['field is not a bool']}
  d3 = {'aaaaaaaaa': ['filed is not a string', "to short"], 'bbbbb': ['field is not an integer']}
  print(DictHelper.merge_dictionaries(d1, d2, d3))

  d4 = {"a": {"x": 1, "y": 2, "z": 3, "d": {"x1": 10}}}
  d5 = {"a": {"x": 10, "y": 20, "d": {"x2": 20}}}
  print(DictHelper.merge_dictionaries(d4, d5))

Output:

{'bbbbb': {'to long', 'field is not an integer', 'to short', 'field is not a bool'}, 
'aaaaaaaaa': {'to long', 'to short', 'filed is not a string', 'field is not a bool'}, 
'cccccc': {'the is a test'}}

{'a': {'y': 20, 'd': {'x1': 10, 'x2': 20}, 'z': 3, 'x': 10}}

回答 25

看一看toolz

import toolz
dict1={1:{"a":"A"},2:{"b":"B"}}
dict2={2:{"c":"C"},3:{"d":"D"}}
toolz.merge_with(toolz.merge,dict1,dict2)

{1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}

take a look at toolz package

import toolz
dict1={1:{"a":"A"},2:{"b":"B"}}
dict2={2:{"c":"C"},3:{"d":"D"}}
toolz.merge_with(toolz.merge,dict1,dict2)

gives

{1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}

回答 26

以下函数将b合并为a。

def mergedicts(a, b):
    for key in b:
        if isinstance(a.get(key), dict) or isinstance(b.get(key), dict):
            mergedicts(a[key], b[key])
        else:
            a[key] = b[key]
    return a

The following function merges b into a.

def mergedicts(a, b):
    for key in b:
        if isinstance(a.get(key), dict) or isinstance(b.get(key), dict):
            mergedicts(a[key], b[key])
        else:
            a[key] = b[key]
    return a

回答 27

还有另一个细微变化:

这是基于纯python3集的深度更新功能。它通过一次遍历一个级别来更新嵌套字典,并调用自身来更新每个下一级别的字典值:

def deep_update(dict_original, dict_update):
    if isinstance(dict_original, dict) and isinstance(dict_update, dict):
        output=dict(dict_original)
        keys_original=set(dict_original.keys())
        keys_update=set(dict_update.keys())
        similar_keys=keys_original.intersection(keys_update)
        similar_dict={key:deep_update(dict_original[key], dict_update[key]) for key in similar_keys}
        new_keys=keys_update.difference(keys_original)
        new_dict={key:dict_update[key] for key in new_keys}
        output.update(similar_dict)
        output.update(new_dict)
        return output
    else:
        return dict_update

一个简单的例子:

x={'a':{'b':{'c':1, 'd':1}}}
y={'a':{'b':{'d':2, 'e':2}}, 'f':2}

print(deep_update(x, y))
>>> {'a': {'b': {'c': 1, 'd': 2, 'e': 2}}, 'f': 2}

And just another slight variation:

Here is a pure python3 set based deep update function. It updates nested dictionaries by looping through one level at a time and calls itself to update each next level of dictionary values:

def deep_update(dict_original, dict_update):
    if isinstance(dict_original, dict) and isinstance(dict_update, dict):
        output=dict(dict_original)
        keys_original=set(dict_original.keys())
        keys_update=set(dict_update.keys())
        similar_keys=keys_original.intersection(keys_update)
        similar_dict={key:deep_update(dict_original[key], dict_update[key]) for key in similar_keys}
        new_keys=keys_update.difference(keys_original)
        new_dict={key:dict_update[key] for key in new_keys}
        output.update(similar_dict)
        output.update(new_dict)
        return output
    else:
        return dict_update

A simple example:

x={'a':{'b':{'c':1, 'd':1}}}
y={'a':{'b':{'d':2, 'e':2}}, 'f':2}

print(deep_update(x, y))
>>> {'a': {'b': {'c': 1, 'd': 2, 'e': 2}}, 'f': 2}

回答 28

另一个答案怎么样?!这也避免了突变/副作用:

def merge(dict1, dict2):
    output = {}

    # adds keys from `dict1` if they do not exist in `dict2` and vice-versa
    intersection = {**dict2, **dict1}

    for k_intersect, v_intersect in intersection.items():
        if k_intersect not in dict1:
            v_dict2 = dict2[k_intersect]
            output[k_intersect] = v_dict2

        elif k_intersect not in dict2:
            output[k_intersect] = v_intersect

        elif isinstance(v_intersect, dict):
            v_dict2 = dict2[k_intersect]
            output[k_intersect] = merge(v_intersect, v_dict2)

        else:
            output[k_intersect] = v_intersect

    return output
dict1 = {1:{"a":{"A"}}, 2:{"b":{"B"}}}
dict2 = {2:{"c":{"C"}}, 3:{"d":{"D"}}}
dict3 = {1:{"a":{"A"}}, 2:{"b":{"B"},"c":{"C"}}, 3:{"d":{"D"}}}

assert dict3 == merge(dict1, dict2)

How about another answer?!? This one also avoids mutation/side effects:

def merge(dict1, dict2):
    output = {}

    # adds keys from `dict1` if they do not exist in `dict2` and vice-versa
    intersection = {**dict2, **dict1}

    for k_intersect, v_intersect in intersection.items():
        if k_intersect not in dict1:
            v_dict2 = dict2[k_intersect]
            output[k_intersect] = v_dict2

        elif k_intersect not in dict2:
            output[k_intersect] = v_intersect

        elif isinstance(v_intersect, dict):
            v_dict2 = dict2[k_intersect]
            output[k_intersect] = merge(v_intersect, v_dict2)

        else:
            output[k_intersect] = v_intersect

    return output

dict1 = {1:{"a":{"A"}}, 2:{"b":{"B"}}}
dict2 = {2:{"c":{"C"}}, 3:{"d":{"D"}}}
dict3 = {1:{"a":{"A"}}, 2:{"b":{"B"},"c":{"C"}}, 3:{"d":{"D"}}}

assert dict3 == merge(dict1, dict2)

在Python中,如何将YAML映射加载为OrderedDicts?

问题:在Python中,如何将YAML映射加载为OrderedDicts?

我想让PyYAML的加载器将映射(和有序映射)加载到Python 2.7+ OrderedDict类型中,而不是dict它当前使用的普通和​​对列表。

最好的方法是什么?

I’d like to get PyYAML‘s loader to load mappings (and ordered mappings) into the Python 2.7+ OrderedDict type, instead of the vanilla dict and the list of pairs it currently uses.

What’s the best way to do that?


回答 0

更新:在python 3.6+中OrderedDict,由于新的dict实现已经在pypy中使用了一段时间(尽管现在考虑了CPython实现的细节),您可能根本不需要。

更新:在python 3.7+中,dict对象的插入顺序保留性质已声明为Python语言规范的正式组成部分,请参阅Python 3.7的新增功能

我喜欢@James的简单解决方案。但是,它更改了默认的全局yaml.Loader类,这可能导致麻烦的副作用。特别是在编写库代码时,这是一个坏主意。另外,它不能直接与使用yaml.safe_load()

幸运的是,无需付出太多努力即可改进该解决方案:

import yaml
from collections import OrderedDict

def ordered_load(stream, Loader=yaml.Loader, object_pairs_hook=OrderedDict):
    class OrderedLoader(Loader):
        pass
    def construct_mapping(loader, node):
        loader.flatten_mapping(node)
        return object_pairs_hook(loader.construct_pairs(node))
    OrderedLoader.add_constructor(
        yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
        construct_mapping)
    return yaml.load(stream, OrderedLoader)

# usage example:
ordered_load(stream, yaml.SafeLoader)

对于序列化,我不知道明显的概括,但是至少这应该没有任何副作用:

def ordered_dump(data, stream=None, Dumper=yaml.Dumper, **kwds):
    class OrderedDumper(Dumper):
        pass
    def _dict_representer(dumper, data):
        return dumper.represent_mapping(
            yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
            data.items())
    OrderedDumper.add_representer(OrderedDict, _dict_representer)
    return yaml.dump(data, stream, OrderedDumper, **kwds)

# usage:
ordered_dump(data, Dumper=yaml.SafeDumper)

Update: In python 3.6+ you probably don’t need OrderedDict at all due to the new dict implementation that has been in use in pypy for some time (although considered CPython implementation detail for now).

Update: In python 3.7+, the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec, see What’s New In Python 3.7.

I like @James’ solution for its simplicity. However, it changes the default global yaml.Loader class, which can lead to troublesome side effects. Especially, when writing library code this is a bad idea. Also, it doesn’t directly work with yaml.safe_load().

Fortunately, the solution can be improved without much effort:

import yaml
from collections import OrderedDict

def ordered_load(stream, Loader=yaml.Loader, object_pairs_hook=OrderedDict):
    class OrderedLoader(Loader):
        pass
    def construct_mapping(loader, node):
        loader.flatten_mapping(node)
        return object_pairs_hook(loader.construct_pairs(node))
    OrderedLoader.add_constructor(
        yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
        construct_mapping)
    return yaml.load(stream, OrderedLoader)

# usage example:
ordered_load(stream, yaml.SafeLoader)

For serialization, I don’t know an obvious generalization, but at least this shouldn’t have any side effects:

def ordered_dump(data, stream=None, Dumper=yaml.Dumper, **kwds):
    class OrderedDumper(Dumper):
        pass
    def _dict_representer(dumper, data):
        return dumper.represent_mapping(
            yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
            data.items())
    OrderedDumper.add_representer(OrderedDict, _dict_representer)
    return yaml.dump(data, stream, OrderedDumper, **kwds)

# usage:
ordered_dump(data, Dumper=yaml.SafeDumper)

回答 1

yaml模块允许您指定自定义“表示器”以将Python对象转换为文本,并指定“构造器”以逆转该过程。

_mapping_tag = yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG

def dict_representer(dumper, data):
    return dumper.represent_dict(data.iteritems())

def dict_constructor(loader, node):
    return collections.OrderedDict(loader.construct_pairs(node))

yaml.add_representer(collections.OrderedDict, dict_representer)
yaml.add_constructor(_mapping_tag, dict_constructor)

The yaml module allow you to specify custom ‘representers’ to convert Python objects to text and ‘constructors’ to reverse the process.

_mapping_tag = yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG

def dict_representer(dumper, data):
    return dumper.represent_dict(data.iteritems())

def dict_constructor(loader, node):
    return collections.OrderedDict(loader.construct_pairs(node))

yaml.add_representer(collections.OrderedDict, dict_representer)
yaml.add_constructor(_mapping_tag, dict_constructor)

回答 2

2018年选项:

oyamlPyYAML的直接替代品,保留了字典排序。同时支持Python 2和Python 3。只需pip install oyaml导入,如下所示:

import oyaml as yaml

在转储/加载时,您将不再为搞砸的映射而烦恼。

注意:我是oyaml的作者。

2018 option:

oyaml is a drop-in replacement for PyYAML which preserves dict ordering. Both Python 2 and Python 3 are supported. Just pip install oyaml, and import as shown below:

import oyaml as yaml

You’ll no longer be annoyed by screwed-up mappings when dumping/loading.

Note: I’m the author of oyaml.


回答 3

2015(及更高版本)选项:

ruamel.yaml是PyYAML的替代品(免责声明:我是该软件包的作者)。保留映射的顺序是2015年在第一版(0.1)中添加的内容之一。它不仅保留字典的顺序,还保留注释,锚点名称,标签并支持YAML 1.2规范(2009年发布)

规范说,不能保证排序,但是YAML文件中当然有排序,并且适当的解析器可以仅保留该排序器,并透明地生成一个保持排序的对象。您只需要选择正确的解析器,加载器和转储器¹:

import sys
from ruamel.yaml import YAML

yaml_str = """\
3: abc
conf:
    10: def
    3: gij     # h is missing
more:
- what
- else
"""

yaml = YAML()
data = yaml.load(yaml_str)
data['conf'][10] = 'klm'
data['conf'][3] = 'jig'
yaml.dump(data, sys.stdout)

会给你:

3: abc
conf:
  10: klm
  3: jig       # h is missing
more:
- what
- else

dataCommentedMap类似dict 的类型,但具有额外的信息,这些信息会一直保留直到被转储(包括保留的注释!)

2015 (and later) option:

ruamel.yaml is a drop in replacement for PyYAML (disclaimer: I am the author of that package). Preserving the order of the mappings was one of the things added in the first version (0.1) back in 2015. Not only does it preserve the order of your dictionaries, it will also preserve comments, anchor names, tags and does support the YAML 1.2 specification (released 2009)

The specification says that the ordering is not guaranteed, but of course there is ordering in the YAML file and the appropriate parser can just hold on to that and transparently generate an object that keeps the ordering. You just need to choose the right parser, loader and dumper¹:

import sys
from ruamel.yaml import YAML

yaml_str = """\
3: abc
conf:
    10: def
    3: gij     # h is missing
more:
- what
- else
"""

yaml = YAML()
data = yaml.load(yaml_str)
data['conf'][10] = 'klm'
data['conf'][3] = 'jig'
yaml.dump(data, sys.stdout)

will give you:

3: abc
conf:
  10: klm
  3: jig       # h is missing
more:
- what
- else

data is of type CommentedMap which functions like a dict, but has extra information that is kept around until being dumped (including the preserved comment!)


回答 4

注意:有一个基于以下答案的库,该库还实现了CLoader和CDumpers:Phynix / yamlloader

我非常怀疑这是最好的方法,但这是我想出的方法,并且确实有效。也可作为要点

import yaml
import yaml.constructor

try:
    # included in standard lib from Python 2.7
    from collections import OrderedDict
except ImportError:
    # try importing the backported drop-in replacement
    # it's available on PyPI
    from ordereddict import OrderedDict

class OrderedDictYAMLLoader(yaml.Loader):
    """
    A YAML loader that loads mappings into ordered dictionaries.
    """

    def __init__(self, *args, **kwargs):
        yaml.Loader.__init__(self, *args, **kwargs)

        self.add_constructor(u'tag:yaml.org,2002:map', type(self).construct_yaml_map)
        self.add_constructor(u'tag:yaml.org,2002:omap', type(self).construct_yaml_map)

    def construct_yaml_map(self, node):
        data = OrderedDict()
        yield data
        value = self.construct_mapping(node)
        data.update(value)

    def construct_mapping(self, node, deep=False):
        if isinstance(node, yaml.MappingNode):
            self.flatten_mapping(node)
        else:
            raise yaml.constructor.ConstructorError(None, None,
                'expected a mapping node, but found %s' % node.id, node.start_mark)

        mapping = OrderedDict()
        for key_node, value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            try:
                hash(key)
            except TypeError, exc:
                raise yaml.constructor.ConstructorError('while constructing a mapping',
                    node.start_mark, 'found unacceptable key (%s)' % exc, key_node.start_mark)
            value = self.construct_object(value_node, deep=deep)
            mapping[key] = value
        return mapping

Note: there is a library, based on the following answer, which implements also the CLoader and CDumpers: Phynix/yamlloader

I doubt very much that this is the best way to do it, but this is the way I came up with, and it does work. Also available as a gist.

import yaml
import yaml.constructor

try:
    # included in standard lib from Python 2.7
    from collections import OrderedDict
except ImportError:
    # try importing the backported drop-in replacement
    # it's available on PyPI
    from ordereddict import OrderedDict

class OrderedDictYAMLLoader(yaml.Loader):
    """
    A YAML loader that loads mappings into ordered dictionaries.
    """

    def __init__(self, *args, **kwargs):
        yaml.Loader.__init__(self, *args, **kwargs)

        self.add_constructor(u'tag:yaml.org,2002:map', type(self).construct_yaml_map)
        self.add_constructor(u'tag:yaml.org,2002:omap', type(self).construct_yaml_map)

    def construct_yaml_map(self, node):
        data = OrderedDict()
        yield data
        value = self.construct_mapping(node)
        data.update(value)

    def construct_mapping(self, node, deep=False):
        if isinstance(node, yaml.MappingNode):
            self.flatten_mapping(node)
        else:
            raise yaml.constructor.ConstructorError(None, None,
                'expected a mapping node, but found %s' % node.id, node.start_mark)

        mapping = OrderedDict()
        for key_node, value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            try:
                hash(key)
            except TypeError, exc:
                raise yaml.constructor.ConstructorError('while constructing a mapping',
                    node.start_mark, 'found unacceptable key (%s)' % exc, key_node.start_mark)
            value = self.construct_object(value_node, deep=deep)
            mapping[key] = value
        return mapping

回答 5

更新:不赞成使用该库,而推荐使用yamlloader(它基于yamlordereddictloader)

我刚刚找到了一个Python库(https://pypi.python.org/pypi/yamlordereddictloader/0.1.1),该库是基于此问题的答案而创建的,使用起来非常简单:

import yaml
import yamlordereddictloader

datas = yaml.load(open('myfile.yml'), Loader=yamlordereddictloader.Loader)

Update: the library was deprecated in favor of the yamlloader (which is based on the yamlordereddictloader)

I’ve just found a Python library (https://pypi.python.org/pypi/yamlordereddictloader/0.1.1) which was created based on answers to this question and is quite simple to use:

import yaml
import yamlordereddictloader

datas = yaml.load(open('myfile.yml'), Loader=yamlordereddictloader.Loader)

回答 6

在针对Python 2.7的For PyYaml安装中,我更新了__init __。py,constructor.py和loader.py。现在支持用于加载命令的object_pairs_hook选项。我所做的更改差异如下。

__init__.py

$ diff __init__.py Original
64c64
< def load(stream, Loader=Loader, **kwds):
---
> def load(stream, Loader=Loader):
69c69
<     loader = Loader(stream, **kwds)
---
>     loader = Loader(stream)
75c75
< def load_all(stream, Loader=Loader, **kwds):
---
> def load_all(stream, Loader=Loader):
80c80
<     loader = Loader(stream, **kwds)
---
>     loader = Loader(stream)

constructor.py

$ diff constructor.py Original
20,21c20
<     def __init__(self, object_pairs_hook=dict):
<         self.object_pairs_hook = object_pairs_hook
---
>     def __init__(self):
27,29d25
<     def create_object_hook(self):
<         return self.object_pairs_hook()
<
54,55c50,51
<         self.constructed_objects = self.create_object_hook()
<         self.recursive_objects = self.create_object_hook()
---
>         self.constructed_objects = {}
>         self.recursive_objects = {}
129c125
<         mapping = self.create_object_hook()
---
>         mapping = {}
400c396
<         data = self.create_object_hook()
---
>         data = {}
595c591
<             dictitems = self.create_object_hook()
---
>             dictitems = {}
602c598
<             dictitems = value.get('dictitems', self.create_object_hook())
---
>             dictitems = value.get('dictitems', {})

loader.py

$ diff loader.py Original
13c13
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
18c18
<         BaseConstructor.__init__(self, **constructKwds)
---
>         BaseConstructor.__init__(self)
23c23
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
28c28
<         SafeConstructor.__init__(self, **constructKwds)
---
>         SafeConstructor.__init__(self)
33c33
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
38c38
<         Constructor.__init__(self, **constructKwds)
---
>         Constructor.__init__(self)

On my For PyYaml installation for Python 2.7 I updated __init__.py, constructor.py, and loader.py. Now supports object_pairs_hook option for load commands. Diff of changes I made is below.

__init__.py

$ diff __init__.py Original
64c64
< def load(stream, Loader=Loader, **kwds):
---
> def load(stream, Loader=Loader):
69c69
<     loader = Loader(stream, **kwds)
---
>     loader = Loader(stream)
75c75
< def load_all(stream, Loader=Loader, **kwds):
---
> def load_all(stream, Loader=Loader):
80c80
<     loader = Loader(stream, **kwds)
---
>     loader = Loader(stream)

constructor.py

$ diff constructor.py Original
20,21c20
<     def __init__(self, object_pairs_hook=dict):
<         self.object_pairs_hook = object_pairs_hook
---
>     def __init__(self):
27,29d25
<     def create_object_hook(self):
<         return self.object_pairs_hook()
<
54,55c50,51
<         self.constructed_objects = self.create_object_hook()
<         self.recursive_objects = self.create_object_hook()
---
>         self.constructed_objects = {}
>         self.recursive_objects = {}
129c125
<         mapping = self.create_object_hook()
---
>         mapping = {}
400c396
<         data = self.create_object_hook()
---
>         data = {}
595c591
<             dictitems = self.create_object_hook()
---
>             dictitems = {}
602c598
<             dictitems = value.get('dictitems', self.create_object_hook())
---
>             dictitems = value.get('dictitems', {})

loader.py

$ diff loader.py Original
13c13
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
18c18
<         BaseConstructor.__init__(self, **constructKwds)
---
>         BaseConstructor.__init__(self)
23c23
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
28c28
<         SafeConstructor.__init__(self, **constructKwds)
---
>         SafeConstructor.__init__(self)
33c33
<     def __init__(self, stream, **constructKwds):
---
>     def __init__(self, stream):
38c38
<         Constructor.__init__(self, **constructKwds)
---
>         Constructor.__init__(self)

回答 7

这是一个简单的解决方案,还可以检查地图中是否有重复的顶级键。

import yaml
import re
from collections import OrderedDict

def yaml_load_od(fname):
    "load a yaml file as an OrderedDict"
    # detects any duped keys (fail on this) and preserves order of top level keys
    with open(fname, 'r') as f:
        lines = open(fname, "r").read().splitlines()
        top_keys = []
        duped_keys = []
        for line in lines:
            m = re.search(r'^([A-Za-z0-9_]+) *:', line)
            if m:
                if m.group(1) in top_keys:
                    duped_keys.append(m.group(1))
                else:
                    top_keys.append(m.group(1))
        if duped_keys:
            raise Exception('ERROR: duplicate keys: {}'.format(duped_keys))
    # 2nd pass to set up the OrderedDict
    with open(fname, 'r') as f:
        d_tmp = yaml.load(f)
    return OrderedDict([(key, d_tmp[key]) for key in top_keys])

here’s a simple solution that also checks for duplicated top level keys in your map.

import yaml
import re
from collections import OrderedDict

def yaml_load_od(fname):
    "load a yaml file as an OrderedDict"
    # detects any duped keys (fail on this) and preserves order of top level keys
    with open(fname, 'r') as f:
        lines = open(fname, "r").read().splitlines()
        top_keys = []
        duped_keys = []
        for line in lines:
            m = re.search(r'^([A-Za-z0-9_]+) *:', line)
            if m:
                if m.group(1) in top_keys:
                    duped_keys.append(m.group(1))
                else:
                    top_keys.append(m.group(1))
        if duped_keys:
            raise Exception('ERROR: duplicate keys: {}'.format(duped_keys))
    # 2nd pass to set up the OrderedDict
    with open(fname, 'r') as f:
        d_tmp = yaml.load(f)
    return OrderedDict([(key, d_tmp[key]) for key in top_keys])

如何在Django模板的词典中遍历词典?

问题:如何在Django模板的词典中遍历词典?

我的字典看起来像这样(字典中的字典):

{'0': {
    'chosen_unit': <Unit: Kg>,
    'cost': Decimal('10.0000'),
    'unit__name_abbrev': u'G',
    'supplier__supplier': u"Steve's Meat Locker",
    'price': Decimal('5.00'),
    'supplier__address': u'No\r\naddress here',
    'chosen_unit_amount': u'2',
    'city__name': u'Joburg, Central',
    'supplier__phone_number': u'02299944444',
    'supplier__website': None,
    'supplier__price_list': u'',
    'supplier__email': u'ss.sss@ssssss.com',
    'unit__name': u'Gram',
    'name': u'Rump Bone',
}}

现在,我只是想在模板上显示信息,但是我很挣扎。我的模板代码如下:

{% if landing_dict.ingredients %}
  <hr>
  {% for ingredient in landing_dict.ingredients %}
    {{ ingredient }}
  {% endfor %}
  <a href="/">Print {{ landing_dict.recipe_name }}</a>
{% else %}
  Please search for an ingredient below
{% endif %}

它只是在模板上显示“ 0”?

我也尝试过:

{% for ingredient in landing_dict.ingredients %}
  {{ ingredient.cost }}
{% endfor %}

这甚至不显示结果。

我认为也许我需要更深一层地迭代,所以尝试了一下:

{% if landing_dict.ingredients %}
  <hr>
  {% for ingredient in landing_dict.ingredients %}
    {% for field in ingredient %}
      {{ field }}
    {% endfor %}
  {% endfor %}
  <a href="/">Print {{ landing_dict.recipe_name }}</a>
{% else %}
  Please search for an ingredient below
{% endif %}

但这不会显示任何内容。

我究竟做错了什么?

My dictionary looks like this(Dictionary within a dictionary):

{'0': {
    'chosen_unit': <Unit: Kg>,
    'cost': Decimal('10.0000'),
    'unit__name_abbrev': u'G',
    'supplier__supplier': u"Steve's Meat Locker",
    'price': Decimal('5.00'),
    'supplier__address': u'No\r\naddress here',
    'chosen_unit_amount': u'2',
    'city__name': u'Joburg, Central',
    'supplier__phone_number': u'02299944444',
    'supplier__website': None,
    'supplier__price_list': u'',
    'supplier__email': u'ss.sss@ssssss.com',
    'unit__name': u'Gram',
    'name': u'Rump Bone',
}}

Now I’m just trying to display the information on my template but I’m struggling. My code for the template looks like:

{% if landing_dict.ingredients %}
  <hr>
  {% for ingredient in landing_dict.ingredients %}
    {{ ingredient }}
  {% endfor %}
  <a href="/">Print {{ landing_dict.recipe_name }}</a>
{% else %}
  Please search for an ingredient below
{% endif %}

It just shows me ‘0’ on my template?

I also tried:

{% for ingredient in landing_dict.ingredients %}
  {{ ingredient.cost }}
{% endfor %}

This doesn’t even display a result.

I thought perhaps I need to iterate one level deeper so tried this:

{% if landing_dict.ingredients %}
  <hr>
  {% for ingredient in landing_dict.ingredients %}
    {% for field in ingredient %}
      {{ field }}
    {% endfor %}
  {% endfor %}
  <a href="/">Print {{ landing_dict.recipe_name }}</a>
{% else %}
  Please search for an ingredient below
{% endif %}

But this doesn’t display anything.

What am I doing wrong?


回答 0

可以说您的数据是-

data = {'a': [ [1, 2] ], 'b': [ [3, 4] ],'c':[ [5,6]] }

您可以使用该data.items()方法来获取字典元素。请注意,在Django模板中,我们不放置()。另外提到的一些用户values[0]不起作用,如果是这种情况,请尝试values.items

<table>
    <tr>
        <td>a</td>
        <td>b</td>
        <td>c</td>
    </tr>

    {% for key, values in data.items %}
    <tr>
        <td>{{key}}</td>
        {% for v in values[0] %}
        <td>{{v}}</td>
        {% endfor %}
    </tr>
    {% endfor %}
</table>

相当确定您可以将此逻辑扩展到您的特定字典。


要以排序的顺序遍历dict键 -首先,我们在python中排序,然后在django模板中进行迭代和渲染。

return render_to_response('some_page.html', {'data': sorted(data.items())})

在模板文件中:

{% for key, value in data %}
    <tr>
        <td> Key: {{ key }} </td> 
        <td> Value: {{ value }} </td>
    </tr>
{% endfor %}

Lets say your data is –

data = {'a': [ [1, 2] ], 'b': [ [3, 4] ],'c':[ [5,6]] }

You can use the data.items() method to get the dictionary elements. Note, in django templates we do NOT put (). Also some users mentioned values[0] does not work, if that is the case then try values.items.

<table>
    <tr>
        <td>a</td>
        <td>b</td>
        <td>c</td>
    </tr>

    {% for key, values in data.items %}
    <tr>
        <td>{{key}}</td>
        {% for v in values[0] %}
        <td>{{v}}</td>
        {% endfor %}
    </tr>
    {% endfor %}
</table>

Am pretty sure you can extend this logic to your specific dict.


To iterate over dict keys in a sorted order – First we sort in python then iterate & render in django template.

return render_to_response('some_page.html', {'data': sorted(data.items())})

In template file:

{% for key, value in data %}
    <tr>
        <td> Key: {{ key }} </td> 
        <td> Value: {{ value }} </td>
    </tr>
{% endfor %}

回答 1

这个答案对我没有用,但是我自己找到了答案。但是,没有人发布我的问题。我懒得先问再回答,所以就把它放在这里。

这用于以下查询:

data = Leaderboard.objects.filter(id=custom_user.id).values(
    'value1',
    'value2',
    'value3')

在模板中:

{% for dictionary in data %}
  {% for key, value in dictionary.items %}
    <p>{{ key }} : {{ value }}</p>
  {% endfor %}
{% endfor %}

This answer didn’t work for me, but I found the answer myself. No one, however, has posted my question. I’m too lazy to ask it and then answer it, so will just put it here.

This is for the following query:

data = Leaderboard.objects.filter(id=custom_user.id).values(
    'value1',
    'value2',
    'value3')

In template:

{% for dictionary in data %}
  {% for key, value in dictionary.items %}
    <p>{{ key }} : {{ value }}</p>
  {% endfor %}
{% endfor %}

回答 2

如果将变量data(字典类型)作为上下文传递给模板,则代码应为:

{% for key, value in data.items %}
    <p>{{ key }} : {{ value }}</p> 
{% endfor %}

If you pass a variable data (dictionary type) as context to a template, then you code should be:

{% for key, value in data.items %}
    <p>{{ key }} : {{ value }}</p> 
{% endfor %}