计算两个Python字典中包含的键的差异

问题:计算两个Python字典中包含的键的差异

假设我有两个Python字典- dictAdictB。我需要找出是否有任何键存在于中,dictB但没有dictA。最快的方法是什么?

我应该将字典键转换为集合然后继续吗?

有兴趣了解您的想法…


感谢您的回复。

很抱歉未能正确说明我的问题。我的情况是这样的-我有一个dictA与可能相同的dictB密钥,或者可能缺少一些密钥,dictB否则某些密钥的值可能会有所不同,必须将其设置为dictA密钥的值。

问题在于字典没有标准,并且可以具有可以作为dict的值。

dictA={'key1':a, 'key2':b, 'key3':{'key11':cc, 'key12':dd}, 'key4':{'key111':{....}}}
dictB={'key1':a, 'key2:':newb, 'key3':{'key11':cc, 'key12':newdd, 'key13':ee}.......

因此,必须将“ key2”值重置为新值,并在字典内部添加“ key13”。键值没有固定的格式。它可以是一个简单的值或dict或dict的dict。

Suppose I have two Python dictionaries – dictA and dictB. I need to find out if there are any keys which are present in dictB but not in dictA. What is the fastest way to go about it?

Should I convert the dictionary keys into a set and then go about?

Interested in knowing your thoughts…


Thanks for your responses.

Apologies for not stating my question properly. My scenario is like this – I have a dictA which can be the same as dictB or may have some keys missing as compared to dictB or else the value of some keys might be different which has to be set to that of dictA key’s value.

Problem is the dictionary has no standard and can have values which can be dict of dict.

Say

dictA={'key1':a, 'key2':b, 'key3':{'key11':cc, 'key12':dd}, 'key4':{'key111':{....}}}
dictB={'key1':a, 'key2:':newb, 'key3':{'key11':cc, 'key12':newdd, 'key13':ee}.......

So ‘key2’ value has to be reset to the new value and ‘key13’ has to be added inside the dict. The key value does not have a fixed format. It can be a simple value or a dict or a dict of dict.


回答 0

您可以在按键上使用设置操作:

diff = set(dictb.keys()) - set(dicta.keys())

这是一个查找所有可能性的类:添加了什么,删除了什么,哪些键值对相同​​以及哪些键值对已更改。

class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)
    def added(self):
        return self.set_current - self.intersect 
    def removed(self):
        return self.set_past - self.intersect 
    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])
    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])

这是一些示例输出:

>>> a = {'a': 1, 'b': 1, 'c': 0}
>>> b = {'a': 1, 'b': 2, 'd': 0}
>>> d = DictDiffer(b, a)
>>> print "Added:", d.added()
Added: set(['d'])
>>> print "Removed:", d.removed()
Removed: set(['c'])
>>> print "Changed:", d.changed()
Changed: set(['b'])
>>> print "Unchanged:", d.unchanged()
Unchanged: set(['a'])

可以作为github存储库使用:https : //github.com/hughdbrown/dictdiffer

You can use set operations on the keys:

diff = set(dictb.keys()) - set(dicta.keys())

Here is a class to find all the possibilities: what was added, what was removed, which key-value pairs are the same, and which key-value pairs are changed.

class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)
    def added(self):
        return self.set_current - self.intersect 
    def removed(self):
        return self.set_past - self.intersect 
    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])
    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])

Here is some sample output:

>>> a = {'a': 1, 'b': 1, 'c': 0}
>>> b = {'a': 1, 'b': 2, 'd': 0}
>>> d = DictDiffer(b, a)
>>> print "Added:", d.added()
Added: set(['d'])
>>> print "Removed:", d.removed()
Removed: set(['c'])
>>> print "Changed:", d.changed()
Changed: set(['b'])
>>> print "Unchanged:", d.unchanged()
Unchanged: set(['a'])

Available as a github repo: https://github.com/hughdbrown/dictdiffer


回答 1

如果您需要递归的区别,我已经为python编写了一个软件包:https : //github.com/seperman/deepdiff

安装

从PyPi安装:

pip install deepdiff

用法示例

输入

>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2

同一对象返回空

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}

项目类型已更改

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}

项目的价值已更改

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

添加和/或删除项目

>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

弦差异

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}

弦差异2

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}

>>> 
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
--- 
+++ 
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End

类型变更

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}

清单差异

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}

清单差异2:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}

列出差异忽略顺序或重复项:(具有与上述相同的字典)

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}

包含字典的列表:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}

套装:

>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}

命名元组:

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}

自定义对象:

>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
... 
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>> 
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

添加对象属性:

>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

In case you want the difference recursively, I have written a package for python: https://github.com/seperman/deepdiff

Installation

Install from PyPi:

pip install deepdiff

Example usage

Importing

>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2

Same object returns empty

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}

Type of an item has changed

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}

Value of an item has changed

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

Item added and/or removed

>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

String difference

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}

String difference 2

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}

>>> 
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
--- 
+++ 
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End

Type change

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}

List difference

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}

List difference 2:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}

List difference ignoring order or duplicates: (with the same dictionaries as above)

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}

List that contains dictionary:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}

Sets:

>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}

Named Tuples:

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}

Custom objects:

>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
... 
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>> 
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

Object attribute added:

>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

回答 2

不知道它是否“快速”,但是通常情况下,可以做到这一点

dicta = {"a":1,"b":2,"c":3,"d":4}
dictb = {"a":1,"d":2}
for key in dicta.keys():
    if not key in dictb:
        print key

not sure whether its “fast” or not, but normally, one can do this

dicta = {"a":1,"b":2,"c":3,"d":4}
dictb = {"a":1,"d":2}
for key in dicta.keys():
    if not key in dictb:
        print key

回答 3

就像Alex Martelli所写的那样,如果您只想检查B中的任何键是否不在A中,那any(True for k in dictB if k not in dictA)将是您的最佳选择。

要查找缺少的密钥:

diff = set(dictB)-set(dictA) #sets

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA =    
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=set(dictB)-set(dictA)"
10000 loops, best of 3: 107 usec per loop

diff = [ k for k in dictB if k not in dictA ] #lc

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = 
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=[ k for k in dictB if
k not in dictA ]"
10000 loops, best of 3: 95.9 usec per loop

因此,这两种解决方案的速度几乎相同。

As Alex Martelli wrote, if you simply want to check if any key in B is not in A, any(True for k in dictB if k not in dictA) would be the way to go.

To find the keys that are missing:

diff = set(dictB)-set(dictA) #sets

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA =    
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=set(dictB)-set(dictA)"
10000 loops, best of 3: 107 usec per loop

diff = [ k for k in dictB if k not in dictA ] #lc

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = 
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=[ k for k in dictB if
k not in dictA ]"
10000 loops, best of 3: 95.9 usec per loop

So those two solutions are pretty much the same speed.


回答 4

如果您确实要说的是真的(您只需要找出B中而不是A中“有任何键”的情况,那么ONES可能没有),最快的方法应该是:

if any(True for k in dictB if k not in dictA): ...

如果您实际上需要找出哪个键(如果有)在B中而不是在A中,而不仅仅是“ IF”,那么有这样的键,那么现有的答案就很合适了(但是我建议在以后的问题中更精确一些,如果那是确实是您的意思;-)。

If you really mean exactly what you say (that you only need to find out IF “there are any keys” in B and not in A, not WHICH ONES might those be if any), the fastest way should be:

if any(True for k in dictB if k not in dictA): ...

If you actually need to find out WHICH KEYS, if any, are in B and not in A, and not just “IF” there are such keys, then existing answers are quite appropriate (but I do suggest more precision in future questions if that’s indeed what you mean;-).


回答 5

用途set()

set(dictA.keys()).intersection(dictB.keys())

Use set():

set(dictA.keys()).intersection(dictB.keys())

回答 6

hughdbrown的最高答案是建议使用集差异,这绝对是最好的方法:

diff = set(dictb.keys()) - set(dicta.keys())

这段代码的问题在于,它仅创建两个列表就创建了两个列表,因此浪费了4N的时间和2N的空间。它也比需要的要复杂一些。

通常,这没什么大不了的,但是如果是这样的话:

diff = dictb.keys() - dicta

Python 2

在Python 2中,keys()返回键列表,而不是KeysView。因此,您必须viewkeys()直接提出要求。

diff = dictb.viewkeys() - dicta

对于双版本2.7 / 3.x代码,希望使用six或类似的代码,因此可以使用six.viewkeys(dictb)

diff = six.viewkeys(dictb) - dicta

在2.4-2.6中,没有KeysView。但是,您可以直接从迭代器中构建左集合,而不是先构建列表,至少可以将成本从4N削减到N:

diff = set(dictb) - dicta

物品

我有一个dictA可以与dictB相同,或者与dictB相比可能缺少一些键,否则某些键的值可能不同

因此,您实际上不需要比较键,而是需要比较项。ItemsViewSet当值是可哈希值(例如字符串)时,an 才是a 。如果是这样,这很容易:

diff = dictb.items() - dicta.items()

递归差异

尽管问题不是直接要求递归差异,但某些示例值是dict,并且看来预期的输出确实递归地对它们进行差异。这里已经有多个答案显示了如何执行此操作。

The top answer by hughdbrown suggests using set difference, which is definitely the best approach:

diff = set(dictb.keys()) - set(dicta.keys())

The problem with this code is that it builds two lists just to create two sets, so it’s wasting 4N time and 2N space. It’s also a bit more complicated than it needs to be.

Usually, this is not a big deal, but if it is:

diff = dictb.keys() - dicta

Python 2

In Python 2, keys() returns a list of the keys, not a KeysView. So you have to ask for viewkeys() directly.

diff = dictb.viewkeys() - dicta

For dual-version 2.7/3.x code, you’re hopefully using six or something similar, so you can use six.viewkeys(dictb):

diff = six.viewkeys(dictb) - dicta

In 2.4-2.6, there is no KeysView. But you can at least cut the cost from 4N to N by building your left set directly out of an iterator, instead of building a list first:

diff = set(dictb) - dicta

Items

I have a dictA which can be the same as dictB or may have some keys missing as compared to dictB or else the value of some keys might be different

So you really don’t need to compare the keys, but the items. An ItemsView is only a Set if the values are hashable, like strings. If they are, it’s easy:

diff = dictb.items() - dicta.items()

Recursive diff

Although the question isn’t directly asking for a recursive diff, some of the example values are dicts, and it appears the expected output does recursively diff them. There are already multiple answers here showing how to do that.


回答 7

关于此参数,stackoverflow中还有另一个问题,我不得不承认有一个简单的解决方案:python 的datadiff库有助于打印两个字典之间的差异。

There is an other question in stackoverflow about this argument and i have to admit that there is a simple solution explained: the datadiff library of python helps printing the difference between two dictionaries.


回答 8

这是一种可行的方法,允许将键的值计算为False,并且在可能的情况下仍使用生成器表达式尽早退出。虽然不是特别漂亮。

any(map(lambda x: True, (k for k in b if k not in a)))

编辑:

THC4k发表了对我对另一个答案的评论的回复。这是一种更好,更漂亮的方法来执行上述操作:

any(True for k in b if k not in a)

不知道那怎么没想到…

Here’s a way that will work, allows for keys that evaluate to False, and still uses a generator expression to fall out early if possible. It’s not exceptionally pretty though.

any(map(lambda x: True, (k for k in b if k not in a)))

EDIT:

THC4k posted a reply to my comment on another answer. Here’s a better, prettier way to do the above:

any(True for k in b if k not in a)

Not sure how that never crossed my mind…


回答 9

这是一个古老的问题,要求的问题比我需要的要少,因此,此答案实际上比该问题所要求的要多。这个问题的答案帮助我解决了以下问题:

  1. (要求)记录两个词典之间的差异
  2. 将#1的差异合并到基础词典中
  3. (要求)合并两个字典之间的差异(将第2个字典视为差异字典)
  4. 尝试检测物品的移动和变化
  5. (要求)递归执行所有这些操作

所有这些与JSON相结合,提供了非常强大的配置存储支持。

解决方案(也在github上):

from collections import OrderedDict
from pprint import pprint


class izipDestinationMatching(object):
    __slots__ = ("attr", "value", "index")

    def __init__(self, attr, value, index):
        self.attr, self.value, self.index = attr, value, index

    def __repr__(self):
        return "izip_destination_matching: found match by '%s' = '%s' @ %d" % (self.attr, self.value, self.index)


def izip_destination(a, b, attrs, addMarker=True):
    """
    Returns zipped lists, but final size is equal to b with (if shorter) a padded with nulls
    Additionally also tries to find item reallocations by searching child dicts (if they are dicts) for attribute, listed in attrs)
    When addMarker == False (patching), final size will be the longer of a, b
    """
    for idx, item in enumerate(b):
        try:
            attr = next((x for x in attrs if x in item), None)  # See if the item has any of the ID attributes
            match, matchIdx = next(((orgItm, idx) for idx, orgItm in enumerate(a) if attr in orgItm and orgItm[attr] == item[attr]), (None, None)) if attr else (None, None)
            if match and matchIdx != idx and addMarker: item[izipDestinationMatching] = izipDestinationMatching(attr, item[attr], matchIdx)
        except:
            match = None
        yield (match if match else a[idx] if len(a) > idx else None), item
    if not addMarker and len(a) > len(b):
        for item in a[len(b) - len(a):]:
            yield item, item


def dictdiff(a, b, searchAttrs=[]):
    """
    returns a dictionary which represents difference from a to b
    the return dict is as short as possible:
      equal items are removed
      added / changed items are listed
      removed items are listed with value=None
    Also processes list values where the resulting list size will match that of b.
    It can also search said list items (that are dicts) for identity values to detect changed positions.
      In case such identity value is found, it is kept so that it can be re-found during the merge phase
    @param a: original dict
    @param b: new dict
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictdiff(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs)]
        return b
    res = OrderedDict()
    if izipDestinationMatching in b:
        keepKey = b[izipDestinationMatching].attr
        del b[izipDestinationMatching]
    else:
        keepKey = izipDestinationMatching
    for key in sorted(set(a.keys() + b.keys())):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        if keepKey == key or v1 != v2: res[key] = dictdiff(v1, v2, searchAttrs)
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res


def dictmerge(a, b, searchAttrs=[]):
    """
    Returns a dictionary which merges differences recorded in b to base dictionary a
    Also processes list values where the resulting list size will match that of a
    It can also search said list items (that are dicts) for identity values to detect changed positions
    @param a: original dict
    @param b: diff dict to patch into a
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictmerge(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs, False)]
        return b
    res = OrderedDict()
    for key in sorted(set(a.keys() + b.keys())):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        #print "processing", key, v1, v2, key not in b, dictmerge(v1, v2)
        if v2 is not None: res[key] = dictmerge(v1, v2, searchAttrs)
        elif key not in b: res[key] = v1
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res

This is an old question and asks a little bit less than what I needed so this answer actually solves more than this question asks. The answers in this question helped me solve the following:

  1. (asked) Record differences between two dictionaries
  2. Merge differences from #1 into base dictionary
  3. (asked) Merge differences between two dictionaries (treat dictionary #2 as if it were a diff dictionary)
  4. Try to detect item movements as well as changes
  5. (asked) Do all of this recursively

All this combined with JSON makes for a pretty powerful configuration storage support.

The solution (also on github):

from collections import OrderedDict
from pprint import pprint


class izipDestinationMatching(object):
    __slots__ = ("attr", "value", "index")

    def __init__(self, attr, value, index):
        self.attr, self.value, self.index = attr, value, index

    def __repr__(self):
        return "izip_destination_matching: found match by '%s' = '%s' @ %d" % (self.attr, self.value, self.index)


def izip_destination(a, b, attrs, addMarker=True):
    """
    Returns zipped lists, but final size is equal to b with (if shorter) a padded with nulls
    Additionally also tries to find item reallocations by searching child dicts (if they are dicts) for attribute, listed in attrs)
    When addMarker == False (patching), final size will be the longer of a, b
    """
    for idx, item in enumerate(b):
        try:
            attr = next((x for x in attrs if x in item), None)  # See if the item has any of the ID attributes
            match, matchIdx = next(((orgItm, idx) for idx, orgItm in enumerate(a) if attr in orgItm and orgItm[attr] == item[attr]), (None, None)) if attr else (None, None)
            if match and matchIdx != idx and addMarker: item[izipDestinationMatching] = izipDestinationMatching(attr, item[attr], matchIdx)
        except:
            match = None
        yield (match if match else a[idx] if len(a) > idx else None), item
    if not addMarker and len(a) > len(b):
        for item in a[len(b) - len(a):]:
            yield item, item


def dictdiff(a, b, searchAttrs=[]):
    """
    returns a dictionary which represents difference from a to b
    the return dict is as short as possible:
      equal items are removed
      added / changed items are listed
      removed items are listed with value=None
    Also processes list values where the resulting list size will match that of b.
    It can also search said list items (that are dicts) for identity values to detect changed positions.
      In case such identity value is found, it is kept so that it can be re-found during the merge phase
    @param a: original dict
    @param b: new dict
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictdiff(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs)]
        return b
    res = OrderedDict()
    if izipDestinationMatching in b:
        keepKey = b[izipDestinationMatching].attr
        del b[izipDestinationMatching]
    else:
        keepKey = izipDestinationMatching
    for key in sorted(set(a.keys() + b.keys())):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        if keepKey == key or v1 != v2: res[key] = dictdiff(v1, v2, searchAttrs)
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res


def dictmerge(a, b, searchAttrs=[]):
    """
    Returns a dictionary which merges differences recorded in b to base dictionary a
    Also processes list values where the resulting list size will match that of a
    It can also search said list items (that are dicts) for identity values to detect changed positions
    @param a: original dict
    @param b: diff dict to patch into a
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictmerge(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs, False)]
        return b
    res = OrderedDict()
    for key in sorted(set(a.keys() + b.keys())):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        #print "processing", key, v1, v2, key not in b, dictmerge(v1, v2)
        if v2 is not None: res[key] = dictmerge(v1, v2, searchAttrs)
        elif key not in b: res[key] = v1
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res

回答 10

怎么样标准(比较完整对象)

PyDev->新的PyDev模块->模块:单元测试

import unittest


class Test(unittest.TestCase):


    def testName(self):
        obj1 = {1:1, 2:2}
        obj2 = {1:1, 2:2}
        self.maxDiff = None # sometimes is usefull
        self.assertDictEqual(d1, d2)

if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']

    unittest.main()

what about standart (compare FULL Object)

PyDev->new PyDev Module->Module: unittest

import unittest


class Test(unittest.TestCase):


    def testName(self):
        obj1 = {1:1, 2:2}
        obj2 = {1:1, 2:2}
        self.maxDiff = None # sometimes is usefull
        self.assertDictEqual(d1, d2)

if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']

    unittest.main()

回答 11

如果在Python≥2.7上:

# update different values in dictB
# I would assume only dictA should be updated,
# but the question specifies otherwise

for k in dictA.viewkeys() & dictB.viewkeys():
    if dictA[k] != dictB[k]:
        dictB[k]= dictA[k]

# add missing keys to dictA

dictA.update( (k,dictB[k]) for k in dictB.viewkeys() - dictA.viewkeys() )

If on Python ≥ 2.7:

# update different values in dictB
# I would assume only dictA should be updated,
# but the question specifies otherwise

for k in dictA.viewkeys() & dictB.viewkeys():
    if dictA[k] != dictB[k]:
        dictB[k]= dictA[k]

# add missing keys to dictA

dictA.update( (k,dictB[k]) for k in dictB.viewkeys() - dictA.viewkeys() )

回答 12

这是深度比较两个字典键的解决方案:

def compareDictKeys(dict1, dict2):
  if type(dict1) != dict or type(dict2) != dict:
      return False

  keys1, keys2 = dict1.keys(), dict2.keys()
  diff = set(keys1) - set(keys2) or set(keys2) - set(keys1)

  if not diff:
      for key in keys1:
          if (type(dict1[key]) == dict or type(dict2[key]) == dict) and not compareDictKeys(dict1[key], dict2[key]):
              diff = True
              break

  return not diff

Here is a solution for deep comparing 2 dictionaries keys:

def compareDictKeys(dict1, dict2):
  if type(dict1) != dict or type(dict2) != dict:
      return False

  keys1, keys2 = dict1.keys(), dict2.keys()
  diff = set(keys1) - set(keys2) or set(keys2) - set(keys1)

  if not diff:
      for key in keys1:
          if (type(dict1[key]) == dict or type(dict2[key]) == dict) and not compareDictKeys(dict1[key], dict2[key]):
              diff = True
              break

  return not diff

回答 13

这是一个可以比较两个以上命令的解决方案:

def diff_dict(dicts, default=None):
    diff_dict = {}
    # add 'list()' around 'd.keys()' for python 3 compatibility
    for k in set(sum([d.keys() for d in dicts], [])):
        # we can just use "values = [d.get(k, default) ..." below if 
        # we don't care that d1[k]=default and d2[k]=missing will
        # be treated as equal
        if any(k not in d for d in dicts):
            diff_dict[k] = [d.get(k, default) for d in dicts]
        else:
            values = [d[k] for d in dicts]
            if any(v != values[0] for v in values):
                diff_dict[k] = values
    return diff_dict

用法示例:

import matplotlib.pyplot as plt
diff_dict([plt.rcParams, plt.rcParamsDefault, plt.matplotlib.rcParamsOrig])

here’s a solution that can compare more than two dicts:

def diff_dict(dicts, default=None):
    diff_dict = {}
    # add 'list()' around 'd.keys()' for python 3 compatibility
    for k in set(sum([d.keys() for d in dicts], [])):
        # we can just use "values = [d.get(k, default) ..." below if 
        # we don't care that d1[k]=default and d2[k]=missing will
        # be treated as equal
        if any(k not in d for d in dicts):
            diff_dict[k] = [d.get(k, default) for d in dicts]
        else:
            values = [d[k] for d in dicts]
            if any(v != values[0] for v in values):
                diff_dict[k] = values
    return diff_dict

usage example:

import matplotlib.pyplot as plt
diff_dict([plt.rcParams, plt.rcParamsDefault, plt.matplotlib.rcParamsOrig])

回答 14

我的两个字典之间的对称差异的配方:

def find_dict_diffs(dict1, dict2):
    unequal_keys = []
    unequal_keys.extend(set(dict1.keys()).symmetric_difference(set(dict2.keys())))
    for k in dict1.keys():
        if dict1.get(k, 'N\A') != dict2.get(k, 'N\A'):
            unequal_keys.append(k)
    if unequal_keys:
        print 'param', 'dict1\t', 'dict2'
        for k in set(unequal_keys):
            print str(k)+'\t'+dict1.get(k, 'N\A')+'\t '+dict2.get(k, 'N\A')
    else:
        print 'Dicts are equal'

dict1 = {1:'a', 2:'b', 3:'c', 4:'d', 5:'e'}
dict2 = {1:'b', 2:'a', 3:'c', 4:'d', 6:'f'}

find_dict_diffs(dict1, dict2)

结果是:

param   dict1   dict2
1       a       b
2       b       a
5       e       N\A
6       N\A     f

My recipe of symmetric difference between two dictionaries:

def find_dict_diffs(dict1, dict2):
    unequal_keys = []
    unequal_keys.extend(set(dict1.keys()).symmetric_difference(set(dict2.keys())))
    for k in dict1.keys():
        if dict1.get(k, 'N\A') != dict2.get(k, 'N\A'):
            unequal_keys.append(k)
    if unequal_keys:
        print 'param', 'dict1\t', 'dict2'
        for k in set(unequal_keys):
            print str(k)+'\t'+dict1.get(k, 'N\A')+'\t '+dict2.get(k, 'N\A')
    else:
        print 'Dicts are equal'

dict1 = {1:'a', 2:'b', 3:'c', 4:'d', 5:'e'}
dict2 = {1:'b', 2:'a', 3:'c', 4:'d', 6:'f'}

find_dict_diffs(dict1, dict2)

And result is:

param   dict1   dict2
1       a       b
2       b       a
5       e       N\A
6       N\A     f

回答 15

正如其他答案中提到的那样,unittest可以生成一些不错的输出来比较dict,但是在此示例中,我们不需要先构建整个测试。

废弃unittest源代码,看起来您可以通过以下方式获得公平的解决方案:

import difflib
import pprint

def diff_dicts(a, b):
    if a == b:
        return ''
    return '\n'.join(
        difflib.ndiff(pprint.pformat(a, width=30).splitlines(),
                      pprint.pformat(b, width=30).splitlines())
    )

所以

dictA = dict(zip(range(7), map(ord, 'python')))
dictB = {0: 112, 1: 'spam', 2: [1,2,3], 3: 104, 4: 111}
print diff_dicts(dictA, dictB)

结果是:

{0: 112,
-  1: 121,
-  2: 116,
+  1: 'spam',
+  2: [1, 2, 3],
   3: 104,
-  4: 111,
?        ^

+  4: 111}
?        ^

-  5: 110}

哪里:

  • ‘-‘表示第一/第二个字典中的键/值
  • “ +”表示第二个而不是第一个字典中的键/值

像在单元测试中一样,唯一的警告是由于尾随逗号/括号,最终映射可以被认为是差异。

As mentioned in other answers, unittest produces some nice output for comparing dicts, but in this example we don’t want to have to build a whole test first.

Scraping the unittest source, it looks like you can get a fair solution with just this:

import difflib
import pprint

def diff_dicts(a, b):
    if a == b:
        return ''
    return '\n'.join(
        difflib.ndiff(pprint.pformat(a, width=30).splitlines(),
                      pprint.pformat(b, width=30).splitlines())
    )

so

dictA = dict(zip(range(7), map(ord, 'python')))
dictB = {0: 112, 1: 'spam', 2: [1,2,3], 3: 104, 4: 111}
print diff_dicts(dictA, dictB)

Results in:

{0: 112,
-  1: 121,
-  2: 116,
+  1: 'spam',
+  2: [1, 2, 3],
   3: 104,
-  4: 111,
?        ^

+  4: 111}
?        ^

-  5: 110}

Where:

  • ‘-‘ indicates key/values in the first but not second dict
  • ‘+’ indicates key/values in the second but not the first dict

Like in unittest, the only caveat is that the final mapping can be thought to be a diff, due to the trailing comma/bracket.


回答 16

@Maxx有一个很好的答案,请使用unittestPython提供的工具:

import unittest


class Test(unittest.TestCase):
    def runTest(self):
        pass

    def testDict(self, d1, d2, maxDiff=None):
        self.maxDiff = maxDiff
        self.assertDictEqual(d1, d2)

然后,您可以在代码中的任何位置调用:

try:
    Test().testDict(dict1, dict2)
except Exception, e:
    print e

结果输出看起来像来自的输出diff,用不同的行+或在-每行之前添加漂亮的字典。

@Maxx has an excellent answer, use the unittest tools provided by Python:

import unittest


class Test(unittest.TestCase):
    def runTest(self):
        pass

    def testDict(self, d1, d2, maxDiff=None):
        self.maxDiff = maxDiff
        self.assertDictEqual(d1, d2)

Then, anywhere in your code you can call:

try:
    Test().testDict(dict1, dict2)
except Exception, e:
    print e

The resulting output looks like the output from diff, pretty-printing the dictionaries with + or - prepending each line that is different.


回答 17

不知道它是否仍然有用,但是我遇到了这个问题,我的情况是我只需要返回所有嵌套字典等的变化的字典。找不到合适的解决方案,但是我最终写了一个简单的函数要做到这一点。希望这可以帮助,

Not sure if it is still relevant but I came across this problem, my situation i just needed to return a dictionary of the changes for all nested dictionaries etc etc. Could not find a good solution out there but I did end up writing a simple function to do this. Hope this helps,


回答 18

如果您想使用内置解决方案与任意dict结构进行全面比较,@ Maxx的答案就是一个好的开始。

import unittest

test = unittest.TestCase()
test.assertEqual(dictA, dictB)

If you want a built-in solution for a full comparison with arbitrary dict structures, @Maxx’s answer is a good start.

import unittest

test = unittest.TestCase()
test.assertEqual(dictA, dictB)

回答 19

根据ghostdog74的回答,

dicta = {"a":1,"d":2}
dictb = {"a":5,"d":2}

for value in dicta.values():
    if not value in dictb.values():
        print value

将打印不同的dicta值

Based on ghostdog74’s answer,

dicta = {"a":1,"d":2}
dictb = {"a":5,"d":2}

for value in dicta.values():
    if not value in dictb.values():
        print value

will print differ value of dicta


回答 20

尝试此操作以找到de交集,即两个字典中的键,如果要在第二个字典中找不到键,只需使用not in

intersect = filter(lambda x, dictB=dictB.keys(): x in dictB, dictA.keys())

Try this to find de intersection, the keys that is in both dictionarie, if you want the keys not found on second dictionarie, just use the not in

intersect = filter(lambda x, dictB=dictB.keys(): x in dictB, dictA.keys())