Question: Calculate the difference in the keys contained in two Python dictionaries
Suppose I have two Python dictionaries, dictA and dictB. I need to find out if there are any keys which are present in dictB but not in dictA. What is the fastest way to go about it?
Should I convert the dictionary keys into a set and then go about it?
Interested in knowing your thoughts…
Thanks for your responses.
Apologies for not stating my question properly.
My scenario is like this: I have a dictA which can be the same as dictB, or may have some keys missing as compared to dictB, or else the value of some keys might be different, which has to be set to that of dictA key's value.
The problem is that the dictionary has no standard structure and can have values which are themselves dicts, or dicts of dicts.
Say
dictA={'key1':a, 'key2':b, 'key3':{'key11':cc, 'key12':dd}, 'key4':{'key111':{....}}}
dictB={'key1':a, 'key2':newb, 'key3':{'key11':cc, 'key12':newdd, 'key13':ee}.......
So the 'key2' value has to be reset to the new value and 'key13' has to be added inside the dict.
The key value does not have a fixed format. It can be a simple value, a dict, or a dict of dicts.
Answer 0
You can use set operations on the keys:
diff = set(dictb.keys()) - set(dicta.keys())
Here is a class to find all the possibilities: what was added, what was removed, which key-value pairs are the same, and which key-value pairs are changed.
class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)

    def added(self):
        return self.set_current - self.intersect

    def removed(self):
        return self.set_past - self.intersect

    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])

    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])
Here is some sample output (shown for Python 3):
>>> a = {'a': 1, 'b': 1, 'c': 0}
>>> b = {'a': 1, 'b': 2, 'd': 0}
>>> d = DictDiffer(b, a)
>>> print("Added:", d.added())
Added: {'d'}
>>> print("Removed:", d.removed())
Removed: {'c'}
>>> print("Changed:", d.changed())
Changed: {'b'}
>>> print("Unchanged:", d.unchanged())
Unchanged: {'a'}
Available as a github repo:
https://github.com/hughdbrown/dictdiffer
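On Python 3 the same four categories can be computed without the intermediate set(...) calls, because dict.keys() returns a set-like view. The following is only a minimal sketch for a top-level (non-recursive) comparison; the function name diff_keys is hypothetical and is not part of the dictdiffer repository.

```python
def diff_keys(current, past):
    """Classify the keys of two dicts as added, removed, changed or unchanged."""
    shared = current.keys() & past.keys()          # keys present in both dicts
    return {
        "added": current.keys() - past.keys(),     # only in current
        "removed": past.keys() - current.keys(),   # only in past
        "changed": {k for k in shared if current[k] != past[k]},
        "unchanged": {k for k in shared if current[k] == past[k]},
    }


a = {'a': 1, 'b': 1, 'c': 0}
b = {'a': 1, 'b': 2, 'd': 0}
result = diff_keys(b, a)
for name, keys in result.items():
    print(name, keys)
```

Returning plain sets in a dict keeps the result easy to inspect or serialize; a class is only needed if the comparison has to be reused or extended.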
Answer 1
In case you want the difference recursively, I have written a package for Python:
https://github.com/seperman/deepdiff
Installation
Install from PyPI:
pip install deepdiff
Example usage
Importing
>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2
Same object returns empty
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}
Type of an item has changed
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}
Value of an item has changed
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
Item added and/or removed
>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
String difference
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}
String difference 2
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}
>>>
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
---
+++
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End
Type change
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}
List difference
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}
List difference 2:
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}
List difference ignoring order or duplicates: (with the same dictionaries as above)
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}
List that contains dictionary:
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}
Sets:
>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}
Named Tuples:
>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}
Custom objects:
>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
...
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>>
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
Object attribute added:
>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
Answer 2
Not sure whether it's "fast" or not, but normally one can do this:
dicta = {"a":1,"b":2,"c":3,"d":4}
dictb = {"a":1,"d":2}
for key in dicta.keys():
    if key not in dictb:
        print(key)
Answer 3
As Alex Martelli wrote, if you simply want to check if any key in B is not in A, any(True for k in dictB if k not in dictA) would be the way to go.
To find the keys that are missing:
diff = set(dictB)-set(dictA) #sets
C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = dict(zip(range(1000),range(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=set(dictB)-set(dictA)"
10000 loops, best of 3: 107 usec per loop
diff = [ k for k in dictB if k not in dictA ] #lc
C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = dict(zip(range(1000),range(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=[ k for k in dictB if k not in dictA ]"
10000 loops, best of 3: 95.9 usec per loop
So those two solutions are pretty much the same speed.
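The same comparison can be run from inside Python with the standard timeit module instead of the command line. This is a rough sketch that reuses the setup from the commands above; the repeat and number values are arbitrary choices, and the absolute timings will differ by machine and Python version.

```python
import timeit

# Same dictionaries as in the command-line benchmark above.
setup = (
    "dictA = dict(zip(range(1000), range(1000)));"
    "dictB = dict(zip(range(0, 2000, 2), range(1000)))"
)
candidates = {
    "sets": "diff = set(dictB) - set(dictA)",
    "list comprehension": "diff = [k for k in dictB if k not in dictA]",
}

timings = {}
for name, stmt in candidates.items():
    # Best of 3 repeats, 1000 executions each, reported in microseconds per loop.
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=1000))
    timings[name] = best / 1000 * 1e6
    print(f"{name}: {timings[name]:.1f} usec per loop")
```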
Answer 4
If you really mean exactly what you say (that you only need to find out IF “there are any keys” in B and not in A, not WHICH ONES might those be if any), the fastest way should be:
if any(True for k in dictB if k not in dictA): ...
If you actually need to find out WHICH KEYS, if any, are in B and not in A, and not just “IF” there are such keys, then existing answers are quite appropriate (but I do suggest more precision in future questions if that’s indeed what you mean;-).
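To make the "IF" versus "WHICH" distinction concrete, here is a small sketch. The helper names has_missing_key and missing_keys are made up for illustration; the first stops at the first missing key, the second always walks all of dictB.

```python
def has_missing_key(dict_b, dict_a):
    """Return True as soon as one key of dict_b is found that dict_a lacks."""
    return any(True for k in dict_b if k not in dict_a)


def missing_keys(dict_b, dict_a):
    """Return every key of dict_b that dict_a lacks, in dict_b's order."""
    return [k for k in dict_b if k not in dict_a]


dictA = {"a": 1, "b": 2}
dictB = {"a": 1, "b": 2, "c": 3, "d": 4}
print(has_missing_key(dictB, dictA))  # True
print(missing_keys(dictB, dictA))     # ['c', 'd']
```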
Answer 5
Use set():
set(dictA.keys()).intersection(dictB.keys())
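Note that intersection() yields the keys the two dictionaries share, whereas the question asks for keys in dictB that are absent from dictA, which is a difference. A short sketch with made-up sample data showing both:

```python
dictA = {"a": 1, "b": 2, "c": 3}
dictB = {"b": 20, "c": 30, "d": 40}

# Keys present in both dictionaries (what the snippet above computes).
common = set(dictA.keys()).intersection(dictB.keys())

# Keys present in dictB but not in dictA (what the question asks for).
only_in_b = set(dictB.keys()).difference(dictA.keys())

print(sorted(common), sorted(only_in_b))
```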
Answer 6
The top answer by hughdbrown suggests using set difference, which is definitely the best approach:
diff = set(dictb.keys()) - set(dicta.keys())
The problem with this code is that it builds two lists just to create two sets, so it’s wasting 4N time and 2N space. It’s also a bit more complicated than it needs to be.
Usually, this is not a big deal, but if it is:
diff = dictb.keys() - dicta
Python 2
In Python 2, keys() returns a list of the keys, not a KeysView. So you have to ask for viewkeys() directly.
diff = dictb.viewkeys() - dicta
For dual-version 2.7/3.x code, you're hopefully using six or something similar, so you can use six.viewkeys(dictb):
diff = six.viewkeys(dictb) - dicta
In 2.4-2.6, there is no KeysView. But you can at least cut the cost from 4N to N by building your left set directly out of an iterator, instead of building a list first. The - operator on a plain set requires another set on the right-hand side, so use the difference() method, which accepts any iterable:
diff = set(dictb).difference(dicta)
Items
"I have a dictA which can be the same as dictB or may have some keys missing as compared to dictB or else the value of some keys might be different"
So you really don't need to compare the keys, but the items. An ItemsView is only a Set if the values are hashable, like strings. If they are, it's easy:
diff = dictb.items() - dicta.items()
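When some values are not hashable (nested dicts or lists, as in the question), the items-view subtraction raises TypeError. A minimal sketch of one way to cope, assuming a plain top-level comparison is enough; the function name changed_items is hypothetical:

```python
def changed_items(dict_b, dict_a):
    """Return the (key, value) pairs of dict_b that are missing from or different in dict_a."""
    try:
        # Fast path: items views support set operations when all values are hashable.
        return set(dict_b.items() - dict_a.items())
    except TypeError:
        # Fallback for unhashable values such as nested dicts or lists.
        missing = object()  # sentinel that compares unequal to every real value
        return [(k, v) for k, v in dict_b.items() if dict_a.get(k, missing) != v]


hashable_diff = changed_items(
    {"key1": "a", "key2": "newb", "key3": "c"},
    {"key1": "a", "key2": "b"},
)
nested_diff = changed_items(
    {"key1": "a", "key3": {"key11": "cc", "key13": "ee"}},
    {"key1": "a", "key3": {"key11": "cc"}},
)
print(hashable_diff)
print(nested_diff)
```

The fallback reports a nested dict as changed as a whole; it does not say which inner key differs.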
Recursive diff
Although the question isn’t directly asking for a recursive diff, some of the example values are dicts, and it appears the expected output does recursively diff them. There are already multiple answers here showing how to do that.
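For readers who want a dependency-free starting point, a recursive diff can be sketched in a few lines. This is an illustrative sketch only: recursive_diff and MISSING are hypothetical names, only nested dicts are descended into (lists are compared as whole values), and the sample values are strings standing in for the question's placeholders.

```python
MISSING = object()  # marks a key that is absent on one side


def recursive_diff(old, new, path=()):
    """Yield (path, old_value, new_value) for every value that differs."""
    if isinstance(old, dict) and isinstance(new, dict):
        for key in old.keys() | new.keys():
            yield from recursive_diff(
                old.get(key, MISSING), new.get(key, MISSING), path + (key,)
            )
    elif old != new:
        yield path, old, new


dictA = {'key1': 'a', 'key2': 'b', 'key3': {'key11': 'cc', 'key12': 'dd'}}
dictB = {'key1': 'a', 'key2': 'newb',
         'key3': {'key11': 'cc', 'key12': 'newdd', 'key13': 'ee'}}

# Sort by path so the output order does not depend on set iteration order.
diffs = sorted(recursive_diff(dictA, dictB), key=lambda d: d[0])
for path, old, new in diffs:
    print(path, 'added' if old is MISSING else old, '->', new)
```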
Answer 7
Answer 8
Here's a way that will work, allows for keys that evaluate to False, and still uses a generator expression to fall out early if possible. It's not exceptionally pretty though.
any(map(lambda x: True, (k for k in b if k not in a)))
EDIT:
THC4k posted a reply to my comment on another answer. Here’s a better, prettier way to do the above:
any(True for k in b if k not in a)
Not sure how that never crossed my mind…
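The point about keys that evaluate to False is easy to miss, so here is a tiny sketch with made-up data. Passing the keys themselves to any() tests their truthiness, which gives the wrong answer when the missing key is 0, an empty string, or None.

```python
a = {1: "x"}
b = {0: "zero", 1: "x"}   # key 0 is in b but not in a, and 0 is falsy

naive = any(k for k in b if k not in a)        # tests the truthiness of the key itself
robust = any(True for k in b if k not in a)    # tests only whether such a key exists

print(naive, robust)  # False True
```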
Answer 9
This is an old question and asks a little bit less than what I needed so this answer actually solves more than this question asks. The answers in this question helped me solve the following:
1. (asked) Record differences between two dictionaries
2. Merge differences from #1 into base dictionary
3. (asked) Merge differences between two dictionaries (treat dictionary #2 as if it were a diff dictionary)
4. Try to detect item movements as well as changes
5. (asked) Do all of this recursively
All this combined with JSON makes for a pretty powerful configuration storage support.
The solution (also on github):
from collections import OrderedDict
from pprint import pprint


class izipDestinationMatching(object):
    __slots__ = ("attr", "value", "index")

    def __init__(self, attr, value, index):
        self.attr, self.value, self.index = attr, value, index

    def __repr__(self):
        return "izip_destination_matching: found match by '%s' = '%s' @ %d" % (self.attr, self.value, self.index)


def izip_destination(a, b, attrs, addMarker=True):
    """
    Returns zipped lists, but final size is equal to b with (if shorter) a padded with nulls
    Additionally also tries to find item reallocations by searching child dicts (if they are dicts) for attribute, listed in attrs)
    When addMarker == False (patching), final size will be the longer of a, b
    """
    for idx, item in enumerate(b):
        try:
            attr = next((x for x in attrs if x in item), None)  # See if the item has any of the ID attributes
            match, matchIdx = next(((orgItm, idx) for idx, orgItm in enumerate(a) if attr in orgItm and orgItm[attr] == item[attr]), (None, None)) if attr else (None, None)
            if match and matchIdx != idx and addMarker: item[izipDestinationMatching] = izipDestinationMatching(attr, item[attr], matchIdx)
        except Exception:
            match = None
        yield (match if match else a[idx] if len(a) > idx else None), item
    if not addMarker and len(a) > len(b):
        for item in a[len(b) - len(a):]:
            yield item, item


def dictdiff(a, b, searchAttrs=[]):
    """
    returns a dictionary which represents difference from a to b
    the return dict is as short as possible:
      equal items are removed
      added / changed items are listed
      removed items are listed with value=None
    Also processes list values where the resulting list size will match that of b.
    It can also search said list items (that are dicts) for identity values to detect changed positions.
      In case such identity value is found, it is kept so that it can be re-found during the merge phase
    @param a: original dict
    @param b: new dict
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictdiff(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs)]
        return b
    res = OrderedDict()
    if izipDestinationMatching in b:
        keepKey = b[izipDestinationMatching].attr
        del b[izipDestinationMatching]
    else:
        keepKey = izipDestinationMatching
    # set(a) | set(b) works on Python 2 and 3; a.keys() + b.keys() is Python 2 only
    for key in sorted(set(a) | set(b)):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        if keepKey == key or v1 != v2: res[key] = dictdiff(v1, v2, searchAttrs)
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res


def dictmerge(a, b, searchAttrs=[]):
    """
    Returns a dictionary which merges differences recorded in b to base dictionary a
    Also processes list values where the resulting list size will match that of a
    It can also search said list items (that are dicts) for identity values to detect changed positions
    @param a: original dict
    @param b: diff dict to patch into a
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictmerge(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs, False)]
        return b
    res = OrderedDict()
    # set(a) | set(b) works on Python 2 and 3; a.keys() + b.keys() is Python 2 only
    for key in sorted(set(a) | set(b)):
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        #print "processing", key, v1, v2, key not in b, dictmerge(v1, v2)
        if v2 is not None: res[key] = dictmerge(v1, v2, searchAttrs)
        elif key not in b: res[key] = v1
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res
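If list handling and move detection are not needed, the merge half of the problem (bringing dictA up to date with dictB, including nested dicts) can be sketched much more simply. The following is not the dictmerge above, just a hypothetical helper named deep_update that assumes only dicts need recursion and that removals are out of scope.

```python
import copy


def deep_update(base, updates):
    """Return a copy of base with updates merged in recursively (inputs are not modified)."""
    result = copy.deepcopy(base)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are dicts: merge them key by key.
            result[key] = deep_update(result[key], value)
        else:
            # New key or changed simple value: take the value from updates.
            result[key] = copy.deepcopy(value)
    return result


dictA = {'key1': 'a', 'key2': 'b', 'key3': {'key11': 'cc', 'key12': 'dd'}}
dictB = {'key1': 'a', 'key2': 'newb',
         'key3': {'key11': 'cc', 'key12': 'newdd', 'key13': 'ee'}}
merged = deep_update(dictA, dictB)
print(merged)
```

Returning a new dict rather than mutating dictA keeps the original available for comparison; drop the deepcopy calls if in-place updates are acceptable.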
Answer 10
What about the standard approach (comparing the FULL object)?
PyDev->new PyDev Module->Module: unittest
import unittest


class Test(unittest.TestCase):
    def testName(self):
        obj1 = {1:1, 2:2}
        obj2 = {1:1, 2:2}
        self.maxDiff = None  # sometimes useful
        self.assertDictEqual(obj1, obj2)


if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']
    unittest.main()
Answer 11
If on Python ≥ 2.7:
# update different values in dictB
# I would assume only dictA should be updated,
# but the question specifies otherwise
for k in dictA.viewkeys() & dictB.viewkeys():
    if dictA[k] != dictB[k]:
        dictB[k] = dictA[k]
# add missing keys to dictA
dictA.update((k, dictB[k]) for k in dictB.viewkeys() - dictA.viewkeys())
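On Python 3, viewkeys() no longer exists because keys() already returns a view. The same two steps might look like the sketch below; the sample dictionaries are made up, and the update direction follows the code above.

```python
dictA = {"a": 1, "b": 2}
dictB = {"a": 1, "b": 20, "c": 3}

# Overwrite differing values in dictB with those from dictA (same direction as above).
for k in dictA.keys() & dictB.keys():
    if dictA[k] != dictB[k]:
        dictB[k] = dictA[k]

# Add keys that only dictB has to dictA.
dictA.update((k, dictB[k]) for k in dictB.keys() - dictA.keys())

print(dictA, dictB)
```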
Answer 12
Here is a solution for deep-comparing the keys of two dictionaries:
def compareDictKeys(dict1, dict2):
    if type(dict1) != dict or type(dict2) != dict:
        return False
    keys1, keys2 = dict1.keys(), dict2.keys()
    diff = set(keys1) - set(keys2) or set(keys2) - set(keys1)
    if not diff:
        for key in keys1:
            if (type(dict1[key]) == dict or type(dict2[key]) == dict) and not compareDictKeys(dict1[key], dict2[key]):
                diff = True
                break
    return not diff
Answer 13
Here's a solution that can compare more than two dicts:
def diff_dict(dicts, default=None):
    diff_dict = {}
    # 'list()' around 'd.keys()' is needed for python 3 compatibility
    for k in set(sum([list(d.keys()) for d in dicts], [])):
        # we can just use "values = [d.get(k, default) ..." below if
        # we don't care that d1[k]=default and d2[k]=missing will
        # be treated as equal
        if any(k not in d for d in dicts):
            diff_dict[k] = [d.get(k, default) for d in dicts]
        else:
            values = [d[k] for d in dicts]
            if any(v != values[0] for v in values):
                diff_dict[k] = values
    return diff_dict
Usage example:
import matplotlib.pyplot as plt
diff_dict([plt.rcParams, plt.rcParamsDefault, plt.matplotlib.rcParamsOrig])
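For trying the idea without matplotlib installed, here is a compact sketch of the same multi-dict comparison on plain dictionaries. The name diff_many and the sample configuration dicts are invented for illustration; the result has the same shape as diff_dict above, one list of per-dict values for each differing key.

```python
def diff_many(dicts, default=None):
    """Map each key that differs (or is missing somewhere) to its value in every dict."""
    result = {}
    for k in set().union(*dicts):                      # every key seen in any dict
        values = [d.get(k, default) for d in dicts]
        missing_somewhere = any(k not in d for d in dicts)
        if missing_somewhere or any(v != values[0] for v in values):
            result[k] = values
    return result


defaults = {"color": "blue", "size": 10}
user = {"color": "red", "size": 10}
session = {"color": "red", "size": 10, "debug": True}
report = diff_many([defaults, user, session])
print(report)
```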
Answer 14
My recipe for the symmetric difference between two dictionaries:
def find_dict_diffs(dict1, dict2):
    unequal_keys = []
    unequal_keys.extend(set(dict1.keys()).symmetric_difference(set(dict2.keys())))
    for k in dict1.keys():
        if dict1.get(k, 'N\\A') != dict2.get(k, 'N\\A'):
            unequal_keys.append(k)
    if unequal_keys:
        print('param', 'dict1\t', 'dict2')
        for k in set(unequal_keys):
            print(str(k)+'\t'+dict1.get(k, 'N\\A')+'\t '+dict2.get(k, 'N\\A'))
    else:
        print('Dicts are equal')

dict1 = {1:'a', 2:'b', 3:'c', 4:'d', 5:'e'}
dict2 = {1:'b', 2:'a', 3:'c', 4:'d', 6:'f'}
find_dict_diffs(dict1, dict2)
And result is:
param dict1 dict2
1 a b
2 b a
5 e N\A
6 N\A f
Answer 15
As mentioned in other answers, unittest produces some nice output for comparing dicts, but in this example we don't want to have to build a whole test first.
Skimming the unittest source, it looks like you can get a fair solution with just this:
import difflib
import pprint

def diff_dicts(a, b):
    if a == b:
        return ''
    return '\n'.join(
        difflib.ndiff(pprint.pformat(a, width=30).splitlines(),
                      pprint.pformat(b, width=30).splitlines())
    )
so
dictA = dict(zip(range(7), map(ord, 'python')))
dictB = {0: 112, 1: 'spam', 2: [1,2,3], 3: 104, 4: 111}
print(diff_dicts(dictA, dictB))
Results in:
  {0: 112,
-  1: 121,
-  2: 116,
+  1: 'spam',
+  2: [1, 2, 3],
   3: 104,
-  4: 111,
?        ^
+  4: 111}
?        ^
-  5: 110}
Where:
- '-' indicates key/values in the first but not the second dict
- '+' indicates key/values in the second but not the first dict
Like in unittest, the only caveat is that the final mapping can be thought to be a diff, due to the trailing comma/bracket.
Answer 16
@Maxx has an excellent answer: use the unittest tools provided by Python:
import unittest

class Test(unittest.TestCase):
    def runTest(self):
        pass

    def testDict(self, d1, d2, maxDiff=None):
        self.maxDiff = maxDiff
        self.assertDictEqual(d1, d2)
Then, anywhere in your code you can call:
try:
    Test().testDict(dict1, dict2)
except AssertionError as e:
    print(e)
The resulting output looks like the output from diff, pretty-printing the dictionaries with + or - prepended to each line that is different.
Answer 17
Not sure if it is still relevant, but I came across this problem. In my situation I just needed to return a dictionary of the changes for all nested dictionaries, and so on. I could not find a good solution out there, but I did end up writing a simple function to do this. Hope this helps.
Answer 18
If you want a built-in solution for a full comparison with arbitrary dict structures, @Maxx’s answer is a good start.
import unittest
test = unittest.TestCase()
test.assertEqual(dictA, dictB)
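Outside a test run, assertEqual raises AssertionError when the dicts differ, so the diff text has to be caught to be useful. A small sketch, assuming Python 3 (where TestCase can be instantiated without a test method); the wrapper name dict_diff_message is made up:

```python
import unittest


def dict_diff_message(dict_a, dict_b):
    """Return unittest's diff text for two dicts, or '' when they are equal."""
    case = unittest.TestCase()
    case.maxDiff = None  # do not truncate long diffs
    try:
        case.assertEqual(dict_a, dict_b)
    except AssertionError as exc:
        return str(exc)
    return ""


same = dict_diff_message({"a": 1, "b": {"c": 2}}, {"a": 1, "b": {"c": 2}})
different = dict_diff_message({"a": 1, "b": {"c": 2}}, {"a": 1, "b": {"c": 3}})
print(different)
```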
Answer 19
Based on ghostdog74's answer,
dicta = {"a":1,"d":2}
dictb = {"a":5,"d":2}
for value in dicta.values():
    if value not in dictb.values():
        print(value)
will print the values of dicta that do not appear among the values of dictb.
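The loop above checks whether each value occurs anywhere in dictb, regardless of key. If the goal is to find keys whose values differ between the two dictionaries, a per-key comparison may be closer to what is wanted. A sketch under that assumption, reusing the same sample data:

```python
dicta = {"a": 1, "d": 2}
dictb = {"a": 5, "d": 2}

# For every key the two dicts share, keep the pair of values when they differ.
differing = {
    k: (dicta[k], dictb[k])
    for k in dicta.keys() & dictb.keys()
    if dicta[k] != dictb[k]
}
print(differing)  # {'a': (1, 5)}
```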
Answer 20
Try this to find the intersection, that is, the keys that are in both dictionaries. If you want the keys not found in the second dictionary, just use not in instead.
intersect = filter(lambda x, dictB=dictB.keys(): x in dictB, dictA.keys())