Is a Python dictionary an example of a hash table?

Question: Is a Python dictionary an example of a hash table?

One of the basic data structures in Python is the dictionary, which allows one to record "keys" for looking up "values" of any type. Is this implemented internally as a hash table? If not, what is it?


Answer 0

Yes, it is a hash map, also called a hash table. Tim Peters has written a description of Python's dict implementation.

That's why you can't use something unhashable, such as a list, as a dict key:

>>> a = {}
>>> b = ['some', 'list']
>>> hash(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list objects are unhashable
>>> a[b] = 'some'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list objects are unhashable

You can read more about hash tables, or check how the dict has been implemented in Python and why it is implemented that way.


Answer 1

There must be more to a Python dictionary than a table lookup on hash(). By brute-force experimentation I found this hash collision:

>>> hash(1.1)
2040142438
>>> hash(4504.1)
2040142438

Yet it doesn't break the dictionary:

>>> d = { 1.1: 'a', 4504.1: 'b' }
>>> d[1.1]
'a'
>>> d[4504.1]
'b'

Sanity check:

>>> for k,v in d.items(): print(hash(k))
2040142438
2040142438

Possibly there's another lookup level beyond hash() that avoids collisions between dictionary keys. Or maybe dict() uses a different hash.

(By the way, this is in Python 2.7.10. The same holds in Python 3.4.3 and 3.5.0, with a collision at hash(1.1) == hash(214748749.8).)
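The "other lookup level" guessed at above is most likely the equality comparison: when two keys share a hash, a dict still tells them apart with ==. The sketch below shows this with a hypothetical class, CollidingKey (not part of any library), whose instances are all forced to the same hash value so the collision is reproducible on any Python version.

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # force every instance to collide

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name


first = CollidingKey("first")
second = CollidingKey("second")
table = {first: "a", second: "b"}

print(hash(first) == hash(second))  # True: the hashes collide
print(table[first], table[second])  # a b: lookups still succeed
print(len(table))                   # 2: both entries are stored
```

The hash only narrows down where to look; the final match is decided by comparing the keys themselves, which is why colliding floats such as 1.1 and 4504.1 can coexist in one dict.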


Answer 2

Yes. Internally it is implemented as an open-addressing hash table, with a probe sequence based on a primitive polynomial over Z/2 (source).
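To make the idea of open addressing concrete, here is a toy table that keeps all entries in one flat list and, on a collision, moves on to the next slot. This is only a sketch: it uses simple linear probing rather than CPython's real probe sequence, it never resizes or deletes, and the class name ToyOpenAddressingTable is made up for illustration.

```python
class ToyOpenAddressingTable:
    """Toy fixed-size hash table using open addressing with linear probing."""

    _EMPTY = object()  # sentinel marking an unused slot

    def __init__(self, size=8):
        self._slots = [self._EMPTY] * size

    def _probe(self, key):
        """Yield the slot indices to try for key, starting at hash(key) % size."""
        size = len(self._slots)
        start = hash(key) % size
        for step in range(size):
            yield (start + step) % size

    def __setitem__(self, key, value):
        for index in self._probe(key):
            slot = self._slots[index]
            # Use the first empty slot, or overwrite an entry with an equal key.
            if slot is self._EMPTY or slot[0] == key:
                self._slots[index] = (key, value)
                return
        raise RuntimeError("table is full (this toy never resizes)")

    def __getitem__(self, key):
        for index in self._probe(key):
            slot = self._slots[index]
            if slot is self._EMPTY:
                break  # an empty slot ends the probe chain: key is absent
            if slot[0] == key:
                return slot[1]
        raise KeyError(key)


table = ToyOpenAddressingTable()
table[1] = "a"
table[9] = "b"  # hash(9) % 8 == hash(1) % 8, so 9 lands in the next slot
print(table[1], table[9])  # a b
```

Both keys map to slot 1, yet each lookup returns the right value because the probe compares keys along the way and stops only at an equal key or an empty slot.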


Answer 3

To expand upon nosklo's explanation:

a = {}
b = ['some', 'list']
try:
    a[b] = 'some'  # fails: a list is mutable and therefore unhashable
except TypeError as exc:
    print(exc)
a[tuple(b)] = 'some'  # works: same as a['some', 'list'] = 'some'
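One caveat worth adding: converting to a tuple only helps if everything inside the tuple is hashable too. The helper below, is_hashable, is a made-up name for illustration; it simply reports whether hash() accepts an object.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (illustrative helper, not a stdlib function)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable(('some', 'list')))     # True: a tuple of strings
print(is_hashable((['some'], 'list')))   # False: the tuple contains a list
print(is_hashable(frozenset({'some'})))  # True: frozenset is the hashable counterpart of set
```

So a tuple works as a dict key only when all of its elements are themselves hashable.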