为什么我不能在Python中使用列表作为字典键?

问题:为什么我不能在Python中使用列表作为字典键?

对于什么可以/不能用作python dict的键,我有些困惑。

dicked = {}
dicked[None] = 'foo'     # None ok
dicked[(1,3)] = 'baz'    # tuple ok
import sys
dicked[sys] = 'bar'      # wow, even a module is ok !
dicked[(1,[3])] = 'qux'  # oops, not allowed

因此,元组是一个不可变的类型,但是如果我在其中隐藏一个列表,那么它就不能成为键。.我不能像在模块内部一样轻松地隐藏一个列表吗?

我有一个模糊的想法,认为密钥必须是“可哈希的”,但是我只是承认自己对技术细节的无知。我不知道这里到底发生了什么。如果您尝试将列表用作键,而将哈希作为其存储位置,那会出什么问题呢?

I’m a bit confused about what can/can’t be used as a key for a python dict.

dicked = {}
dicked[None] = 'foo'     # None ok
dicked[(1,3)] = 'baz'    # tuple ok
import sys
dicked[sys] = 'bar'      # wow, even a module is ok !
dicked[(1,[3])] = 'qux'  # oops, not allowed

So a tuple is an immutable type but if I hide a list inside of it, then it can’t be a key.. couldn’t I just as easily hide a list inside a module?

I had some vague idea that that the key has to be “hashable” but I’m just going to admit my own ignorance about the technical details; I don’t know what’s really going on here. What would go wrong if you tried to use lists as keys, with the hash as, say, their memory location?


回答 0

Python Wiki中有一篇关于该主题的好文章:为什么列表不能成为字典键。如此处所述:

如果您尝试将列表用作键,而将哈希作为其存储位置,那会出什么问题呢?

可以在不真正破坏任何要求的情况下完成此操作,但是会导致意外的行为。通常将列表视为其值是从其内容的值派生的,例如在检查(不等式)时。可以理解的是,许多人希望您可以使用任何列表[1, 2]来获取相同的键,而您必须在其中保留完全相同的列表对象。但是,一旦修改了用作键的列表,按值查找就会中断,并且要通过标识查找,您需要保持完全相同的列表-这不需要任何其他常见的列表操作(至少我不能想到) )。

object无论如何,其他对象(例如模块)都会通过它们的对象标识产生更大的影响(这是您最后一次有两个不同的名为sys?的模块对象),并且无论如何都要进行比较。因此,当它们用作dict键时,在这种情况下也按标识进行比较就不足为奇了-甚至没有想到。

There’s a good article on the topic in the Python wiki: Why Lists Can’t Be Dictionary Keys. As explained there:

What would go wrong if you tried to use lists as keys, with the hash as, say, their memory location?

It can be done without really breaking any of the requirements, but it leads to unexpected behavior. Lists are generally treated as if their value was derived from their content’s values, for instance when checking (in-)equality. Many would – understandably – expect that you can use any list [1, 2] to get the same key, where you’d have to keep around exactly the same list object. But lookup by value breaks as soon as a list used as key is modified, and for lookup by identity requires you to keep around exactly the same list – which isn’t requires for any other common list operation (at least none I can think of).

Other objects such as modules and object make a much bigger deal out of their object identity anyway (when was the last time you had two distinct module objects called sys?), and are compared by that anyway. Therefore, it’s less surprising – or even expected – that they, when used as dict keys, compare by identity in that case as well.


回答 1

为什么我不能在Python中使用列表作为字典键?

>>> d = {repr([1,2,3]): 'value'}
{'[1, 2, 3]': 'value'}

(对于任何偶然发现此问题以寻求解决方案的人)

正如这里其他人所解释的,实际上您不能。但是,如果您确实要使用列表,则可以使用其字符串表示形式。

Why can’t I use a list as a dict key in python?

>>> d = {repr([1,2,3]): 'value'}
{'[1, 2, 3]': 'value'}

(for anybody who stumbles on this question looking for a way around it)

as explained by others here, indeed you cannot. You can however use its string representation instead if you really want to use your list.


回答 2

刚发现您可以将List更改为元组,然后将其用作键。

d = {tuple([1,2,3]): 'value'}

Just found you can change List into tuple, then use it as keys.

d = {tuple([1,2,3]): 'value'}

回答 3

问题在于元组是不可变的,而列表不是。考虑以下

d = {}
li = [1,2,3]
d[li] = 5
li.append(4)

应该d[li]返回什么?是相同的清单吗?怎么d[[1,2,3]]样 它具有相同的值,但列表不同吗?

最终,没有令人满意的答案。例如,如果唯一起作用的键是原始键,那么如果您没有对该键的引用,则无法再访问该值。使用其他所有允许的密钥,您可以构造一个密钥,而无需参考原始密钥。

如果我的两个建议都起作用,那么您将拥有非常不同的键,它们返回相同的值,这有点令人惊讶。如果仅原始内容有效,则您的密钥将很快失效,因为已修改了列表。

The issue is that tuples are immutable, and lists are not. Consider the following

d = {}
li = [1,2,3]
d[li] = 5
li.append(4)

What should d[li] return? Is it the same list? How about d[[1,2,3]]? It has the same values, but is a different list?

Ultimately, there is no satisfactory answer. For example, if the only key that works is the original key, then if you have no reference to that key, you can never again access the value. With every other allowed key, you can construct a key without a reference to the original.

If both of my suggestions work, then you have very different keys that return the same value, which is more than a little surprising. If only the original contents work, then your key will quickly go bad, since lists are made to be modified.


回答 4

这是一个答案http://wiki.python.org/moin/DictionaryKeys

如果您尝试将列表用作键,而将哈希作为其存储位置,那会出什么问题呢?

查找具有相同内容的不同列表将产生不同的结果,即使比较具有相同内容的列表也将它们视为等效。

在字典查找中使用列表文字怎么办?

Here’s an answer http://wiki.python.org/moin/DictionaryKeys

What would go wrong if you tried to use lists as keys, with the hash as, say, their memory location?

Looking up different lists with the same contents would produce different results, even though comparing lists with the same contents would indicate them as equivalent.

What about Using a list literal in a dictionary lookup?


回答 5

您的遮阳篷可以在这里找到:

为什么列表不能成为字典键

Python的新手常常想知道为什么,尽管语言既包含元组又包含列表类型,但是元组可用作字典键,而列表却不可用。这是一个经过深思熟虑的设计决定,可以通过首先了解Python词典的工作方式来最好地解释。

来源和更多信息:http : //wiki.python.org/moin/DictionaryKeys

Your awnser can be found here:

Why Lists Can’t Be Dictionary Keys

Newcomers to Python often wonder why, while the language includes both a tuple and a list type, tuples are usable as a dictionary keys, while lists are not. This was a deliberate design decision, and can best be explained by first understanding how Python dictionaries work.

Source & more info: http://wiki.python.org/moin/DictionaryKeys


回答 6

因为列表是可变的,所以dict键(和set成员)必须是可哈希的,并且对可变对象进行哈希处理是一个坏主意,因为哈希值基于实例属性进行计算。

在这个答案中,我将给出一些具体的例子,希望在现有答案的基础上增加价值。每个洞察力也适用于数据set结构的元素。

示例1:哈希可变对象,其中哈希值基于对象的可变特性。

>>> class stupidlist(list):
...     def __hash__(self):
...         return len(self)
... 
>>> stupid = stupidlist([1, 2, 3])
>>> d = {stupid: 0}
>>> stupid.append(4)
>>> stupid
[1, 2, 3, 4]
>>> d
{[1, 2, 3, 4]: 0}
>>> stupid in d
False
>>> stupid in d.keys()
False
>>> stupid in list(d.keys())
True

突变后stupid,不能在字典不再因为散列变化发现。仅对字典的键列表进行线性扫描才能找到stupid

例2:…但是为什么不只是一个恒定的哈希值?

>>> class stupidlist2(list):
...     def __hash__(self):
...         return id(self)
... 
>>> stupidA = stupidlist2([1, 2, 3])
>>> stupidB = stupidlist2([1, 2, 3])
>>> 
>>> stupidA == stupidB
True
>>> stupidA in {stupidB: 0}
False

这也不是一个好主意,因为相等的对象应该相同地散列,以便您可以在 dict或中set

例子3:…好吧,在所有实例中保持不变的哈希值呢?

>>> class stupidlist3(list):
...     def __hash__(self):
...         return 1
... 
>>> stupidC = stupidlist3([1, 2, 3])
>>> stupidD = stupidlist3([1, 2, 3])
>>> stupidE = stupidlist3([1, 2, 3, 4])
>>> 
>>> stupidC in {stupidD: 0}
True
>>> stupidC in {stupidE: 0}
False
>>> d = {stupidC: 0}
>>> stupidC.append(5)
>>> stupidC in d
True

事情似乎按预期工作,但是请考虑发生了什么:当类的所有实例产生相同的哈希值时,只要一个实例中有两个以上的实例作为键,您就会发生哈希冲突。 dict或存在set

使用my_dict[key]key in my_dict(或item in my_set)需要执行stupidlist3与字典键中实例相同的次数相等的检查(在最坏的情况下)。在这一点上,字典的目的-O(1)查找-被完全击败了。以下时间(使用IPython完成)对此进行了演示。

示例3的一些时间

>>> lists_list = [[i]  for i in range(1000)]
>>> stupidlists_set = {stupidlist3([i]) for i in range(1000)}
>>> tuples_set = {(i,) for i in range(1000)}
>>> l = [999]
>>> s = stupidlist3([999])
>>> t = (999,)
>>> 
>>> %timeit l in lists_list
25.5 µs ± 442 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit s in stupidlists_set
38.5 µs ± 61.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit t in tuples_set
77.6 ns ± 1.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

如您所见,我们的成员资格测试stupidlists_set比整个范围的线性扫描要慢lists_list,而您在一组没有哈希冲突的情况下拥有预期的超快查找时间(因子500)。


TL; DR:您可以将其tuple(yourlist)用作dict键,因为元组是不可变且可哈希的。

Because lists are mutable, dict keys (and set members) need to be hashable, and hashing mutable objects is a bad idea because hash values should be computed on the basis of instance attributes.

In this answer, I will give some concrete examples, hopefully adding value on top of the existing answers. Every insight applies to the elements of the set datastructure as well.

Example 1: hashing a mutable object where the hash value is based on a mutable characteristic of the object.

>>> class stupidlist(list):
...     def __hash__(self):
...         return len(self)
... 
>>> stupid = stupidlist([1, 2, 3])
>>> d = {stupid: 0}
>>> stupid.append(4)
>>> stupid
[1, 2, 3, 4]
>>> d
{[1, 2, 3, 4]: 0}
>>> stupid in d
False
>>> stupid in d.keys()
False
>>> stupid in list(d.keys())
True

After mutating stupid, it cannot be found in the dict any longer because the hash changed. Only a linear scan over the list of the dict’s keys finds stupid.

Example 2: … but why not just a constant hash value?

>>> class stupidlist2(list):
...     def __hash__(self):
...         return id(self)
... 
>>> stupidA = stupidlist2([1, 2, 3])
>>> stupidB = stupidlist2([1, 2, 3])
>>> 
>>> stupidA == stupidB
True
>>> stupidA in {stupidB: 0}
False

That’s not a good idea as well because equal objects should hash identically such that you can find them in a dict or set.

Example 3: … ok, what about constant hashes across all instances?!

>>> class stupidlist3(list):
...     def __hash__(self):
...         return 1
... 
>>> stupidC = stupidlist3([1, 2, 3])
>>> stupidD = stupidlist3([1, 2, 3])
>>> stupidE = stupidlist3([1, 2, 3, 4])
>>> 
>>> stupidC in {stupidD: 0}
True
>>> stupidC in {stupidE: 0}
False
>>> d = {stupidC: 0}
>>> stupidC.append(5)
>>> stupidC in d
True

Things seem to work as expected, but think about what’s happening: when all instances of your class produce the same hash value, you will have a hash collision whenever there are more than two instances as keys in a dict or present in a set.

Finding the right instance with my_dict[key] or key in my_dict (or item in my_set) needs to perform as many equality checks as there are instances of stupidlist3 in the dict’s keys (in the worst case). At this point, the purpose of the dictionary – O(1) lookup – is completely defeated. This is demonstrated in the following timings (done with IPython).

Some Timings for Example 3

>>> lists_list = [[i]  for i in range(1000)]
>>> stupidlists_set = {stupidlist3([i]) for i in range(1000)}
>>> tuples_set = {(i,) for i in range(1000)}
>>> l = [999]
>>> s = stupidlist3([999])
>>> t = (999,)
>>> 
>>> %timeit l in lists_list
25.5 µs ± 442 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit s in stupidlists_set
38.5 µs ± 61.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit t in tuples_set
77.6 ns ± 1.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

As you can see, the membership test in our stupidlists_set is even slower than a linear scan over the whole lists_list, while you have the expected super fast lookup time (factor 500) in a set without loads of hash collisions.


TL; DR: you can use tuple(yourlist) as dict keys, because tuples are immutable and hashable.


回答 7

您问题的简单答案是,类列表未实现方法散列,该散列对于任何希望用作字典中键的对象都是必需的。但是散列的原因不相同方式实现它在说,元组类(基于容器的内容)是因为列表是可变的,以便编辑列表将需要散列重新计算,这可能意味着在列表中现在位于基础哈希表中的错误存储桶中。请注意,由于您无法修改元组(不可变的),因此不会遇到此问题。

附带说明,dictobjects查找的实际实现基于Knuth Vol。的算法D。3秒 6.4。如果您有这本书,那么可能值得一读,此外,如果您真的非常有兴趣,则可以在这里查看开发人员对dictobject实际实现的评论。它详细介绍了它的工作原理。您可能也对感兴趣的字典的实现有一个python讲座。它们遍历了键的定义以及前几分钟的哈希值。

The simple answer to your question is that the class list does not implement the method hash which is required for any object which wishes to be used as a key in a dictionary. However the reason why hash is not implemented the same way it is in say the tuple class (based on the content of the container) is because a list is mutable so editing the list would require the hash to be recalculated which may mean the list in now located in the wrong bucket within the underling hash table. Note that since you cannot modify a tuple (immutable) it doesn’t run into this problem.

As a side note, the actual implementation of the dictobjects lookup is based on Algorithm D from Knuth Vol. 3, Sec. 6.4. If you have that book available to you it might be a worthwhile read, in addition if you’re really, really interested you may like to take a peek at the developer comments on the actual implementation of dictobject here. It goes into great detail as to exactly how it works. There is also a python lecture on the implementation of dictionaries which you may be interested in. They go through the definition of a key and what a hash is in the first few minutes.


回答 8

根据Python 2.7.2文档:

如果对象的哈希值在其生命周期内不发生变化(需要使用hash()方法),并且可以与其他对象进行比较(需要使用eq()或cmp()方法),则该对象是可哈希的。比较相等的可哈希对象必须具有相同的哈希值。

散列性使对象可用作字典键和set成员,因为这些数据结构在内部使用散列值。

Python的所有不可变内置对象都是可哈希的,而没有可变容器(例如列表或字典)是可哈希的。作为用户定义类实例的对象默认情况下是可哈希的;它们都比较不相等,并且其哈希值是其id()。

从不能添加,删除或替换其元素的意义上说,元组是不可变的,但是元素本身可能是可变的。列表的哈希值取决于其元素的哈希值,因此当您更改元素时它也会改变。

对列表散列使用id意味着所有列表的比较方式不同,这将令人惊讶且不便。

According to the Python 2.7.2 documentation:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a hash() method), and can be compared to other objects (it needs an eq() or cmp() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().

A tuple is immutable in the sense that you cannot add, remove or replace its elements, but the elements themselves may be mutable. List’s hash value depends on the hash values of its elements, and so it changes when you change the elements.

Using id’s for list hashes would imply that all lists compare differently, which would be surprising and inconvenient.


回答 9

字典是一个HashMap,它存储您的键的映射,将值转换为哈希的新键以及值映射。

类似于(伪代码):

{key : val}  
hash(key) = val

如果您想知道哪些可用选项可以用作字典的键。然后

任何可散列的内容(可以转换为散列,并保持静态值,即不可变,以形成如上所述的散列键)均符合条件,但是列表或集合对象可以随时随地变化,因此hash(key)也应只是为了与您的列表或集合同步而变化。

你可以试试 :

hash(<your key here>)

如果工作正常,则可以将其用作字典的键,也可以将其转换为可哈希的值。


简而言之 :

  1. 将该列表转换为tuple(<your list>)
  2. 将该列表转换为str(<your list>)

A Dictionary is a HashMap it stores map of your keys, value converted to a hashed new key and value mapping.

something like (psuedo code):

{key : val}  
hash(key) = val

If you are wondering which are available options that can be used as key for your dictionary. Then

anything which is hashable(can be converted to hash, and hold static value i.e immutable so as to make a hashed key as stated above) is eligible but as list or set objects can be vary on the go so hash(key) should also needs to vary just to be in sync with your list or set.

You can try :

hash(<your key here>)

If it works fine it can be used as key for your dictionary or else convert it to something hashable.


Inshort :

  1. Convert that list to tuple(<your list>).
  2. Convert that list to str(<your list>).

回答 10

dict键必须是可哈希的。列表是可变的,它们不提供有效的哈希方法。

dict keys need to be hashable. Lists are Mutable and they do not provide a valid hash method.