What are dictionary view objects?

Question: What are dictionary view objects?

In Python 2.7, the dictionary view methods became available.

Now, I know the pros and cons of the following:

  • dict.items() (and values, keys): returns a list, so you can actually store the result, and
  • dict.iteritems() (and the like): returns an iterator, so you can iterate over each value generated one by one.

What are dict.viewitems() (and the like) for? What are their benefits? How do they work? What is a view after all?

I read that a view always reflects changes to the dictionary. But how does it behave from the performance and memory point of view? What are the pros and cons?


Answer 0

Dictionary views are essentially what their name says: views are simply like a window on the keys and values (or items) of a dictionary. Here is an excerpt from the official documentation for Python 3:

>>> dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500}
>>> keys = dishes.keys()
>>> values = dishes.values()

>>> # view objects are dynamic and reflect dict changes
>>> del dishes['eggs']
>>> keys  # No eggs anymore!
dict_keys(['sausage', 'bacon', 'spam'])

>>> values  # No eggs value (2) anymore!
dict_values([1, 1, 500])

(The Python 2 equivalent uses dishes.viewkeys() and dishes.viewvalues().)

This example shows the dynamic character of views: the keys view is not a copy of the keys at a given point in time, but rather a simple window that shows you the keys; if they are changed, then what you see through the window changes as well. This feature can be useful in some circumstances (for instance, one can work with a view on the keys in multiple parts of a program instead of recalculating the current list of keys each time they are needed). Note, however, that adding or deleting dictionary entries while iterating over a view is not safe: the documentation warns that it may raise a RuntimeError or fail to iterate over all entries.
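As a rough illustration of sharing one view between several parts of a program, here is a minimal sketch. It assumes Python 3, where dishes.keys() already returns a view (the Python 2.7 spelling would be dishes.viewkeys()); the helper is_on_menu is a made-up name used only for this example.

```python
# Sketch only: a keys view obtained once keeps tracking the dictionary.
dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500}
keys = dishes.keys()  # a live view in Python 3, not a copy


def is_on_menu(name, menu_keys=keys):
    """Hypothetical helper that holds on to the view, not to a list."""
    return name in menu_keys


before = is_on_menu('eggs')        # looked up while 'eggs' is still present

# Change the dictionary after the view (and the helper) were created.
del dishes['eggs']
dishes['toast'] = 3

after_eggs = is_on_menu('eggs')    # the same view no longer contains 'eggs'
after_toast = is_on_menu('toast')  # ...and it already contains 'toast'
current_keys = sorted(keys)

print(before, after_eggs, after_toast, current_keys)
```

The helper never asks the dictionary for its keys again; it only consults the view it was given at definition time.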

One advantage is that looking at, say, the keys uses only a small and fixed amount of memory and requires a small and fixed amount of processor time, as there is no creation of a list of keys (Python 2, on the other hand, often unnecessarily creates a new list, as quoted by Rajendran T, which takes memory and time in an amount proportional to the length of the list). To continue the window analogy, if you want to see a landscape behind a wall, you simply make an opening in it (you build a window); copying the keys into a list would correspond to instead painting a copy of the landscape on your wall—the copy takes time, space, and does not update itself.
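The "small and fixed amount of memory" claim can be checked roughly with sys.getsizeof. This is only a sketch, assuming Python 3 (where keys() returns a view); the exact byte counts depend on the interpreter build, so none are quoted here.

```python
import sys

small = {i: None for i in range(10)}
large = {i: None for i in range(100000)}

# Size of the view object itself: it only refers back to the dictionary.
view_small = sys.getsizeof(small.keys())
view_large = sys.getsizeof(large.keys())

# Size of a list copy of the keys. getsizeof counts only the list's own
# pointer array, not the key objects, but that array grows with the dict.
list_small = sys.getsizeof(list(small))
list_large = sys.getsizeof(list(large))

print("views:", view_small, view_large)
print("lists:", list_small, list_large)
```

The two view sizes should come out identical regardless of how many keys the dictionary holds, while the list copy grows with the number of keys.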

To summarize, views are simply… views (windows) on your dictionary, which show the contents of the dictionary even after it changes. They offer features that differ from those of lists: a list of keys contains a copy of the dictionary keys at a given point in time, while a view is dynamic and is much faster to obtain, as it does not have to copy any data (keys or values) in order to be created.


Answer 1

As you mentioned, dict.items() returns a copy of the dictionary’s list of (key, value) pairs, which is wasteful, and dict.iteritems() returns an iterator over the dictionary’s (key, value) pairs.

Now take the following example to see the difference between an iterator over a dict and a view of a dict:

>>> d = {"x":5, "y":3}
>>> iter = d.iteritems()
>>> del d["x"]
>>> for i in iter: print i
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

Whereas a view simply shows you what’s in the dict. It doesn’t care if it changed:

>>> d = {"x":5, "y":3}
>>> v = d.viewitems()
>>> v
dict_items([('y', 3), ('x', 5)])
>>> del d["x"]
>>> v
dict_items([('y', 3)])

A view is simply what the dictionary looks like now. After deleting an entry, the list from .items() would have been out of date and the iterator from .iteritems() would have thrown an error.
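For readers on Python 3, the same three-way comparison can be sketched as follows. The assumptions here are that iter(d.items()) stands in for d.iteritems(), list(d.items()) stands in for the Python 2 list-returning d.items(), and d.items() itself plays the role of d.viewitems().

```python
d = {"x": 5, "y": 3}

snapshot = list(d.items())  # detached copy, like Python 2's items()
it = iter(d.items())        # iterator, like Python 2's iteritems()
view = d.items()            # view, like Python 2's viewitems()

del d["x"]

# The iterator notices that the dictionary changed size and refuses to go on.
try:
    next(it)
    iterator_failed = False
except RuntimeError as exc:
    iterator_failed = True
    print("iterator:", exc)

# The snapshot is out of date: it still lists the deleted entry.
print("snapshot:", snapshot)

# The view simply shows what the dictionary contains now.
view_now = list(view)
print("view:", view_now)
```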


Answer 2

Just from reading the docs I get this impression:

  1. Views are “pseudo-set-like”, in that they don’t support indexing, so what you can do with them is test for membership and iterate over them (because keys are hashable and unique, the keys and items views are more “set-like” in that they don’t contain duplicates).
  2. You can store them and use them multiple times, like the list versions.
  3. Because they reflect the underlying dictionary, any change in the dictionary will change the view, and may change the order of iteration. So unlike the list versions, they’re not “stable”.
  4. Because they reflect the underlying dictionary, they’re almost certainly small proxy objects; copying the keys/values/items would require that they watch the original dictionary somehow and copy it multiple times when changes happen, which would be an absurd implementation. So I would expect very little memory overhead, but access to be a little slower than directly to the dictionary.

So I guess the key use case is if you’re keeping a dictionary around and repeatedly iterating over its keys/items/values with modifications in between. You could just use a view instead, turning for k, v in mydict.iteritems(): into for k, v in myview:. But if you’re just iterating over the dictionary once, I think the iter- versions are still preferable.
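The impressions listed above can be tried out with a small sketch. It assumes Python 3, where keys() and items() return views (viewkeys() and viewitems() in Python 2.7); the dictionaries a and b and the name myview are illustrative only.

```python
a = {"x": 1, "y": 2, "z": 3}
b = {"y": 2, "z": 30, "w": 4}

# Keys and items views behave like sets: they support &, | and -.
common_keys = a.keys() & b.keys()      # keys present in both dicts
only_in_a = a.keys() - b.keys()        # keys present only in a
common_items = a.items() & b.items()   # identical (key, value) pairs

# Views do not support indexing.
try:
    a.keys()[0]
    indexable = True
except TypeError:
    indexable = False

# A stored view can be iterated repeatedly, with modifications in between.
myview = a.items()
first_pass = sorted(myview)
a["x"] = 100
second_pass = sorted(myview)  # same view object, updated contents

print(common_keys, only_in_a, common_items, indexable)
print(first_pass, second_pass)
```

The set operations return ordinary set objects. Values views do not offer these operations, since values need not be unique or hashable.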


Answer 3

The view methods return a view object rather than a list (unlike .keys(), .items() and .values() in Python 2, which each return a new list holding a copy of the data), so a view is more lightweight, and it always reflects the current contents of the dictionary.

From Python 3.0 – dict methods return views – why?

The main reason is that for many use cases returning a completely detached list is unnecessary and wasteful. It would require copying the entire content (which may or may not be a lot).

If you simply want to iterate over the keys then creating a new list is not necessary. And if you indeed need it as a separate list (as a copy) then you can easily create that list from the view.
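Creating the separate list from a view is a single call to list(). A minimal sketch, assuming Python 3 (where d.keys() is a view):

```python
d = {"a": 1, "b": 2}

keys_view = d.keys()          # no copy is made here
keys_copy = list(keys_view)   # explicit, detached copy of the current keys

d["c"] = 3

# The view follows the dictionary; the copy stays as it was.
print(len(keys_view), len(keys_copy))
print("c" in keys_view, "c" in keys_copy)
```

This way the cost of copying is paid only when a detached list is actually needed.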


Answer 4

Views let you access the underlying data structure without copying it. Besides being dynamic, as opposed to creating a list, one of their most useful applications is the in test. Say you want to check whether a value is in the dict or not (be it a key or a value).

Option one is to create a list of the keys using dict.keys(). This works, but obviously consumes more memory. If the dict is very large, that would be wasteful.

With views you can work on the actual data structure, without an intermediate list.

Let’s use an example. I have a dict with 1000 keys made of random strings and digits, and k is the key I want to look for:

large_d = { .. 'NBBDC': '0RMLH', 'E01AS': 'UAZIQ', 'G0SSL': '6117Y', 'LYBZ7': 'VC8JQ' .. }

>>> len(large_d)
1000

# this is one option; It creates the keys() list every time, it's here just for the example
>>> timeit.timeit('k in large_d.keys()', setup='from __main__ import large_d, k', number=1000000)
13.748743600954867


# now let's create the list first; only then check for containment
>>> list_keys = large_d.keys()
>>> timeit.timeit('k in list_keys', setup='from __main__ import large_d, k, list_keys', number=1000000)
8.874809793833492


# this saves us ~5 seconds. Great!
# let's try the views now
>>> timeit.timeit('k in large_d.viewkeys()', setup='from __main__ import large_d, k', number=1000000)
0.08828549011070663

# How about saving another 8.5 seconds?

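As you can see, testing membership against a view object gives a huge boost in performance while reducing memory overhead at the same time, because the view’s in test uses the dictionary’s hash lookup rather than a linear scan of a list. You should use views when you need to perform set-like operations.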

Note: I’m running on Python 2.7
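A comparable measurement can be reproduced on Python 3 with the sketch below. Several things here are assumptions rather than part of the original run: the dictionary is generated with a seeded random module, random_key is a made-up helper, list(large_d) stands in for the Python 2 list-returning keys(), and large_d.keys() stands in for viewkeys(). Timings vary by machine, so no numbers are quoted.

```python
import random
import string
import timeit

random.seed(0)  # fixed seed so the generated keys are reproducible
ALPHABET = string.ascii_uppercase + string.digits


def random_key(length=5):
    """Hypothetical helper: a random string of uppercase letters and digits."""
    return "".join(random.choice(ALPHABET) for _ in range(length))


# Build a dict with exactly 1000 distinct random keys.
large_d = {}
while len(large_d) < 1000:
    large_d[random_key()] = random_key()

k = list(large_d)[-1]       # last key: worst case for a linear list scan
list_keys = list(large_d)   # list copy, like Python 2's keys()

# Membership in a list scans element by element.
t_list = timeit.timeit(lambda: k in list_keys, number=10000)
# Membership in a keys view is a hash lookup in the dict.
t_view = timeit.timeit(lambda: k in large_d.keys(), number=10000)
# Testing the dict directly is the usual idiom and is also a hash lookup.
t_dict = timeit.timeit(lambda: k in large_d, number=10000)

print("list: %.4f s, view: %.4f s, dict: %.4f s" % (t_list, t_view, t_dict))
```

When only key membership is needed, k in large_d gives the same answer as the view-based test without creating a view object at all.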