How do I iterate over a dictionary in sorted key order in Python?

There’s an existing function that ends in the following, where d is a dictionary:

return d.iteritems()

that returns an unsorted iterator for a given dictionary. I would like to return an iterator that goes through the items sorted by key. How do I do that?


Answer 0

I haven’t tested this extensively, but it works in Python 2.5.2.

>>> d = {"x":2, "h":15, "a":2222}
>>> it = iter(sorted(d.iteritems()))
>>> it.next()
('a', 2222)
>>> it.next()
('h', 15)
>>> it.next()
('x', 2)
>>>

If you are used to writing for key, value in d.iteritems(): ... rather than working with an iterator directly, the same approach still works:

>>> d = {"x":2, "h":15, "a":2222}
>>> for key, value in sorted(d.iteritems()):
...     print(key, value)
('a', 2222)
('h', 15)
('x', 2)
>>>

In Python 3.x, use d.items() instead of d.iteritems() (it returns a view rather than an iterator, which sorted() accepts just the same), and next(it) instead of it.next().
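For readers on Python 3, the interactive session above could be sketched roughly as follows. The helper name sorted_items is made up for this illustration and is not part of the original answer.

```python
def sorted_items(d):
    """Return an iterator over the (key, value) pairs of d, ordered by key."""
    # sorted() builds the full list first; iter() wraps it so callers can use next().
    return iter(sorted(d.items()))


d = {"x": 2, "h": 15, "a": 2222}
it = sorted_items(d)
first = next(it)   # Python 3 spelling of it.next()
rest = list(it)    # the remaining pairs, still in key order
```

Because dictionary keys are unique, the tuple comparison done by sorted() is decided by the keys alone, so the values never need to be comparable.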


Answer 1

Use the sorted() function:

return sorted(dict.iteritems())

If you want an actual iterator over the sorted results, since sorted() returns a list, use:

return iter(sorted(dict.iteritems()))

Answer 2

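A dict’s keys are stored in a hash table, so their ‘natural order’ is the hash-table order, which is effectively pseudo-random. (This holds for Python 2; since Python 3.7, dicts preserve insertion order instead.) Any other ordering is a concept of the consumer of the dict.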

sorted() always returns a list, not a dict. If you pass it dict.items() (which, in Python 2, produces a list of tuples), it returns a list of tuples [(k1, v1), (k2, v2), …] that can be used in a loop in a way very much like a dict, but it is not in any way a dict!

foo = {
    'a':    1,
    'b':    2,
    'c':    3,
    }

>>> print foo
{'a': 1, 'c': 3, 'b': 2}

>>> print foo.items()
[('a', 1), ('c', 3), ('b', 2)]

>>> print sorted(foo.items())
[('a', 1), ('b', 2), ('c', 3)]

The following feels like looping over a dict, but it is not: it is a list of tuples being unpacked into k, v:

for k,v in sorted(foo.items()):
    print k, v

Roughly equivalent to:

for k in sorted(foo.keys()):
    print k, foo[k]
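As a quick check of the "roughly equivalent" claim, here is a small sketch written for Python 3 (an assumption; the answer itself uses Python 2 print statements). The names by_items and by_keys are chosen only for this illustration.

```python
foo = {"a": 1, "c": 3, "b": 2}

# Sort the (key, value) tuples directly.
by_items = sorted(foo.items())

# Sort only the keys, then look each value up again.
by_keys = [(k, foo[k]) for k in sorted(foo)]

same_pairs = by_items == by_keys
result_is_list = isinstance(by_items, list)
```

Both forms produce the same pairs in the same order; the second one only ever compares keys, at the cost of one extra lookup per key.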

Answer 3

Greg’s answer is right. Note that in Python 3.0 you’ll have to do

sorted(dict.items())

as iteritems will be gone.


Answer 4

You can now use OrderedDict in Python 2.7 as well:

>>> from collections import OrderedDict
>>> d = OrderedDict([('first', 1),
...                  ('second', 2),
...                  ('third', 3)])
>>> d.items()
[('first', 1), ('second', 2), ('third', 3)]

See the “What’s New in Python 2.7” page and the OrderedDict API documentation for details.
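Note that OrderedDict remembers insertion order; it does not sort by key on its own. One way to connect it to the question is to fill it from already sorted items, sketched below for Python 3. The variable names are illustrative only.

```python
from collections import OrderedDict

d = {"x": 2, "h": 15, "a": 2222}

# Insert the pairs in key order so that iteration follows that order.
ordered = OrderedDict(sorted(d.items()))
keys_in_order = list(ordered)

# A key added later goes to the end; it is not sorted into place.
ordered["b"] = 7
keys_after_insert = list(ordered)
```

The result is therefore a sorted snapshot: it stays in key order only as long as no new keys are added.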


Answer 5

In general, one may iterate over a dict in sorted key order like so:

for k in sorted(d):
    print k, d[k]

For the specific case in the question, which needs a “drop-in replacement” for d.iteritems(), add a function like:

def sortdict(d, **opts):
    # **opts so any currently supported sorted() options can be passed
    for k in sorted(d, **opts):
        yield k, d[k]

and so the ending line changes from

return dict.iteritems()

to

return sortdict(dict)

or

return sortdict(dict, reverse = True)
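To show what passing sorted() options through **opts can look like, here is a sketch that runs under Python 3 (an assumption; the generator itself is unchanged from the answer). The sample dictionary and result names are invented for the example.

```python
def sortdict(d, **opts):
    # **opts is forwarded to sorted(), so key= and reverse= both work.
    for k in sorted(d, **opts):
        yield k, d[k]


d = {"b": 1, "a": 2, "C": 3}

default_order = list(sortdict(d))                    # uppercase sorts before lowercase
case_insensitive = list(sortdict(d, key=str.lower))  # compare keys ignoring case
descending = list(sortdict(d, reverse=True))         # largest key first
```

Since sortdict is a generator, nothing is sorted until the first item is requested.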

Answer 6

>>> import heapq
>>> d = {"c": 2, "b": 9, "a": 4, "d": 8}
>>> def iter_sorted(d):
...     keys = list(d)
...     heapq.heapify(keys)  # transforms the list into a heap in O(N) time
...     while keys:
...         k = heapq.heappop(keys)  # each pop takes O(log N) time
...         yield (k, d[k])
...
>>> i = iter_sorted(d)
>>> for x in i:
...     print x
...
('a', 4)
('b', 9)
('c', 2)
('d', 8)

This method still performs an O(N log N) sort overall. However, after a short linear-time heapify, it yields the items in sorted order as it goes, which makes it theoretically more efficient when you do not always need the whole list.
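The benefit shows up when only the first few items are consumed. A sketch for Python 3, assuming itertools.islice is used to stop early (the answer does not prescribe any particular way of doing that):

```python
import heapq
from itertools import islice


def iter_sorted(d):
    keys = list(d)
    heapq.heapify(keys)              # O(N) to build the heap
    while keys:
        k = heapq.heappop(keys)      # O(log N) per item actually consumed
        yield k, d[k]


d = {"c": 2, "b": 9, "a": 4, "d": 8}

# Only two pops are performed; the other keys are never fully ordered.
first_two = list(islice(iter_sorted(d), 2))

# The standard library offers a comparable shortcut for the keys alone.
smallest_keys = heapq.nsmallest(2, d)
```

Whether this beats a plain sorted() call in practice depends on the dictionary size and on how many items are taken, so it is worth measuring before relying on it.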


Answer 7

If you want to iterate in the order in which items were inserted instead of in the order of the keys, you should have a look at Python’s collections.OrderedDict. (Available in Python 2.7 and in Python 3.1 and later.)


Answer 8

sorted returns a list rather than an iterator, which is presumably why you get an error when you treat its result as one; but because you can’t order a dict, you will have to deal with a list.

I have no idea what the larger context of your code is, but you could try wrapping the resulting list in an iterator, like this:

return iter(sorted(dict.iteritems()))

Of course, you will be getting back tuples now, because sorted turned your dict into a list of tuples.

For example, say your dict was {'a':1,'c':3,'b':2}. sorted turns it into a list:

[('a',1),('b',2),('c',3)]

So when you actually iterate over the list, you get back (in this example) a tuple composed of a string and an integer, but at least you will be able to iterate over it.


Answer 9

Assuming you are using CPython 2.x and have a large dictionary mydict, then using sorted(mydict) is going to be slow because sorted builds a sorted list of the keys of mydict.

In that case you might want to look at my ordereddict package, which includes a C implementation of sorteddict. It helps especially if you have to go over the sorted list of keys multiple times at different stages (i.e. with different numbers of elements) of the dictionary’s lifetime.

http://anthon.home.xs4all.nl/Python/ordereddict/