Question: How do I sort a list of dictionaries by a value of the dictionary?
I have a list of dictionaries and want to sort it by the value of a specific key.
Consider the list below:
[{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]
When sorted by name, it should become:
[{'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}]
Answer 0
It may look cleaner using a key instead of a cmp function:
newlist = sorted(list_to_be_sorted, key=lambda k: k['name'])
or as J.F.Sebastian and others suggested,
from operator import itemgetter
newlist = sorted(list_to_be_sorted, key=itemgetter('name'))
For completeness (as pointed out in the comments by fitzgeraldsteele), add reverse=True to sort in descending order:
newlist = sorted(list_to_be_sorted, key=itemgetter('name'), reverse=True)
Answer 1
import operator
To sort the list of dictionaries by the 'name' key:
list_of_dicts.sort(key=operator.itemgetter('name'))
To sort the list of dictionaries by the 'age' key:
list_of_dicts.sort(key=operator.itemgetter('age'))
Answer 2
my_list = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]
# Python 2 only: Python 3 removed both the comparison-function argument and cmp()
my_list.sort(lambda x,y : cmp(x['name'], y['name']))
my_list will now be what you want.
(3 years later) Edited to add:
The new key argument is more efficient and neater. A better answer now looks like:
my_list = sorted(my_list, key=lambda k: k['name'])
...the lambda is, IMO, easier to understand than operator.itemgetter, but YMMV.
Answer 3
If you want to sort the list by multiple keys you can do the following:
my_list = [{'name':'Homer', 'age':39}, {'name':'Milhouse', 'age':10}, {'name':'Bart', 'age':10} ]
sortedlist = sorted(my_list , key=lambda elem: "%02d %s" % (elem['age'], elem['name']))
It is rather hackish, since it relies on converting the values into a single string representation for comparison. Numbers must be zero-padded to a fixed width for the string order to match the numeric order, and negative numbers will generally not sort correctly this way (for example, "-1 " sorts before "-10").
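A less fragile alternative to the string trick is to use a tuple as the sort key, since Python compares tuples element by element. The sketch below reuses the sample data from this answer; the variable names by_age_then_name and oldest_first are made up for illustration, and negating the age to reverse one field is an assumption that only holds for numeric values.

```python
from operator import itemgetter

my_list = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Milhouse', 'age': 10},
    {'name': 'Bart', 'age': 10},
]

# Tuples compare element by element: age first, then name breaks ties.
by_age_then_name = sorted(my_list, key=itemgetter('age', 'name'))

# Mixed directions: oldest first, names ascending within the same age.
# Negating works only because age is numeric.
oldest_first = sorted(my_list, key=lambda d: (-d['age'], d['name']))

print([d['name'] for d in by_age_then_name])  # ['Bart', 'Milhouse', 'Homer']
print([d['name'] for d in oldest_first])      # ['Homer', 'Bart', 'Milhouse']
```

No padding or formatting is needed here, and negative numbers order correctly because the values are compared as numbers rather than as text.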
Answer 4
import operator
a_list_of_dicts.sort(key=operator.itemgetter('name'))
'key' is used to sort by an arbitrary value, and 'itemgetter' sets that value to each item's 'name' entry.
Answer 5
a = [{'name':'Homer', 'age':39}, ...]
# This changes the list a
a.sort(key=lambda k : k['name'])
# This returns a new list (a is not modified)
sorted(a, key=lambda k : k['name'])
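The difference between the two calls can be checked directly. This is a small sketch using the two-item list from the question as stand-in data (the answer's own list is elided, so the contents here are an assumption); new_list and result are illustrative names.

```python
a = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

new_list = sorted(a, key=lambda k: k['name'])
# The original order is untouched, and the dicts are shared, not copied.
print(a[0]['name'])         # Homer
print(new_list[0] is a[1])  # True

result = a.sort(key=lambda k: k['name'])
print(result)               # None: sort() mutates the list and returns nothing
print(a[0]['name'])         # Bart
```

A common slip is writing a = a.sort(...), which replaces the list with None.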
Answer 6
I guess you meant:
[{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]
This would be sorted like this (Python 2 only, since the cmp argument and the cmp() built-in were removed in Python 3):
sorted(l,cmp=lambda x,y: cmp(x['name'],y['name']))
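On Python 3, a comparison function like the one above can still be used by wrapping it with functools.cmp_to_key. The following is a minimal sketch, assuming the two-item list from the question; compare_names is a hypothetical helper that stands in for the removed cmp() built-in.

```python
from functools import cmp_to_key

people = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

def compare_names(x, y):
    # Stand-in for Python 2's cmp(): negative, zero, or positive.
    return (x['name'] > y['name']) - (x['name'] < y['name'])

by_name = sorted(people, key=cmp_to_key(compare_names))
print([p['name'] for p in by_name])  # ['Bart', 'Homer']
```

For a plain single-field sort a key function is simpler; cmp_to_key is mostly useful when porting existing comparison logic.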
Answer 7
You could use a custom comparison function, or you could pass in a function that calculates a custom sort key. That’s usually more efficient as the key is only calculated once per item, while the comparison function would be called many more times.
You could do it this way:
def mykey(adict): return adict['name']
x = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age':10}]
sorted(x, key=mykey)
But the standard library contains a generic routine for getting items of arbitrary objects: itemgetter. So try this instead:
from operator import itemgetter
x = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age':10}]
sorted(x, key=itemgetter('name'))
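The efficiency point made at the start of this answer can be illustrated by counting calls. This is only a sketch: the 1000 random items, the fixed seed, and the counting_key and counting_cmp helpers are all assumptions made for the demonstration, and functools.cmp_to_key is used because Python 3 no longer accepts a comparison function directly.

```python
from functools import cmp_to_key
import random

random.seed(0)  # fixed seed so runs are repeatable
items = [{'name': str(random.random())} for _ in range(1000)]

calls = {'key': 0, 'cmp': 0}

def counting_key(d):
    calls['key'] += 1
    return d['name']

def counting_cmp(x, y):
    calls['cmp'] += 1
    return (x['name'] > y['name']) - (x['name'] < y['name'])

sorted(items, key=counting_key)
sorted(items, key=cmp_to_key(counting_cmp))

print(calls['key'])  # 1000: the key function runs once per item
print(calls['cmp'])  # one call per comparison, which is far more for shuffled input
```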
Answer 8
Using Schwartzian transform from Perl,
py = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]
do
sort_on = "name"
decorated = [(dict_[sort_on], dict_) for dict_ in py]
decorated.sort()
result = [dict_ for (key, dict_) in decorated]
gives
>>> result
[{'age': 10, 'name': 'Bart'}, {'age': 39, 'name': 'Homer'}]
More on Perl Schwartzian transform
In computer science, the Schwartzian transform is a Perl programming
idiom used to improve the efficiency of sorting a list of items. This
idiom is appropriate for comparison-based sorting when the ordering is
actually based on the ordering of a certain property (the key) of the
elements, where computing that property is an intensive operation that
should be performed a minimal number of times. The Schwartzian
Transform is notable in that it does not use named temporary arrays.
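One caveat with the decorate-sort-undecorate code above: if two items share the same key, the tuple comparison falls through to the dicts themselves, and Python 3 raises a TypeError because dicts do not support ordering. A common workaround, sketched here with an extra duplicate-name entry added as an assumption, is to include the original index in each tuple.

```python
py = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Bart', 'age': 10},
    {'name': 'Bart', 'age': 8},  # duplicate key, added for illustration
]
sort_on = 'name'

# Decorate with (key, original position, dict); the position settles ties
# before Python ever tries to compare two dicts.
decorated = [(d[sort_on], i, d) for i, d in enumerate(py)]
decorated.sort()
result = [d for (_key, _i, d) in decorated]

print([d['age'] for d in result])  # [10, 8, 39]
```

The index also keeps items with equal keys in their original relative order, which matches what sorted() with a key function does on its own.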
Answer 9
Answer 10
Sometimes we need to use lower() so that the sort ignores case, for example:
lists = [{'name':'Homer', 'age':39},
{'name':'Bart', 'age':10},
{'name':'abby', 'age':9}]
lists = sorted(lists, key=lambda k: k['name'])
print(lists)
# [{'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}, {'name':'abby', 'age':9}]
lists = sorted(lists, key=lambda k: k['name'].lower())
print(lists)
# [ {'name':'abby', 'age':9}, {'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}]
Answer 11
Here is an alternative general solution: it sorts the dicts by their sorted (key, value) pairs.
Its advantage is that there is no need to specify keys, and it still works if some keys are missing from some of the dictionaries (as long as the values being compared are mutually comparable).
def sort_key_func(item):
    """ helper function used to sort list of dicts
    :param item: dict
    :return: sorted list of tuples (k, v)
    """
    pairs = []
    for k, v in item.items():
        pairs.append((k, v))
    return sorted(pairs)
sorted(A, key=sort_key_func)
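If the goal is still to sort by one specific key while tolerating dicts that lack it, dict.get with a default is another option. This is a sketch under assumptions: the records list is invented for illustration, and placing the incomplete entries last is a design choice, not something the answer specifies.

```python
records = [
    {'name': 'Homer', 'age': 39},
    {'age': 10},                  # no 'name' key
    {'name': 'Bart', 'age': 10},
]

# The first tuple element is False for dicts that have 'name' and True for
# those that do not, so incomplete dicts sort after the complete ones.
by_name = sorted(records, key=lambda d: ('name' not in d, d.get('name', '')))

print(by_name)
# [{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}, {'age': 10}]
```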
Answer 12
Using the pandas package is another method, though its runtime at large scale is much slower than the more traditional methods proposed by others:
import pandas as pd
listOfDicts = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]
df = pd.DataFrame(listOfDicts)
df = df.sort_values('name')
sorted_listOfDicts = df.T.to_dict().values()
Here are some benchmark values for a tiny list and a large (100k+) list of dicts:
setup_large = "listOfDicts = [];\
[listOfDicts.extend(({'name':'Homer', 'age':39}, {'name':'Bart', 'age':10})) for _ in range(50000)];\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listOfDicts);"
setup_small = "listOfDicts = [];\
listOfDicts.extend(({'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}));\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listOfDicts);"
method1 = "newlist = sorted(listOfDicts, key=lambda k: k['name'])"
method2 = "newlist = sorted(listOfDicts, key=itemgetter('name')) "
method3 = "df = df.sort_values('name');\
sorted_listOfDicts = df.T.to_dict().values()"
import timeit
t = timeit.Timer(method1, setup_small)
print('Small Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_small)
print('Small Method LC2: ' + str(t.timeit(100)))
t = timeit.Timer(method3, setup_small)
print('Small Method Pandas: ' + str(t.timeit(100)))
t = timeit.Timer(method1, setup_large)
print('Large Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_large)
print('Large Method LC2: ' + str(t.timeit(100)))
t = timeit.Timer(method3, setup_large)
print('Large Method Pandas: ' + str(t.timeit(1)))
#Small Method LC: 0.000163078308105
#Small Method LC2: 0.000134944915771
#Small Method Pandas: 0.0712950229645
#Large Method LC: 0.0321750640869
#Large Method LC2: 0.0206089019775
#Large Method Pandas: 5.81405615807
Answer 13
If you do not need the original list of dictionaries, you could modify it in-place with the sort() method using a custom key function.
Key function:
def get_name(d):
    """ Return the value of a key in a dictionary. """
    return d["name"]
The list to be sorted:
data_one = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
Sorting it in-place:
data_one.sort(key=get_name)
If you need the original list, call the sorted() function, passing it the list and the key function, then assign the returned sorted list to a new variable:
data_two = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
new_data = sorted(data_two, key=get_name)
Printing data_one and new_data:
>>> print(data_one)
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
>>> print(new_data)
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
Answer 14
Let's say I have a dictionary D with the elements below. To sort its items by value, use the key argument of sorted to pass a custom function:
D = {'eggs': 3, 'ham': 1, 'spam': 2}
def get_count(pair):
    # pair is a (key, value) tuple; sort on the value
    return pair[1]
sorted(D.items(), key = get_count, reverse=True)
# or
sorted(D.items(), key = lambda x: x[1], reverse=True) # avoiding get_count function call
Answer 15
I have been a big fan of using a lambda as the key; however, it is not the best option if you care about execution time.
First option
sorted_list = sorted(list_to_sort, key= lambda x: x['name'])
# returns a new sorted list
Second option
import operator
list_to_sort.sort(key=operator.itemgetter('name'))
# sorts the list in place, does not return a new list
Fast comparison of exec times
# First option
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" "sorted_l = sorted(list_to_sort, key=lambda e: e['name'])"
1000000 loops, best of 3: 0.736 usec per loop
# Second option
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" -s "import operator" "list_to_sort.sort(key=operator.itemgetter('name'))"
1000000 loops, best of 3: 0.438 usec per loop
Answer 16
If performance is a concern, I would use operator.itemgetter instead of lambda, as built-in functions perform faster than hand-crafted functions. The itemgetter function seems to perform approximately 20% faster than lambda based on my testing.
From https://wiki.python.org/moin/PythonSpeed:
Likewise, the builtin functions run faster than hand-built equivalents. For example, map(operator.add, v1, v2) is faster than map(lambda x,y: x+y, v1, v2).
Here is a comparison of sorting speed using lambda vs itemgetter.
import operator
import random
import string
# create a list of 100 dicts with random 8-letter names and random ages from 0 to 100.
l = [{'name': ''.join(random.choices(string.ascii_lowercase, k=8)), 'age': random.randint(0, 100)} for i in range(100)]
# Test the performance with a lambda function sorting on name
%timeit sorted(l, key=lambda x: x['name'])
13 µs ± 388 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
# Test the performance with itemgetter sorting on name
%timeit sorted(l, key=operator.itemgetter('name'))
10.7 µs ± 38.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
# Check that each technique produces same sort order
sorted(l, key=lambda x: x['name']) == sorted(l, key=operator.itemgetter('name'))
True
Both techniques sort the list in the same order (verified by execution of the final statement in the code block) but one is a little faster.
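The %timeit lines above are IPython magics. Outside IPython, roughly the same comparison can be run with the standard timeit module, as in the sketch below. The names rows, lambda_time, and getter_time and the run count of 10000 are arbitrary choices for illustration, and the absolute numbers will vary from machine to machine.

```python
import operator
import random
import string
import timeit

# 100 dicts with random 8-letter names and random ages from 0 to 100.
rows = [
    {'name': ''.join(random.choices(string.ascii_lowercase, k=8)),
     'age': random.randint(0, 100)}
    for _ in range(100)
]

# Each call times 10000 complete sorts of the same list.
lambda_time = timeit.timeit(
    lambda: sorted(rows, key=lambda x: x['name']), number=10000)
getter_time = timeit.timeit(
    lambda: sorted(rows, key=operator.itemgetter('name')), number=10000)

print(f'lambda:     {lambda_time:.4f} s for 10000 runs')
print(f'itemgetter: {getter_time:.4f} s for 10000 runs')
```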
Answer 17
You may use the following code, which sorts the items of a dictionary by value:
sorted_dct = sorted(dct_name.items(), key = lambda x : x[1])