Question: Convert two lists into a dictionary

Imagine that you have:

keys = ['name', 'age', 'food']
values = ['Monty', 42, 'spam']

What is the simplest way to produce the following dictionary?

a_dict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}

Answer 0

Like this:

>>> keys = ['a', 'b', 'c']
>>> values = [1, 2, 3]
>>> dictionary = dict(zip(keys, values))
>>> print(dictionary)
{'a': 1, 'b': 2, 'c': 3}

Voila :-) The pairwise dict constructor and zip function are awesomely useful: https://docs.python.org/3/library/functions.html#func-dict


Answer 1

Imagine that you have:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

What is the simplest way to produce the following dictionary?

dict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}

Most performant, dict constructor with zip

new_dict = dict(zip(keys, values))

In Python 3, zip now returns a lazy iterator, and this is now the most performant approach.

dict(zip(keys, values)) does require the one-time global lookup each for dict and zip, but it doesn’t form any unnecessary intermediate data-structures or have to deal with local lookups in function application.

Runner-up, dict comprehension:

A close runner-up to using the dict constructor is to use the native syntax of a dict comprehension (not a list comprehension, as others have mistakenly put it):

new_dict = {k: v for k, v in zip(keys, values)}

Choose this when you need to map or filter based on the keys or value.
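As a minimal sketch of that case, the comprehension below transforms the keys and drops some pairs in a single pass. The upper-casing and the "strings only" filter are illustrative assumptions, not part of the original question:

```python
keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

# Hypothetical rules: upper-case every key, keep only string values.
filtered = {k.upper(): v for k, v in zip(keys, values) if isinstance(v, str)}

print(filtered)  # {'NAME': 'Monty', 'FOOD': 'spam'}
```

When no mapping or filtering is needed, dict(zip(keys, values)) says the same thing with less syntax.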

In Python 2, zip returns a list. To avoid creating an unnecessary list, use izip instead (aliasing it to zip can reduce code changes when you move to Python 3).

from itertools import izip as zip

So that is still (2.7):

new_dict = {k: v for k, v in zip(keys, values)}

Python 2, ideal for <= 2.6

izip from itertools becomes zip in Python 3. izip is better than zip for Python 2 (because it avoids the unnecessary list creation), and ideal for 2.6 or below:

from itertools import izip
new_dict = dict(izip(keys, values))

Result for all cases:

In all cases:

>>> new_dict
{'age': 42, 'name': 'Monty', 'food': 'spam'}

Explanation:

If we look at the help on dict we see that it takes a variety of forms of arguments:


>>> help(dict)

class dict(object)
 |  dict() -> new empty dictionary
 |  dict(mapping) -> new dictionary initialized from a mapping object's
 |      (key, value) pairs
 |  dict(iterable) -> new dictionary initialized as if via:
 |      d = {}
 |      for k, v in iterable:
 |          d[k] = v
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs
 |      in the keyword argument list.  For example:  dict(one=1, two=2)

The optimal approach is to use an iterable while avoiding creating unnecessary data structures. In Python 2, zip creates an unnecessary list:

>>> zip(keys, values)
[('name', 'Monty'), ('age', 42), ('food', 'spam')]

In Python 3, the equivalent would be:

>>> list(zip(keys, values))
[('name', 'Monty'), ('age', 42), ('food', 'spam')]

and Python 3’s zip merely creates an iterable object:

>>> zip(keys, values)
<zip object at 0x7f0e2ad029c8>

Since we want to avoid creating unnecessary data structures, we usually want to avoid Python 2’s zip (since it creates an unnecessary list).
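One consequence of the lazy Python 3 zip is worth seeing once: the zip object is a one-shot iterator, so it can only feed a single dict call. A small sketch (the variable names are illustrative):

```python
keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

pairs = zip(keys, values)  # Python 3: no intermediate list is built here
first = dict(pairs)        # consumes every pair from the iterator
second = dict(pairs)       # the iterator is already exhausted

print(first)   # {'name': 'Monty', 'age': 42, 'food': 'spam'}
print(second)  # {}
```

If the pairs are needed more than once, call zip again or materialize them with list(zip(keys, values)).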

Less performant alternatives:

This is a generator expression being passed to the dict constructor:

generator_expression = ((k, v) for k, v in zip(keys, values))
dict(generator_expression)

or equivalently:

dict((k, v) for k, v in zip(keys, values))

And this is a list comprehension being passed to the dict constructor:

dict([(k, v) for k, v in zip(keys, values)])

In the first two cases, an extra layer of non-operative (thus unnecessary) computation is placed over the zip iterable, and in the case of the list comprehension, an extra list is unnecessarily created. I would expect all of them to be less performant, and certainly not more-so.

Performance review:

In 64 bit Python 3.8.2 provided by Nix, on Ubuntu 16.04, ordered from fastest to slowest:

>>> min(timeit.repeat(lambda: dict(zip(keys, values))))
0.6695233230129816
>>> min(timeit.repeat(lambda: {k: v for k, v in zip(keys, values)}))
0.6941362579818815
>>> min(timeit.repeat(lambda: {keys[i]: values[i] for i in range(len(keys))}))
0.8782548159942962
>>> 
>>> min(timeit.repeat(lambda: dict([(k, v) for k, v in zip(keys, values)])))
1.077607496001292
>>> min(timeit.repeat(lambda: dict((k, v) for k, v in zip(keys, values))))
1.1840861019445583

dict(zip(keys, values)) wins even with small sets of keys and values, but for larger sets, the differences in performance will become greater.

A commenter said:

min seems like a bad way to compare performance. Surely mean and/or max would be much more useful indicators for real usage.

We use min because these algorithms are deterministic. We want to know the performance of the algorithms under the best conditions possible.

If the operating system hangs for any reason, it has nothing to do with what we’re trying to compare, so we need to exclude those kinds of results from our analysis.

If we used mean, those kinds of events would skew our results greatly, and if we used max we would only get the most extreme result, the one most likely affected by such an event.
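To look at the three statistics side by side, something like the following can be run. It is only a sketch: repeat=5 and number=100_000 are arbitrary values chosen to keep the run short, and the actual timings depend on the machine.

```python
import statistics
import timeit

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

# Each entry in `timings` is the total time for `number` calls.
timings = timeit.repeat(lambda: dict(zip(keys, values)), repeat=5, number=100_000)

best = min(timings)                 # least disturbed run
average = statistics.mean(timings)  # pulled upward by any slow outlier
worst = max(timings)                # the run most affected by interference

print(f"min={best:.4f}s mean={average:.4f}s max={worst:.4f}s")
```

The gap between min and max gives a rough idea of how much background noise the measurements picked up.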

A commenter also says:

In python 3.6.8, using mean values, the dict comprehension is indeed still faster, by about 30% for these small lists. For larger lists (10k random numbers), the dict call is about 10% faster.

I presume we mean dict(zip(... with 10k random numbers. That does sound like a fairly unusual use case. It does make sense that the most direct calls would dominate in large datasets, and I wouldn’t be surprised if OS hangs are dominating given how long it would take to run that test, further skewing your numbers. And if you use mean or max I would consider your results meaningless.

Let’s use a more realistic size on our top examples:

import numpy
import timeit
l1 = list(numpy.random.random(100))
l2 = list(numpy.random.random(100))

And we see here that dict(zip(... does indeed run faster for larger datasets by about 20%.

>>> min(timeit.repeat(lambda: {k: v for k, v in zip(l1, l2)}))
9.698965263989521
>>> min(timeit.repeat(lambda: dict(zip(l1, l2))))
7.9965161079890095

Answer 2

Try this:

>>> import itertools
>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> adict = dict(itertools.izip(keys,values))
>>> adict
{'food': 'spam', 'age': 42, 'name': 'Monty'}

In Python 2, it’s also more economical in memory consumption compared to zip.


Answer 3

>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> dict(zip(keys, values))
{'food': 'spam', 'age': 42, 'name': 'Monty'}

Answer 4

You can also use dictionary comprehensions in Python ≥ 2.7:

>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> {k: v for k, v in zip(keys, values)}
{'food': 'spam', 'age': 42, 'name': 'Monty'}

Answer 5

A more natural way is to use a dictionary comprehension:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')
new_dict = {keys[i]: values[i] for i in range(len(keys))}

Answer 6

If you need to transform keys or values before creating a dictionary then a generator expression could be used. Example:

>>> adict = dict((str(k), v) for k, v in zip(['a', 1, 'b'], [2, 'c', 3])) 

Take a look at Code Like a Pythonista: Idiomatic Python.


Answer 7

With Python 3.x, go for a dict comprehension:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

dic = {k:v for k,v in zip(keys, values)}

print(dic)

For more on dict comprehensions, here is another example:

>>> print({i: chr(65+i) for i in range(4)})
{0: 'A', 1: 'B', 2: 'C', 3: 'D'}

Answer 8

For those who need simple code and aren’t familiar with zip:

List1 = ['This', 'is', 'a', 'list']
List2 = ['Put', 'this', 'into', 'dictionary']

This can be done by one line of code:

d = {List1[n]: List2[n] for n in range(len(List1))}

Answer 9

  • 2018-04-18

The best solution is still:

In [92]: keys = ('name', 'age', 'food')
...: values = ('Monty', 42, 'spam')
...: 

In [93]: dt = dict(zip(keys, values))
In [94]: dt
Out[94]: {'age': 42, 'food': 'spam', 'name': 'Monty'}

Transpose it:

    lst = [('name', 'Monty'), ('age', 42), ('food', 'spam')]
    keys, values = zip(*lst)
    In [101]: keys
    Out[101]: ('name', 'age', 'food')
    In [102]: values
    Out[102]: ('Monty', 42, 'spam')
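The same transpose works on an existing dictionary, which gives a quick round-trip check. This is a sketch that assumes Python 3.7+ (where dicts keep insertion order) and a non-empty dictionary, since unpacking the result of zip(*...) into two names fails when there are no items:

```python
a_dict = {'name': 'Monty', 'age': 42, 'food': 'spam'}

# Split the (key, value) pairs back into two parallel tuples.
keys, values = zip(*a_dict.items())

# Zipping them again rebuilds an equal dictionary.
rebuilt = dict(zip(keys, values))

print(keys)     # ('name', 'age', 'food')
print(values)   # ('Monty', 42, 'spam')
print(rebuilt == a_dict)  # True
```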

Answer 10

You can use the code below:

dict(zip(['name', 'age', 'food'], ['Monty', 42, 'spam']))

But make sure that the lists have the same length. If they do not, zip truncates the longer one.
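A short sketch of what that truncation looks like, together with two ways to handle a mismatch. The helper dict_from_lists is a hypothetical name introduced here for illustration; on Python 3.10 and later, zip(keys, values, strict=True) raises ValueError on a length mismatch and could replace the manual check.

```python
from itertools import zip_longest

keys = ['name', 'age', 'food']
values = ['Monty', 42]  # one value short, on purpose

# Plain zip stops at the shorter input, so 'food' is silently dropped.
truncated = dict(zip(keys, values))

# zip_longest pads the shorter input instead of truncating.
padded = dict(zip_longest(keys, values, fillvalue=None))


def dict_from_lists(keys, values):
    """Hypothetical helper: refuse to build a dict from mismatched lists."""
    if len(keys) != len(values):
        raise ValueError(f"{len(keys)} keys but {len(values)} values")
    return dict(zip(keys, values))


print(truncated)  # {'name': 'Monty', 'age': 42}
print(padded)     # {'name': 'Monty', 'age': 42, 'food': None}
```

Whether to pad or to raise depends on whether a missing value is acceptable in the data at hand.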


Answer 11

I had this question while trying to solve a graph-related problem. I needed to define an empty adjacency list and wanted to initialize every node with an empty list. That made me wonder whether a zip operation is worth it compared with simple key-value assignment in a loop. After all, time is often an important tie-breaker. So I ran timeit on both approaches.

import timeit
def dictionary_creation(n_nodes):
    dummy_dict = dict()
    for node in range(n_nodes):
        dummy_dict[node] = []
    return dummy_dict


def dictionary_creation_1(n_nodes):
    keys = list(range(n_nodes))
    values = [[] for i in range(n_nodes)]
    graph = dict(zip(keys, values))
    return graph


def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

n_nodes = 10_000_000  # the size used for the timings reported below
iteration = wrapper(dictionary_creation, n_nodes)
shorthand = wrapper(dictionary_creation_1, n_nodes)

for trials in range(1, 8):
    print(f'Iteration: {timeit.timeit(iteration, number=trials)}\nShorthand: {timeit.timeit(shorthand, number=trials)}')

For n_nodes = 10,000,000 I get:

Iteration: 2.825081646999024 Shorthand: 3.535717916001886

Iteration: 5.051560923002398 Shorthand: 6.255070794999483

Iteration: 6.52859034499852 Shorthand: 8.221581164998497

Iteration: 8.683652416999394 Shorthand: 12.599181543999293

Iteration: 11.587241565001023 Shorthand: 15.27298851100204

Iteration: 14.816342867001367 Shorthand: 17.162912737003353

Iteration: 16.645022411001264 Shorthand: 19.976680120998935

You can see that the plain loop is faster at every step. From the fifth step on, the iteration approach at step n even takes less time than the shorthand approach at step n-1.
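For the adjacency-list use case itself, a dict comprehension is another option that was not part of the timings above. The sketch below also shows why dict.fromkeys with a mutable default is usually not what is wanted here: every key ends up sharing the same list object. The small n_nodes value is only for illustration.

```python
n_nodes = 5  # small illustrative size, not the 10,000,000 used for timing

# Each node gets its own fresh list.
graph = {node: [] for node in range(n_nodes)}
graph[0].append(1)

# dict.fromkeys evaluates [] once, so all keys share a single list.
shared = dict.fromkeys(range(n_nodes), [])
shared[0].append(1)

print(graph[1])   # []
print(shared[1])  # [1]
```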


Answer 12

Here is also an example of using lists as the values in your dictionary:

list1 = ["Name", "Surname", "Age"]
list2 = [["Cyd", "JEDD", "JESS"], ["DEY", "AUDIJE", "PONGARON"], [21, 32, 47]]
dic = dict(zip(list1, list2))
print(dic)

Always make sure that your keys (list1) are the first argument to zip.

{'Name': ['Cyd', 'JEDD', 'JESS'], 'Surname': ['DEY', 'AUDIJE', 'PONGARON'], 'Age': [21, 32, 47]}

Answer 13

Solution as a dictionary comprehension with enumerate:

result = {item: values[index] for index, item in enumerate(keys)}

Solution as a for loop with enumerate:

result = {}
for index, item in enumerate(keys):
    result[item] = values[index]

Answer 14

You may also try with one list which is a combination of two lists ;)

a = [1,2,3,4]
n = [5,6,7,8]

x = []
for i in a,n:
    x.append(i)

print(dict(zip(x[0], x[1])))

Answer 15

A method without the zip function (note that it empties l2 as it goes):

l1 = [1,2,3,4,5]
l2 = ['a','b','c','d','e']
d1 = {}
for l1_ in l1:
    for l2_ in l2:
        d1[l1_] = l2_
        l2.remove(l2_)
        break  

print (d1)


{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
