Tag archive: dictionary

Reverse / invert a dictionary mapping

Question: Reverse / invert a dictionary mapping

Given a dictionary like so:

my_map = {'a': 1, 'b': 2}

How can one invert this map to get:

inv_map = {1: 'a', 2: 'b'}

Answer 0

For Python 2.7.x:

inv_map = {v: k for k, v in my_map.iteritems()}

For Python 3+:

inv_map = {v: k for k, v in my_map.items()}
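
As a quick check of the Python 3 form, here is a minimal sketch. The second dictionary, dup_map, is not from the answer; it is an illustrative assumption added to show what happens when values repeat, in which case the key seen last silently wins.

```python
# Invert a one-to-one mapping with a dict comprehension (Python 3).
my_map = {'a': 1, 'b': 2}
inv_map = {v: k for k, v in my_map.items()}
print(inv_map)  # {1: 'a', 2: 'b'}

# Caveat: if two keys share a value, the key seen last wins.
dup_map = {'a': 1, 'b': 2, 'c': 1}  # hypothetical example data
inv_dup = {v: k for k, v in dup_map.items()}
print(inv_dup)  # {1: 'c', 2: 'b'}
```

If losing keys this way is not acceptable, group the keys into lists instead of overwriting them.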

Answer 1

Assuming that the values in the dict are unique:

dict((v, k) for k, v in my_map.iteritems())

Answer 2

If the values in my_map aren’t unique:

inv_map = {}
for k, v in my_map.iteritems():
    inv_map[v] = inv_map.get(v, [])
    inv_map[v].append(k)
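
The loop above uses iteritems(), which exists only in Python 2. A minimal Python 3 sketch of the same grouping idea follows; the function name invert_grouping and the sample data are assumptions made for illustration, not part of the answer.

```python
def invert_grouping(mapping):
    """Map each value to the list of keys that carry it."""
    inverted = {}
    for key, value in mapping.items():
        # setdefault returns the existing list for value, or stores a new one.
        inverted.setdefault(value, []).append(key)
    return inverted

result = invert_grouping({'a': 1, 'b': 2, 'c': 1})
print(result)  # {1: ['a', 'c'], 2: ['b']}
```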

Answer 3

To do this while preserving the type of your mapping (assuming that it is a dict or a dict subclass):

def inverse_mapping(f):
    return f.__class__(map(reversed, f.items()))
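
To see the type being preserved, here is a small sketch that applies the function to a collections.OrderedDict. The sample data is an assumption for illustration.

```python
from collections import OrderedDict

def inverse_mapping(f):
    # Rebuild the same mapping class from the reversed (value, key) pairs.
    return f.__class__(map(reversed, f.items()))

ordered = OrderedDict([('a', 1), ('b', 2)])
inverted = inverse_mapping(ordered)
print(type(inverted).__name__, list(inverted.items()))
```

This relies on the class accepting an iterable of pairs as its only constructor argument. A subclass with a different constructor signature, such as collections.defaultdict, would need special handling.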

Answer 4

Try this:

inv_map = dict(zip(my_map.values(), my_map.keys()))

(Note that the Python docs on dictionary views explicitly guarantee that .keys() and .values() have their elements in the same order, which allows the approach above to work.)

Alternatively:

inv_map = dict((my_map[k], k) for k in my_map)

or, using the dict comprehensions available since Python 3.0:

inv_map = {my_map[k] : k for k in my_map}

Answer 5

Another, more functional, way:

my_map = { 'a': 1, 'b':2 }
dict(map(reversed, my_map.items()))

Answer 6

This expands upon the answer by Robert, applying to when the values in the dict aren’t unique.

class ReversibleDict(dict):

    def reversed(self):
        """
        Return a reversed dict, with common values in the original dict
        grouped into a list in the returned dict.

        Example:
        >>> d = ReversibleDict({'a': 3, 'c': 2, 'b': 2, 'e': 3, 'd': 1, 'f': 2})
        >>> d.reversed()
        {1: ['d'], 2: ['c', 'b', 'f'], 3: ['a', 'e']}
        """

        revdict = {}
        for k, v in self.iteritems():
            revdict.setdefault(v, []).append(k)
        return revdict

The implementation is limited in that you cannot call reversed twice and get the original back, so it is not symmetric. It was tested with Python 2.6.

If you’d rather use a set than a list, and there could exist unordered applications for which this makes sense, instead of setdefault(v, []).append(k), use setdefault(v, set()).add(k).


Answer 7

We can also invert a dictionary with duplicate values using defaultdict:

from collections import Counter, defaultdict

def invert_dict(d):
    d_inv = defaultdict(list)
    for k, v in d.items():
        d_inv[v].append(k)
    return d_inv

text = 'aaa bbb ccc ddd aaa bbb ccc aaa' 
c = Counter(text.split()) # Counter({'aaa': 3, 'bbb': 2, 'ccc': 2, 'ddd': 1})
dict(invert_dict(c)) # {1: ['ddd'], 2: ['bbb', 'ccc'], 3: ['aaa']}  

As the Python documentation for collections.defaultdict notes, this technique is simpler and faster than an equivalent technique using dict.setdefault().


Answer 8

For instance, you have the following dictionary:

my_map = {'a': 'fire', 'b': 'ice', 'c': 'fire', 'd': 'water'}

And you want to get it in this inverted form:

inverted_dict = {'fire': ['a', 'c'], 'ice': ['b'], 'water': ['d']}

First solution. To invert the key-value pairs of a dictionary whose values are not unique, use a for loop:

# Use this code to invert dictionaries that have non-unique values

inverted_dict = {}
for key, value in my_map.items():
    inverted_dict.setdefault(value, []).append(key)

Second solution. Use a dictionary comprehension for the inversion:

# Use this code to invert dictionaries that have unique values

inverted_dict = {value: key for key, value in my_map.items()}

Third solution. To revert the inversion, that is, to invert a dictionary whose values are lists (such as the result of the first solution):

# Use this code to invert dictionaries that have lists of values

my_map = {value: key for key in inverted_dict for value in inverted_dict[key]}

Answer 9

A combination of list and dictionary comprehensions. It can handle duplicate values:

{v:[i for i in d.keys() if d[i] == v ] for k,v in d.items()}

Answer 10

If the values aren't unique, and you're a little hardcore (Python 2 only, since the lambda relies on tuple parameter unpacking):

inv_map = dict(
    (v, [k for (k, xx) in filter(lambda (key, value): value == v, my_map.items())])
    for v in set(my_map.values())
)

Especially for a large dict, note that this solution is far less efficient than the single-pass answers on this page, because it loops over items() once for every distinct value.


Answer 11

In addition to the other functions suggested above, if you like lambdas:

invert = lambda mydict: {v:k for k, v in mydict.items()}

Or, you could do it this way too:

invert = lambda mydict: dict( zip(mydict.values(), mydict.keys()) )

Answer 12

I think the best way to do this is to define a class. Here is an implementation of a “symmetric dictionary”:

class SymDict:
    def __init__(self):
        self.aToB = {}
        self.bToA = {}

    def assocAB(self, a, b):
        # Stores the binding and returns a tuple (currA, currB) of the overwritten bindings
        currB = None
        if a in self.aToB: currB = self.aToB[a]  # value previously bound to a
        currA = None
        if b in self.bToA: currA = self.bToA[b]  # key previously bound to b

        self.aToB[a] = b
        self.bToA[b] = a
        return (currA, currB)

    def lookupA(self, a):
        if a in self.aToB:
            return self.aToB[a]
        return None

    def lookupB(self, b):
        if b in self.bToA:
            return self.bToA[b]
        return None

Deletion and iteration methods are easy enough to implement if they’re needed.

This implementation is way more efficient than inverting an entire dictionary (which seems to be the most popular solution on this page). Not to mention, you can add or remove values from your SymDict as much as you want, and your inverse-dictionary will always stay valid — this isn’t true if you simply reverse the entire dictionary once.
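
The deletion and iteration methods mentioned above could look roughly like the following. This is a hedged sketch, not the answer's own code: the class name TwoWayDict and its method names are hypothetical, and it assumes a strictly one-to-one mapping, so rebinding a key or a value evicts whatever it was paired with before.

```python
class TwoWayDict:
    """Hypothetical sketch of a one-to-one mapping that keeps both directions in sync."""

    def __init__(self):
        self.a_to_b = {}
        self.b_to_a = {}

    def assoc(self, a, b):
        # Drop any bindings that a or b already take part in, so that no
        # stale entry survives in either direction.
        self.remove_a(a)
        self.remove_b(b)
        self.a_to_b[a] = b
        self.b_to_a[b] = a

    def remove_a(self, a):
        if a in self.a_to_b:
            b = self.a_to_b.pop(a)
            del self.b_to_a[b]

    def remove_b(self, b):
        if b in self.b_to_a:
            a = self.b_to_a.pop(b)
            del self.a_to_b[a]

    def __iter__(self):
        # Iterate over (a, b) pairs.
        return iter(self.a_to_b.items())

pairs = TwoWayDict()
pairs.assoc('x', 1)
pairs.assoc('y', 2)
pairs.assoc('x', 2)   # rebinding x to 2 evicts both ('x', 1) and ('y', 2)
print(list(pairs))    # [('x', 2)]
```

Evicting old partners on every assoc call is one possible design choice; an alternative would be to raise an error when a binding would be overwritten.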


Answer 13

This handles non-unique values and retains much of the look of the unique case.

inv_map = {v:[k for k in my_map if my_map[k] == v] for v in my_map.itervalues()}

For Python 3.x, replace itervalues with values.


Answer 14

The function is symmetric for values of type list; tuples are converted to lists when performing reverse_dict(reverse_dict(dictionary)):

def reverse_dict(dictionary):
    reverse_dict = {}
    for key, value in dictionary.iteritems():
        if not isinstance(value, (list, tuple)):
            value = [value]
        for val in value:
            reverse_dict[val] = reverse_dict.get(val, [])
            reverse_dict[val].append(key)
    for key, value in reverse_dict.iteritems():
        if len(value) == 1:
            reverse_dict[key] = value[0]
    return reverse_dict

Answer 15

Since dictionary keys must be unique, unlike values, we have to collect the keys that share a value into a list stored under that value in the new dictionary.

def r_maping(dictionary):
    Map = {}
    for z, x in dictionary.iteritems():  # iterate through the keys and values
        # setdefault returns the list already stored under x, or stores and
        # returns a fresh empty list if x is not yet a key. A new list is
        # created per key so that keys do not share one list object.
        Map.setdefault(x, []).append(z)
    return Map

Answer 16

Fast functional solution for non-bijective maps (values not unique):

from itertools import imap, groupby

def fst(s):
    return s[0]

def snd(s):
    return s[1]

def inverseDict(d):
    """
    input d: a -> b
    output : b -> set(a)
    """
    return {
        v : set(imap(fst, kv_iter))
        for (v, kv_iter) in groupby(
            sorted(d.iteritems(),
                   key=snd),
            key=snd
        )
    }

In theory this should be faster than adding to the set (or appending to the list) one by one like in the imperative solution.

Unfortunately, the values have to be sortable, because groupby only groups adjacent items and therefore needs sorted input.
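
The code above targets Python 2 (itertools.imap and iteritems() no longer exist in Python 3). A minimal Python 3 sketch of the same sort-then-group idea might look like this; the name inverse_dict and the use of operator.itemgetter in place of the fst/snd helpers are assumptions for illustration.

```python
from itertools import groupby
from operator import itemgetter

def inverse_dict(d):
    """Map each value of d to the set of keys that carry it."""
    by_value = itemgetter(1)
    return {
        value: {key for key, _ in group}
        # groupby only merges adjacent items, hence the sort on the same key.
        for value, group in groupby(sorted(d.items(), key=by_value), key=by_value)
    }

print(inverse_dict({'a': 1, 'b': 2, 'c': 1}))
```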


Answer 17

Try this for Python 2.7/3.x:

inv_map = {}
for i in my_map:
    inv_map[my_map[i]] = i
print(inv_map)

Answer 18

I would do it this way in Python 2 (dict comprehensions require 2.7 or later):

inv_map = {my_map[x] : x for x in my_map}

Answer 19

def invertDictionary(d):
    myDict = {}
    for i in d:
        value = d.get(i)
        myDict.setdefault(value, []).append(i)
    return myDict

print(invertDictionary({'a': 1, 'b': 2, 'c': 3, 'd': 1}))

This will produce the output: {1: ['a', 'd'], 2: ['b'], 3: ['c']}


Answer 20

def reverse_dictionary(input_dict):
    out = {}
    for v in input_dict.values():
        for value in v:
            if value.lower() not in out:
                out[value.lower()] = []

    for i in input_dict:
        for j in out:
            if j in map(lambda x: x.lower(), input_dict[i]):
                out[j].append(i.lower())
                out[j].sort()
    return out

This code behaves as follows:

r = reverse_dictionary({'Accurate': ['exact', 'precise'], 'exact': ['precise'], 'astute': ['Smart', 'clever'], 'smart': ['clever', 'bright', 'talented']})

print(r)

{'precise': ['accurate', 'exact'], 'clever': ['astute', 'smart'], 'talented': ['smart'], 'bright': ['smart'], 'exact': ['accurate'], 'smart': ['astute']}

Answer 21

Nothing completely different, just a slightly rewritten recipe from the Cookbook. It is further optimized by keeping a reference to the setdefault method instead of looking it up through the instance each time:

def inverse(mapping):
    '''
    A function to invert a mapping, collecting keys with the same value
    in a list. Careful to retain the original type and to be fast.
    >> d = dict(a=1, b=2, c=1, d=3, e=2, f=1, g=5, h=2)
    >> inverse(d)
    {1: ['f', 'c', 'a'], 2: ['h', 'b', 'e'], 3: ['d'], 5: ['g']}
    '''
    res = {}
    setdef = res.setdefault
    for key, value in mapping.items():
        setdef(value, []).append(key)
    return res if mapping.__class__==dict else mapping.__class__(res)

Designed to be run under CPython 3.x; for 2.x, replace mapping.items() with mapping.iteritems().

On my machine it runs a bit faster than the other examples here.


Answer 22

I wrote this using a for loop and the .get() method, and I renamed the dictionary from 'map' to 'map1' because map is a built-in function.

def dict_invert(map1):
    inv_map = {} # new dictionary
    for key in map1.keys():
        inv_map[map1.get(key)] = key
    return inv_map

Answer 23

If the values aren't unique and may themselves be lists (one level deep):

invDict = {}
for k, v in myDict.items():
    if isinstance(v, list):
        for item in v:
            invDict[item] = invDict.get(item, [])
            invDict[item].append(k)
    else:
        invDict[v] = invDict.get(v, [])
        invDict[v].append(k)

And with recursion, if you need to dig deeper than just one level:

def digList(lst):
    temp = []
    for item in lst:
        if type(item) is list:
            temp.extend(digList(item))  # flatten nested lists into temp
        else:
            temp.append(item)
    return set(temp)

invDict = {}
for k, v in myDict.items():
    if type(v) is list:
        items = digList(v)
        for item in items:
            invDict[item] = invDict.get(item, [])
            invDict[item].append(k)
    else:
        invDict[v] = invDict.get(v, [])
        invDict[v].append(k)

Convert a list of dictionaries to a pandas DataFrame

Question: Convert a list of dictionaries to a pandas DataFrame

I have a list of dictionaries like this:

[{'points': 50, 'time': '5:00', 'year': 2010}, 
{'points': 25, 'time': '6:00', 'month': "february"}, 
{'points':90, 'time': '9:00', 'month': 'january'}, 
{'points_h1':20, 'month': 'june'}]

And I want to turn this into a pandas DataFrame like this:

      month  points  points_h1  time  year
0       NaN      50        NaN  5:00  2010
1  february      25        NaN  6:00   NaN
2   january      90        NaN  9:00   NaN
3      june     NaN         20   NaN   NaN

Note: Order of the columns does not matter.

How can I turn the list of dictionaries into a pandas DataFrame as shown above?


Answer 0

Supposing d is your list of dicts, simply:

pd.DataFrame(d)

Answer 1

How do I convert a list of dictionaries to a pandas DataFrame?

The other answers are correct, but not much has been explained in terms of advantages and limitations of these methods. The aim of this post will be to show examples of these methods under different situations, discuss when to use (and when not to use), and suggest alternatives.


DataFrame(), DataFrame.from_records(), and .from_dict()

Depending on the structure and format of your data, there are situations where either all three methods work, or some work better than others, or some don’t work at all.

Consider a very contrived example.

import numpy as np
import pandas as pd

np.random.seed(0)
data = pd.DataFrame(
    np.random.choice(10, (3, 4)), columns=list('ABCD')).to_dict('records')

print(data)
[{'A': 5, 'B': 0, 'C': 3, 'D': 3},
 {'A': 7, 'B': 9, 'C': 3, 'D': 5},
 {'A': 2, 'B': 4, 'C': 7, 'D': 6}]

This list consists of "records" with every key present. This is the simplest case you could encounter.

# The following methods all produce the same output.
pd.DataFrame(data)
pd.DataFrame.from_dict(data)
pd.DataFrame.from_records(data)

   A  B  C  D
0  5  0  3  3
1  7  9  3  5
2  2  4  7  6

A Word on Dictionary Orientations: orient='index'/'columns'

Before continuing, it is important to distinguish between the different types of dictionary orientations and how pandas supports them. There are two primary types: "columns" and "index".

orient='columns'
Dictionaries with the “columns” orientation will have their keys correspond to columns in the equivalent DataFrame.

For example, data above is in the “columns” orient.

data_c = [
 {'A': 5, 'B': 0, 'C': 3, 'D': 3},
 {'A': 7, 'B': 9, 'C': 3, 'D': 5},
 {'A': 2, 'B': 4, 'C': 7, 'D': 6}]

pd.DataFrame.from_dict(data_c, orient='columns')

   A  B  C  D
0  5  0  3  3
1  7  9  3  5
2  2  4  7  6

Note: If you are using pd.DataFrame.from_records, the orientation is assumed to be “columns” (you cannot specify otherwise), and the dictionaries will be loaded accordingly.

orient='index'
With this orient, keys are assumed to correspond to index values. This kind of data is best suited for pd.DataFrame.from_dict.

data_i ={
 0: {'A': 5, 'B': 0, 'C': 3, 'D': 3},
 1: {'A': 7, 'B': 9, 'C': 3, 'D': 5},
 2: {'A': 2, 'B': 4, 'C': 7, 'D': 6}}

pd.DataFrame.from_dict(data_i, orient='index')

   A  B  C  D
0  5  0  3  3
1  7  9  3  5
2  2  4  7  6

This case is not considered in the OP, but is still useful to know.

Setting Custom Index

If you need a custom index on the resultant DataFrame, you can set it using the index=... argument.

pd.DataFrame(data, index=['a', 'b', 'c'])
# pd.DataFrame.from_records(data, index=['a', 'b', 'c'])

   A  B  C  D
a  5  0  3  3
b  7  9  3  5
c  2  4  7  6

This is not supported by pd.DataFrame.from_dict.

Dealing with Missing Keys/Columns

All methods work out-of-the-box when handling dictionaries with missing keys/column values. For example,

data2 = [
     {'A': 5, 'C': 3, 'D': 3},
     {'A': 7, 'B': 9, 'F': 5},
     {'B': 4, 'C': 7, 'E': 6}]

# The methods below all produce the same output.
pd.DataFrame(data2)
pd.DataFrame.from_dict(data2)
pd.DataFrame.from_records(data2)

     A    B    C    D    E    F
0  5.0  NaN  3.0  3.0  NaN  NaN
1  7.0  9.0  NaN  NaN  NaN  5.0
2  NaN  4.0  7.0  NaN  6.0  NaN

Reading Subset of Columns

“What if I don’t want to read in every single column”? You can easily specify this using the columns=... parameter.

For example, from the data2 list above, if you wanted to read only columns 'A', 'D', and 'F', you can do so by passing a list:

pd.DataFrame(data2, columns=['A', 'D', 'F'])
# pd.DataFrame.from_records(data2, columns=['A', 'D', 'F'])

     A    D    F
0  5.0  3.0  NaN
1  7.0  NaN  5.0
2  NaN  NaN  NaN

This is not supported by pd.DataFrame.from_dict with the default orient “columns”.

pd.DataFrame.from_dict(data2, orient='columns', columns=['A', 'B'])

ValueError: cannot use columns parameter with orient='columns'

Reading Subset of Rows

Not supported by any of these methods directly. You will have to iterate over your data and perform a reverse delete in-place as you iterate. For example, to extract only the 0th and 2nd rows from data2 above, you can use:

rows_to_select = {0, 2}
for i in reversed(range(len(data2))):
    if i not in rows_to_select:
        del data2[i]

pd.DataFrame(data2)
# pd.DataFrame.from_dict(data2)
# pd.DataFrame.from_records(data2)

     A    B  C    D    E
0  5.0  NaN  3  3.0  NaN
1  NaN  4.0  7  NaN  6.0
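
The in-place deletion above mutates data2, which then affects every later use of it. If the original list should stay intact, a non-destructive alternative is to build a new list of the wanted rows. This is only a sketch; the names records and selected are illustrative assumptions.

```python
records = [
    {'A': 5, 'C': 3, 'D': 3},
    {'A': 7, 'B': 9, 'F': 5},
    {'B': 4, 'C': 7, 'E': 6},
]
rows_to_select = {0, 2}

# Build a new list instead of deleting from the original in place.
selected = [row for i, row in enumerate(records) if i in rows_to_select]
print(selected)
print(len(records))  # 3, the original list is untouched
```

The resulting selected list can then be passed to pd.DataFrame, pd.DataFrame.from_dict, or pd.DataFrame.from_records as before.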

The Panacea: json_normalize for Nested Data

A strong, robust alternative to the methods outlined above is the json_normalize function which works with lists of dictionaries (records), and in addition can also handle nested dictionaries.

pd.json_normalize(data)  # the pd.io.json.json_normalize spelling is deprecated since pandas 1.0

   A  B  C  D
0  5  0  3  3
1  7  9  3  5
2  2  4  7  6

pd.json_normalize(data2)

     A    B  C    D    E
0  5.0  NaN  3  3.0  NaN
1  NaN  4.0  7  NaN  6.0

Again, keep in mind that the data passed to json_normalize needs to be in the list-of-dictionaries (records) format.

As mentioned, json_normalize can also handle nested dictionaries. Here’s an example taken from the documentation.

data_nested = [
  {'counties': [{'name': 'Dade', 'population': 12345},
                {'name': 'Broward', 'population': 40000},
                {'name': 'Palm Beach', 'population': 60000}],
   'info': {'governor': 'Rick Scott'},
   'shortname': 'FL',
   'state': 'Florida'},
  {'counties': [{'name': 'Summit', 'population': 1234},
                {'name': 'Cuyahoga', 'population': 1337}],
   'info': {'governor': 'John Kasich'},
   'shortname': 'OH',
   'state': 'Ohio'}
]

pd.json_normalize(data_nested,
                  record_path='counties',
                  meta=['state', 'shortname', ['info', 'governor']])

         name  population    state shortname info.governor
0        Dade       12345  Florida        FL    Rick Scott
1     Broward       40000  Florida        FL    Rick Scott
2  Palm Beach       60000  Florida        FL    Rick Scott
3      Summit        1234     Ohio        OH   John Kasich
4    Cuyahoga        1337     Ohio        OH   John Kasich

For more information on the meta and record_path arguments, check out the documentation.


Summarising

Here’s a table of all the methods discussed above, along with supported features/functionality.

[Image: summary table of the methods and their supported features]

* Use orient='columns' and then transpose to get the same effect as orient='index'.
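
The transpose trick in the footnote can be sketched as follows, reusing the data_i dictionary from the orient='index' section. The variable names via_index and via_transpose are assumptions for illustration.

```python
import pandas as pd

data_i = {
    0: {'A': 5, 'B': 0, 'C': 3, 'D': 3},
    1: {'A': 7, 'B': 9, 'C': 3, 'D': 5},
    2: {'A': 2, 'B': 4, 'C': 7, 'D': 6},
}

via_index = pd.DataFrame.from_dict(data_i, orient='index')
# With the default "columns" orientation the outer keys become columns;
# transposing flips them back into the index.
via_transpose = pd.DataFrame(data_i).T

print(via_transpose)
```

Note that transposing a frame with mixed column types can change dtypes, so this equivalence is easiest to rely on when the data is homogeneous, as it is here.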


Answer 2

In pandas 0.16.2, I had to use pd.DataFrame.from_records(d) to get this to work.


Answer 3

You can also use pd.DataFrame.from_dict(d):

In [8]: d = [{'points': 50, 'time': '5:00', 'year': 2010}, 
   ...: {'points': 25, 'time': '6:00', 'month': "february"}, 
   ...: {'points':90, 'time': '9:00', 'month': 'january'}, 
   ...: {'points_h1':20, 'month': 'june'}]

In [12]: pd.DataFrame.from_dict(d)
Out[12]: 
      month  points  points_h1  time    year
0       NaN    50.0        NaN  5:00  2010.0
1  february    25.0        NaN  6:00     NaN
2   january    90.0        NaN  9:00     NaN
3      june     NaN       20.0   NaN     NaN

Answer 4

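I know a few people will come across this and find that nothing here helps. The easiest way I have found is this:

dict_count = len(dict_list)
df = pd.DataFrame(dict_list[0], index=[0])
# range(1, dict_count) covers every remaining dict; the original
# range(1, dict_count - 1) silently skipped the last one.
for i in range(1, dict_count):
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it.
    df = pd.concat([df, pd.DataFrame(dict_list[i], index=[0])], ignore_index=True)

Hope this helps someone!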


Answer 5

from pandas import DataFrame

# Named dict_list rather than list so the built-in list type is not shadowed.
dict_list = [{'points': 50, 'time': '5:00', 'year': 2010},
             {'points': 25, 'time': '6:00', 'month': "february"},
             {'points': 90, 'time': '9:00', 'month': 'january'},
             {'points_h1': 20, 'month': 'june'}]

and a simple call:

# Named df rather than pd so the conventional pandas alias is not shadowed.
df = DataFrame.from_dict(dict_list, orient='columns', dtype=None)

print(df)

Answer 6

Python 3: Most of the solutions listed previously work. However, there are cases where the DataFrame's row index is not needed and each row (record) has to be written out individually.

The following method is useful in that case.

import csv

# Raw string, so the backslashes in the Windows path are not read as escapes.
myfile = r'C:\Users\John\Desktop\export_dataframe.csv'

records_to_save = data2  # the list of dicts, named as in the thread

# colnames is the list of keys of the first record. Each row is written in
# that key order, and the string "None" stands in for any missing value.
colnames = list(records_to_save[0].keys())

with open(myfile, 'w', newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(colnames)
    for d in records_to_save:
        writer.writerow([d.get(r, "None") for r in colnames])
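
As a hedged alternative sketch, the standard library's csv.DictWriter can do the key-to-column mapping itself: restval supplies the placeholder for missing keys, and extrasaction='ignore' skips keys that are not in the header. The sample records and the in-memory buffer below are illustrative assumptions, not data from the thread.

```python
import csv
import io

# Hypothetical sample records standing in for data2 from the thread.
records_to_save = [
    {'points': 50, 'time': '5:00', 'year': 2010},
    {'points': 25, 'time': '6:00', 'month': 'february'},
]

# Collect every key seen in any record, preserving first-seen order.
colnames = []
for record in records_to_save:
    for key in record:
        if key not in colnames:
            colnames.append(key)

# An in-memory buffer keeps the sketch self-contained; a real file opened
# with open(path, 'w', newline='', encoding='utf-8') works the same way.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=colnames, restval='None',
                        extrasaction='ignore')
writer.writeheader()
writer.writerows(records_to_save)

csv_text = buffer.getvalue()
print(csv_text)
```

Unlike taking the keys of the first record only, this collects the header from all records, so a key that first appears in a later record still gets a column.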

Answer 7

To convert a list of dictionaries to a pandas DataFrame, you can use “append”:

We have a dictionary called dic, and dic has 30 list items (list1, list2, …, list30).

  1. Step 1: define a variable to hold your result (e.g. total_df)
  2. Step 2: initialize total_df with list1
  3. Step 3: use a for loop to append all the lists to total_df

# Assumes each dic['listN'] is a pandas DataFrame. DataFrame.append needs
# pandas < 2.0; on newer versions use pd.concat([total_df, ...]) instead.
total_df = dic['list1']
for num in range(2, 31):
    total_df = total_df.append(dic['list' + str(num)])

Why dict.get(key) instead of dict[key]?

Question: Why dict.get(key) instead of dict[key]?

Today, I came across the dict method get which, given a key in the dictionary, returns the associated value.

For what purpose is this function useful? If I wanted to find a value associated with a key in a dictionary, I can just do dict[key], and it returns the same thing:

dictionary = {"Name": "Harry", "Age": 17}
dictionary["Name"]
dictionary.get("Name")

Answer 0

It allows you to provide a default value if the key is missing:

dictionary.get("bogus", default_value)

returns default_value (whatever you choose it to be), whereas

dictionary["bogus"]

would raise a KeyError.

If omitted, default_value is None, such that

dictionary.get("bogus")  # <-- No default specified -- defaults to None

returns None just like

dictionary.get("bogus", None)

would.
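
The three cases above can be put side by side in a small runnable sketch. The fallback string is a made-up value chosen for illustration.

```python
dictionary = {"Name": "Harry", "Age": 17}
default_value = "unknown"  # hypothetical fallback chosen for this sketch

# Key present: both forms return the stored value.
present_matches = dictionary["Name"] == dictionary.get("Name") == "Harry"

# Key missing: get() falls back to None, or to the default when one is given.
missing_without_default = dictionary.get("bogus")
missing_with_default = dictionary.get("bogus", default_value)

# Key missing: subscripting raises KeyError instead.
try:
    dictionary["bogus"]
    raised = False
except KeyError:
    raised = True

print(present_matches, missing_without_default, missing_with_default, raised)
```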


Answer 1

What is the dict.get() method?

As already mentioned, the get method takes an additional parameter giving the value to return when the key is missing. From the documentation:

get(key[, default])

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.

An example can be

>>> d = {1:2,2:3}
>>> d[1]
2
>>> d.get(1)
2
>>> d.get(3)
>>> repr(d.get(3))
'None'
>>> d.get(3,1)
1

Are there speed improvements anywhere?

As mentioned here,

It seems that all three approaches now exhibit similar performance (within about 10% of each other), more or less independent of the properties of the list of words.

Earlier, get was considerably slower; now its speed is almost comparable, with the added advantage of returning a default value. To settle the question, we can time both on a dictionary of 100 keys (note that the test only looks up keys that exist):

def getway(d):
    for i in range(100):
        s = d.get(i)

def lookup(d):
    for i in range(100):
        s = d[i]

Now timing these two functions using timeit

>>> import timeit
>>> print(timeit.timeit("getway({i:i for i in range(100)})","from __main__ import getway"))
20.2124660015
>>> print(timeit.timeit("lookup({i:i for i in range(100)})","from __main__ import lookup"))
16.16223979

As we can see, the subscript lookup is faster than get, because it needs no attribute lookup and no function call. This can be seen with dis:

>>> import dis
>>> def lookup(d,val):
...     return d[val]
... 
>>> def getway(d,val):
...     return d.get(val)
... 
>>> dis.dis(getway)
  2           0 LOAD_FAST                0 (d)
              3 LOAD_ATTR                0 (get)
              6 LOAD_FAST                1 (val)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE        
>>> dis.dis(lookup)
  2           0 LOAD_FAST                0 (d)
              3 LOAD_FAST                1 (val)
              6 BINARY_SUBSCR       
              7 RETURN_VALUE  

Where will it be useful?

It is useful whenever you want a default value when looking up a key in a dictionary. This reduces

if key in dic:
    val = dic[key]
else:
    val = def_val

to a single line: val = dic.get(key, def_val)

Where will it be NOT useful?

Whenever you want a KeyError to be raised to signal that a particular key is not available. Returning a default value also carries the risk that the default is indistinguishable from a value that is actually stored in the dictionary!

Is it possible to get get-like behaviour from dict['key']?

Yes! We need to implement __missing__ in a dict subclass.

A sample program can be

class MyDict(dict):
    def __missing__(self, key):
        return None

A small demonstration can be

>>> my_d = MyDict({1:2,2:3})
>>> my_d[1]
2
>>> my_d[3]
>>> repr(my_d[3])
'None'
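
One caveat worth a quick sketch: __missing__ is only consulted by subscripting (d[key]), so .get() on such a subclass still returns its own default. The class name and fallback string here are illustrative, not part of the answer above.

```python
class FallbackDict(dict):
    """Hypothetical dict subclass whose d[key] falls back to a fixed string."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        return "fallback"


fd = FallbackDict({1: 2, 2: 3})

via_subscript = fd[3]     # goes through __missing__
via_get = fd.get(3)       # get() does not call __missing__
still_absent = 3 in fd    # __missing__ does not store the key

print(via_subscript, via_get, still_absent)
```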

Answer 2

get takes a second optional value. If the specified key does not exist in your dictionary, then this value will be returned.

dictionary = {"Name": "Harry", "Age": 17}
dictionary.get('Year', 'No available data')
>> 'No available data'

If you do not give the second parameter, None will be returned.

If you use indexing as in dictionary['Year'], nonexistent keys will raise KeyError.


Answer 3

Here is a practical example from scraping web data with Python. Much of the time you will get records in which some keys are missing; in those cases dictionary['key'] raises an error, whereas dictionary.get('key', 'return_otherwise') has no problem.

Similarly, I would use ''.join(list) as opposed to list[0] if you try to capture a single value from a list, since it does not fail on an empty list.

Hope it helps.

[Edit] Here is a practical example:

Say you are calling an API that returns a JSON file you need to parse. The first JSON looks like the following:

{"bids":{"id":16210506,"submitdate":"2011-10-16 15:53:25","submitdate_f":"10\/16\/2011 at 21:53 CEST","submitdate_f2":"p\u0159ed 2 lety","submitdate_ts":1318794805,"users_id":"2674360","project_id":"1250499"}}

The second JSON looks like this:

{"bids":{"id":16210506,"submitdate":"2011-10-16 15:53:25","submitdate_f":"10\/16\/2011 at 21:53 CEST","submitdate_f2":"p\u0159ed 2 lety","users_id":"2674360","project_id":"1250499"}}

Note that the second JSON is missing the "submitdate_ts" key, which is pretty normal in any data structure.

So when you access the value of that key in a loop, can you do it like this?

for item in API_call:
    submitdate_ts = item["bids"]["submitdate_ts"]

You could, but it will give you a traceback for the second JSON, because the key simply doesn't exist.

A more appropriate way of coding this could be the following:

for item in API_call:
    submitdate_ts = item.get("bids", {'x': None}).get("submitdate_ts")

{'x': None} is there so that the second-level .get() does not fail when "bids" is missing. Of course, you can build more fault tolerance into the code if you are doing scraping, for example by first checking with an if condition.
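
The chained lookup can be sketched end to end as follows. The records are trimmed copies of the JSON above plus one made-up empty response, and API_call is modelled as a plain list, which is an assumption about what the real API returns. An empty dict works as the first-level default, since only its .get() method is needed.

```python
import json

# Trimmed copies of the two JSON documents above; the second lacks submitdate_ts.
raw_responses = [
    '{"bids": {"id": 16210506, "submitdate_ts": 1318794805}}',
    '{"bids": {"id": 16210506}}',
    '{}',  # hypothetical response with no "bids" key at all
]
API_call = [json.loads(text) for text in raw_responses]

timestamps = []
for item in API_call:
    # {} is returned when "bids" is absent, so the second .get() is always safe.
    submitdate_ts = item.get("bids", {}).get("submitdate_ts")
    timestamps.append(submitdate_ts)

print(timestamps)
```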


Answer 4

The purpose is that you can give a default value if the key is not found, which is very useful

dictionary.get("Name",'harry')

Answer 5

For what purpose is this function useful?

One particular usage is counting with a dictionary. Let’s assume you want to count the number of occurrences of each element in a given list. The common way to do so is to make a dictionary where keys are elements and values are the number of occurrences.

fruits = ['apple', 'banana', 'peach', 'apple', 'pear']
d = {}
for fruit in fruits:
    if fruit not in d:
        d[fruit] = 0
    d[fruit] += 1

Using the .get() method, you can make this code more compact and clear:

for fruit in fruits:
    d[fruit] = d.get(fruit, 0) + 1
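
For completeness, here is a runnable sketch of the .get() counting loop next to collections.Counter from the standard library. Counter is a different technique from the one this answer describes, but it produces the same counts.

```python
from collections import Counter

fruits = ['apple', 'banana', 'peach', 'apple', 'pear']

# Counting with dict.get: a fruit not seen yet starts from 0.
counts = {}
for fruit in fruits:
    counts[fruit] = counts.get(fruit, 0) + 1

# Counter performs the same tally in one call.
counter_counts = Counter(fruits)

print(counts)
```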

Answer 6

A gotcha to be aware of when using .get():

If the dictionary contains the key used in the call to .get() and its value is None, the .get() method will return None even if a default value is supplied.

For example, the following returns None, not 'alt_value' as may be expected:

d = {'key': None}
d.get('key', 'alt_value')

.get()'s second argument is only returned if the supplied key is NOT in the dictionary, not when the value stored under that key is None.
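
If None should count as missing too, one possible workaround is a small wrapper. The helper name get_or_default is hypothetical and not part of the dict API.

```python
def get_or_default(mapping, key, default):
    """Hypothetical helper: fall back when the key is absent OR maps to None."""
    value = mapping.get(key)
    return default if value is None else value


d = {'key': None, 'zero': 0}

plain = d.get('key', 'alt_value')                # key exists, so None comes back
patched = get_or_default(d, 'key', 'alt_value')  # None is treated as missing
kept = get_or_default(d, 'zero', 'alt_value')    # falsy but not None is kept

print(plain, patched, kept)
```

Testing with "is None" rather than "or" keeps legitimate falsy values such as 0 or an empty string.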


Answer 7

Why dict.get(key) instead of dict[key]?

0. Summary

Compared to dict[key], dict.get provides a fallback value when looking up a key.

1. Definition

get(key[, default]) (from "4. Built-in Types", Python 3.6.4rc1 documentation)

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.

d = {"Name": "Harry", "Age": 17}
In [4]: d['gender']
KeyError: 'gender'
In [5]: d.get('gender', 'Not specified, please add it')
Out[5]: 'Not specified, please add it'

2. Problem it solves.

Without a default value, you have to write cumbersome code to handle such an exception.

def get_harry_info(key):
    try:
        return "{}".format(d[key])
    except KeyError:
        return 'Not specified, please add it'
In [9]: get_harry_info('Name')
Out[9]: 'Harry'
In [10]: get_harry_info('Gender')
Out[10]: 'Not specified, please add it'

As a convenient solution, dict.get accepts an optional default value, avoiding the unwieldy code above.

3. Conclusion

dict.get has an optional default value that takes the place of exception handling when the key is absent from the dictionary.


Answer 8

One difference, which can be an advantage, is that looking up a key that doesn't exist with get returns None, whereas the bracket notation raises an error:

print(dictionary.get("address")) # None
print(dictionary["address"]) # throws KeyError: 'address'

Another nice thing about the get method is that it takes an optional second argument for a default value. For example, if we try to get a student's score but the student has no score key, we can get 0 instead.

So instead of doing this (or something similar):

score = None
try:
    score = dictionary["score"]
except KeyError:
    score = 0

We can do this:

score = dictionary.get("score", 0)
# score = 0

Answer 9

Which spelling of get to use depends on the use case.

Example 1

In [14]: user_dict = {'type': False}

In [15]: user_dict.get('type', '')

Out[15]: False

In [16]: user_dict.get('type') or ''

Out[16]: ''

Example 2

In [17]: user_dict = {'type': "lead"}

In [18]: user_dict.get('type') or ''

Out[18]: 'lead'

In [19]: user_dict.get('type', '')

Out[19]: 'lead'
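
The two examples differ only when the stored value is falsy. A small sketch that tabulates both spellings over a few hypothetical values (the 'n/a' default and the case labels are made up for illustration):

```python
# Hypothetical stored values: some falsy, one truthy, plus a missing key.
cases = {
    'false': {'type': False},
    'empty': {'type': ''},
    'lead': {'type': 'lead'},
    'missing': {},
}

results = {}
for label, user_dict in cases.items():
    with_default = user_dict.get('type', 'n/a')  # default only if key is absent
    with_or = user_dict.get('type') or 'n/a'     # default for any falsy value
    results[label] = (with_default, with_or)

for label, pair in results.items():
    print(label, pair)
```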

How do I sort a list of dictionaries by a value of the dictionary?

Question: How do I sort a list of dictionaries by a value of the dictionary?

I have a list of dictionaries and want it sorted by the value of a specific key.

Consider the list below:

[{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

When sorted by name, it should become

[{'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}]

Answer 0

It may look cleaner using a key instead of a cmp:

newlist = sorted(list_to_be_sorted, key=lambda k: k['name']) 

or as J.F.Sebastian and others suggested,

from operator import itemgetter
newlist = sorted(list_to_be_sorted, key=itemgetter('name')) 

For completeness (as pointed out in comments by fitzgeraldsteele), add reverse=True to sort descending

newlist = sorted(list_to_be_sorted, key=itemgetter('name'), reverse=True)

Answer 1

import operator

To sort the list of dictionaries by key='name':

list_of_dicts.sort(key=operator.itemgetter('name'))

To sort the list of dictionaries by key='age':

list_of_dicts.sort(key=operator.itemgetter('age'))

Answer 2

my_list = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

# Python 2 only: Python 3 removed cmp() and the comparison-function argument of sort().
my_list.sort(lambda x,y : cmp(x['name'], y['name']))

my_list will now be what you want.

(3 years later) Edited to add:

The new key argument is more efficient and neater. A better answer now looks like:

my_list = sorted(my_list, key=lambda k: k['name'])

…the lambda is, IMO, easier to understand than operator.itemgetter, but YMMV.


Answer 3

If you want to sort the list by multiple keys you can do the following:

my_list = [{'name':'Homer', 'age':39}, {'name':'Milhouse', 'age':10}, {'name':'Bart', 'age':10} ]
sortedlist = sorted(my_list , key=lambda elem: "%02d %s" % (elem['age'], elem['name']))

It is rather hackish, since it relies on converting the values into a single string representation for comparison. It only works as expected for non-negative numbers that fit within the zero-padded width (two digits here), because the formatted strings are compared character by character; negative or wider numbers can sort incorrectly.
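
A less fragile sketch of multi-key sorting uses a tuple as the sort key, so numbers are compared as numbers and names as strings. This is a swapped-in technique, not the string-formatting approach of this answer; the extra entries with a negative and a three-digit age are made-up test data.

```python
from operator import itemgetter

my_list = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Milhouse', 'age': 10},
    {'name': 'Bart', 'age': 10},
    {'name': 'Abe', 'age': 100},   # made-up: wider than two digits
    {'name': 'Test', 'age': -5},   # made-up: negative value
]

# Tuples compare element by element: age first, then name as tie-breaker.
by_age_then_name = sorted(my_list, key=lambda elem: (elem['age'], elem['name']))

# itemgetter with several keys builds the same tuple.
same_result = sorted(my_list, key=itemgetter('age', 'name'))

sorted_names = [d['name'] for d in by_age_then_name]
print(sorted_names)
```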


Answer 4

import operator
a_list_of_dicts.sort(key=operator.itemgetter('name'))

'key' takes a function that computes the value to sort by, and itemgetter('name') builds a function that returns each item's 'name' entry.


Answer 5

a = [{'name':'Homer', 'age':39}, ...]

# This changes the list a
a.sort(key=lambda k : k['name'])

# This returns a new list (a is not modified)
sorted(a, key=lambda k : k['name']) 

Answer 6

I guess you meant:

l = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

In Python 2, this would be sorted like this (Python 3 removed the cmp parameter of sorted()):

sorted(l, cmp=lambda x, y: cmp(x['name'], y['name']))

Answer 7

You could use a custom comparison function, or you could pass in a function that calculates a custom sort key. That’s usually more efficient as the key is only calculated once per item, while the comparison function would be called many more times.

You could do it this way:

def mykey(adict): return adict['name']
x = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age':10}]
sorted(x, key=mykey)

But the standard library contains a generic routine for getting items of arbitrary objects: itemgetter. So try this instead:

from operator import itemgetter
x = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age':10}]
sorted(x, key=itemgetter('name'))

Answer 8

Using Schwartzian transform from Perl,

py = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

do

sort_on = "name"
decorated = [(dict_[sort_on], dict_) for dict_ in py]
decorated.sort()
result = [dict_ for (key, dict_) in decorated]

gives

>>> result
[{'age': 10, 'name': 'Bart'}, {'age': 39, 'name': 'Homer'}]

More on Perl Schwartzian transform

In computer science, the Schwartzian transform is a Perl programming idiom used to improve the efficiency of sorting a list of items. This idiom is appropriate for comparison-based sorting when the ordering is actually based on the ordering of a certain property (the key) of the elements, where computing that property is an intensive operation that should be performed a minimal number of times. The Schwartzian Transform is notable in that it does not use named temporary arrays.
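
One caveat when running the decorate-sort-undecorate code above on Python 3: if two dictionaries share the same sort value, the tuple comparison falls through to comparing the dicts themselves, which raises TypeError. A common workaround, sketched here with made-up data, is to add each item's index as a tie-breaker.

```python
py = [{'name': 'Homer', 'age': 39},
      {'name': 'Bart', 'age': 10},
      {'name': 'Bart', 'age': 12}]   # made-up duplicate name to force a tie

sort_on = "name"

# Decorate with (sort value, original index, item). The index is unique,
# so comparison never reaches the dict, and ties keep their input order.
decorated = [(dict_[sort_on], index, dict_) for index, dict_ in enumerate(py)]
decorated.sort()
result = [dict_ for (_, _, dict_) in decorated]

print(result)
```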


Answer 9

You have to implement your own comparison function that will compare the dictionaries by values of name keys. See Sorting Mini-HOW TO from PythonInfo Wiki


Answer 10

Sometimes we need to use lower(), for example:

lists = [{'name':'Homer', 'age':39},
  {'name':'Bart', 'age':10},
  {'name':'abby', 'age':9}]

lists = sorted(lists, key=lambda k: k['name'])
print(lists)
# [{'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}, {'name':'abby', 'age':9}]

lists = sorted(lists, key=lambda k: k['name'].lower())
print(lists)
# [ {'name':'abby', 'age':9}, {'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}]

Answer 11

Here is an alternative, general solution: it sorts the dicts by their sorted (key, value) pairs. The advantage is that there is no need to specify keys, and it still works if some keys are missing from some of the dictionaries.

def sort_key_func(item):
    """ helper function used to sort list of dicts

    :param item: dict
    :return: sorted list of tuples (k, v)
    """
    # The dict's (key, value) pairs in sorted order serve as its sort key.
    return sorted(item.items())

sorted(A, key=sort_key_func)

Answer 12

Using the pandas package is another method, though its runtime at large scale is much slower than the more traditional methods proposed by others:

import pandas as pd

listOfDicts = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]
df = pd.DataFrame(listOfDicts)
df = df.sort_values('name')
sorted_listOfDicts = df.T.to_dict().values()

Here are some benchmark values for a tiny list and a large (100k+) list of dicts:

setup_large = "listOfDicts = [];\
[listOfDicts.extend(({'name':'Homer', 'age':39}, {'name':'Bart', 'age':10})) for _ in range(50000)];\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listOfDicts);"

setup_small = "listOfDicts = [];\
listOfDicts.extend(({'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}));\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listOfDicts);"

method1 = "newlist = sorted(listOfDicts, key=lambda k: k['name'])"
method2 = "newlist = sorted(listOfDicts, key=itemgetter('name')) "
method3 = "df = df.sort_values('name');\
sorted_listOfDicts = df.T.to_dict().values()"

import timeit
t = timeit.Timer(method1, setup_small)
print('Small Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_small)
print('Small Method LC2: ' + str(t.timeit(100)))
t = timeit.Timer(method3, setup_small)
print('Small Method Pandas: ' + str(t.timeit(100)))

t = timeit.Timer(method1, setup_large)
print('Large Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_large)
print('Large Method LC2: ' + str(t.timeit(100)))
t = timeit.Timer(method3, setup_large)
print('Large Method Pandas: ' + str(t.timeit(1)))

#Small Method LC: 0.000163078308105
#Small Method LC2: 0.000134944915771
#Small Method Pandas: 0.0712950229645
#Large Method LC: 0.0321750640869
#Large Method LC2: 0.0206089019775
#Large Method Pandas: 5.81405615807

Answer 13

If you do not need the original list of dictionaries, you could modify it in-place with sort() method using a custom key function.

Key function:

def get_name(d):
    """ Return the value of a key in a dictionary. """

    return d["name"]

The list to be sorted:

data_one = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

Sorting it in-place:

data_one.sort(key=get_name)

If you need the original list, call the sorted() function passing it the list and the key function, then assign the returned sorted list to a new variable:

data_two = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
new_data = sorted(data_two, key=get_name)

Printing data_one and new_data.

>>> print(data_one)
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
>>> print(new_data)
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]

回答 14

假设我有一本D包含以下内容的字典。要进行排序,只需使用sort中的key参数来传递自定义函数,如下所示:

D = {'eggs': 3, 'ham': 1, 'spam': 2}
def get_count(tuple):
    return tuple[1]

sorted(D.items(), key = get_count, reverse=True)
# or
sorted(D.items(), key = lambda x: x[1], reverse=True)  # avoiding get_count function call


Let’s say I have a dictionary D with the elements below. To sort it, pass a custom function through the key argument of sorted, as shown here:

D = {'eggs': 3, 'ham': 1, 'spam': 2}
def get_count(pair):
    # pair is a (key, value) tuple; sort on the value
    return pair[1]

sorted(D.items(), key=get_count, reverse=True)
# or
sorted(D.items(), key=lambda x: x[1], reverse=True)  # same result without defining get_count



回答 15

我一直是lambda过滤器的忠实拥护者,但是如果您考虑时间复杂性,则不是最佳选择

第一选择

sorted_list = sorted(list_to_sort, key= lambda x: x['name'])
# returns list of values

第二选择

list_to_sort.sort(key=operator.itemgetter('name'))
#edits the list, does not return a new list

快速比较执行时间

# First option
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" "sorted_l = sorted(list_to_sort, key=lambda e: e['name'])"

1000000次循环,最好为3:每个循环0.736微秒

# Second option 
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" -s "import operator" "list_to_sort.sort(key=operator.itemgetter('name'))"

1000000次循环,最好为3:每个循环0.438微秒

I have been a big fan of lambda; however, it is not the best option if you care about execution time.

First option

sorted_list = sorted(list_to_sort, key=lambda x: x['name'])
# returns a new sorted list

Second option

import operator
list_to_sort.sort(key=operator.itemgetter('name'))
# sorts the list in place and returns None

A quick comparison of execution times:

# First option
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" "sorted_l = sorted(list_to_sort, key=lambda e: e['name'])"

1000000 loops, best of 3: 0.736 usec per loop

# Second option 
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" -s "import operator" "list_to_sort.sort(key=operator.itemgetter('name'))"

1000000 loops, best of 3: 0.438 usec per loop


回答 16

如果需要考虑性能,我会使用内置函数operator.itemgetter来代替lambda手工函数,而使用内置函数来代替。该itemgetter功能似乎比lambda根据我的测试快约20%。

https://wiki.python.org/moin/PythonSpeed

同样,内置函数比手工生成的等效函数运行得更快。例如,map(operator.add,v1,v2)比map(lambda x,y:x + y,v1,v2)快。

这是使用lambdavs 进行排序速度的比较itemgetter

import operator
import random
import string

# create a list of 100 dicts with random 8-letter names and random ages from 0 to 100.
l = [{'name': ''.join(random.choices(string.ascii_lowercase, k=8)), 'age': random.randint(0, 100)} for i in range(100)]

# Test the performance with a lambda function sorting on name
%timeit sorted(l, key=lambda x: x['name'])
13 µs ± 388 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Test the performance with itemgetter sorting on name
%timeit sorted(l, key=operator.itemgetter('name'))
10.7 µs ± 38.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Check that each technique produces same sort order
sorted(l, key=lambda x: x['name']) == sorted(l, key=operator.itemgetter('name'))
True

两种技术都以相同的顺序对列表进行排序(通过执行代码块中的final语句进行验证),但是一种方法要快一些。

If performance is a concern, I would use operator.itemgetter instead of lambda as built-in functions perform faster than hand-crafted functions. The itemgetter function seems to perform approximately 20% faster than lambda based on my testing.

From https://wiki.python.org/moin/PythonSpeed:

Likewise, the builtin functions run faster than hand-built equivalents. For example, map(operator.add, v1, v2) is faster than map(lambda x,y: x+y, v1, v2).

Here is a comparison of sorting speed using lambda vs itemgetter.

import operator
import random
import string

# create a list of 100 dicts with random 8-letter names and random ages from 0 to 100.
l = [{'name': ''.join(random.choices(string.ascii_lowercase, k=8)), 'age': random.randint(0, 100)} for i in range(100)]

# Test the performance with a lambda function sorting on name
%timeit sorted(l, key=lambda x: x['name'])
13 µs ± 388 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Test the performance with itemgetter sorting on name
%timeit sorted(l, key=operator.itemgetter('name'))
10.7 µs ± 38.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Check that each technique produces same sort order
sorted(l, key=lambda x: x['name']) == sorted(l, key=operator.itemgetter('name'))
True

Both techniques sort the list in the same order (verified by execution of the final statement in the code block) but one is a little faster.
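
The %timeit lines above only run inside IPython. As a rough sketch of the same comparison in plain Python, the standard timeit module can be used. The helper names by_lambda and by_itemgetter are made up for this illustration, and the absolute numbers will differ from machine to machine.

```python
import operator
import random
import string
import timeit

# 100 dicts with random 8-letter names and random ages from 0 to 100.
rows = [{'name': ''.join(random.choices(string.ascii_lowercase, k=8)),
         'age': random.randint(0, 100)} for _ in range(100)]

def by_lambda():
    return sorted(rows, key=lambda x: x['name'])

def by_itemgetter():
    return sorted(rows, key=operator.itemgetter('name'))

# Take the best of several repeats to reduce noise.
t_lambda = min(timeit.repeat(by_lambda, number=1000, repeat=5))
t_itemgetter = min(timeit.repeat(by_itemgetter, number=1000, repeat=5))
print(f"lambda:     {t_lambda:.4f} s per 1000 sorts")
print(f"itemgetter: {t_itemgetter:.4f} s per 1000 sorts")
```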


回答 17

您可以使用以下代码

sorted_dct = sorted(dct_name.items(), key = lambda x : x[1])

You may use the following code

sorted_dct = sorted(dct_name.items(), key = lambda x : x[1])
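
Note that the call above returns a list of (key, value) tuples rather than a dictionary. A minimal sketch with sample data (the contents of dct_name are assumed here), including an optional conversion back to a dict, which keeps that order on Python 3.7 and later:

```python
dct_name = {'eggs': 3, 'ham': 1, 'spam': 2}

# A list of (key, value) pairs, ordered by value.
pairs = sorted(dct_name.items(), key=lambda x: x[1])

# Rebuild a dict from the pairs; insertion order is preserved on Python 3.7+.
ordered = dict(pairs)
```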

确定对象的类型?

问题:确定对象的类型?

有没有一种简单的方法来确定变量是列表,字典还是其他?我回来的对象可能是任何一种类型,我需要能够分辨出两者之间的区别。

Is there a simple way to determine if a variable is a list, dictionary, or something else? I am getting an object back that may be either type and I need to be able to tell the difference.


回答 0

有两个内置函数可以帮助您识别对象的类型。您可以使用type() ,如果你需要一个对象的确切类型,并isinstance()检查对象的反对的东西类型。通常,您希望使用isistance()大多数时间,因为它非常健壮并且还支持类型继承。


要获取对象的实际类型,请使用内置type()函数。将对象作为唯一参数传递将返回该对象的类型对象:

>>> type([]) is list
True
>>> type({}) is dict
True
>>> type('') is str
True
>>> type(0) is int
True

当然,这也适用于自定义类型:

>>> class Test1 (object):
        pass
>>> class Test2 (Test1):
        pass
>>> a = Test1()
>>> b = Test2()
>>> type(a) is Test1
True
>>> type(b) is Test2
True

请注意,type()这只会返回对象的直接类型,而不能告诉您类型继承。

>>> type(b) is Test1
False

为此,您应该使用该isinstance功能。当然,这也适用于内置类型:

>>> isinstance(b, Test1)
True
>>> isinstance(b, Test2)
True
>>> isinstance(a, Test1)
True
>>> isinstance(a, Test2)
False
>>> isinstance([], list)
True
>>> isinstance({}, dict)
True

isinstance()通常是确保对象类型的首选方法,因为它还将接受派生类型。因此,除非您实际需要类型对象(无论出于何种原因),否则使用isinstance()优先于type()

第二个参数isinstance()还接受类型的元组,因此可以一次检查多个类型。isinstance如果对象属于以下任何类型,则将返回true:

>>> isinstance([], (tuple, list, set))
True

There are two built-in functions that help you identify the type of an object. You can use type() if you need the exact type of an object, and isinstance() to check an object’s type against something. Usually, you want to use isinstance() most of the time, since it is very robust and also supports type inheritance.


To get the actual type of an object, you use the built-in type() function. Passing an object as the only parameter will return the type object of that object:

>>> type([]) is list
True
>>> type({}) is dict
True
>>> type('') is str
True
>>> type(0) is int
True

This of course also works for custom types:

>>> class Test1 (object):
        pass
>>> class Test2 (Test1):
        pass
>>> a = Test1()
>>> b = Test2()
>>> type(a) is Test1
True
>>> type(b) is Test2
True

Note that type() will only return the immediate type of the object, but won’t be able to tell you about type inheritance.

>>> type(b) is Test1
False

To cover that, you should use the isinstance function. This of course also works for built-in types:

>>> isinstance(b, Test1)
True
>>> isinstance(b, Test2)
True
>>> isinstance(a, Test1)
True
>>> isinstance(a, Test2)
False
>>> isinstance([], list)
True
>>> isinstance({}, dict)
True

isinstance() is usually the preferred way to ensure the type of an object because it will also accept derived types. So unless you actually need the type object (for whatever reason), using isinstance() is preferred over type().

The second parameter of isinstance() also accepts a tuple of types, so it’s possible to check for multiple types at once. isinstance will then return true, if the object is of any of those types:

>>> isinstance([], (tuple, list, set))
True

回答 1

您可以使用type()

>>> a = []
>>> type(a)
<type 'list'>
>>> f = ()
>>> type(f)
<type 'tuple'>

You can do that using type():

>>> a = []
>>> type(a)
<type 'list'>
>>> f = ()
>>> type(f)
<type 'tuple'>

回答 2

使用tryexcept块可能更Pythonic 。这样一来,如果你有这叫声也像列表,或叫声也像字典类,它会循规蹈矩,无论什么的类型真的是。

为了明确起见,“告诉变量类型”之间的差异的首选方法是使用“ 鸭子类型”:只要变量响应的方法(和返回类型)是子例程所期望的,则将其视为期望的成为。例如,如果您有一个用getattr和重载方括号运算符的类setattr,但使用了一些有趣的内部方案,那么如果它试图模仿的话,它就适合充当字典。

type(A) is type(B)检查的另一个问题是,如果A是的子类B,它将以false编程方式求值,您希望它是的时间true。如果对象是列表的子类,则它应像列表一样工作:检查其他答案中提供的类型将防止此情况。(isinstance但是可以)。

It might be more Pythonic to use a try...except block. That way, if you have a class which quacks like a list, or quacks like a dict, it will behave properly regardless of what its type really is.

To clarify, the preferred method of “telling the difference” between variable types is with something called duck typing: as long as the methods (and return types) that a variable responds to are what your subroutine expects, treat it like what you expect it to be. For example, if you have a class that overloads the bracket operators with __getitem__ and __setitem__, but uses some funny internal scheme, it would be appropriate for it to behave as a dictionary if that’s what it’s trying to emulate.

The other problem with the type(A) is type(B) checking is that if A is a subclass of B, it evaluates to false when, programmatically, you would hope it would be true. If an object is a subclass of a list, it should work like a list: checking the type as presented in the other answer will prevent this. (isinstance will work, however).
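
As a sketch of what the try...except style can look like in practice: the function describe and the class Registry below are hypothetical names invented for this example, not something from the answer. The function never checks a type; it simply tries the mapping interface first and falls back to plain iteration.

```python
def describe(obj):
    """Return a list of pairs for mapping-like or iterable objects."""
    try:
        # Anything with an .items() method is treated as a mapping.
        return list(obj.items())
    except AttributeError:
        pass
    try:
        # Otherwise fall back to treating it as an iterable of values.
        return list(enumerate(obj))
    except TypeError:
        raise TypeError("expected a mapping or an iterable") from None


class Registry:
    """Hypothetical class that is not a dict subclass but quacks like one."""

    def __init__(self):
        self._data = {'a': 1}

    def items(self):
        return self._data.items()
```

Registry is handled like a dict even though isinstance(Registry(), dict) is False, which is the point of duck typing.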


回答 3

在对象的实例上,您还具有:

__class__

属性。这是从Python 3.3控制台获取的示例

>>> str = "str"
>>> str.__class__
<class 'str'>
>>> i = 2
>>> i.__class__
<class 'int'>
>>> class Test():
...     pass
...
>>> a = Test()
>>> a.__class__
<class '__main__.Test'>

请注意,在python 3.x和New-Style类(可从Python 2.6中可选)中,类和类型已合并,这有时会导致意外结果。主要是因为这个原因,我最喜欢的测试类型/类的方法是内置函数中的isinstance

On instances of object you also have the:

__class__

attribute. Here is a sample taken from Python 3.3 console

>>> str = "str"
>>> str.__class__
<class 'str'>
>>> i = 2
>>> i.__class__
<class 'int'>
>>> class Test():
...     pass
...
>>> a = Test()
>>> a.__class__
<class '__main__.Test'>

Beware that in Python 3.x and in new-style classes (optionally available since Python 2.2) class and type have been merged, and this can sometimes lead to unexpected results. Mainly for this reason, my favorite way of testing types/classes is the built-in isinstance function.


回答 4

确定Python对象的类型

确定对象的类型 type

>>> obj = object()
>>> type(obj)
<class 'object'>

尽管可行,但请避免使用双下划线属性,例如__class__-它们在语义上不公开,并且在这种情况下(也许不是),内置函数通常具有更好的行为。

>>> obj.__class__ # avoid this!
<class 'object'>

类型检查

有没有一种简单的方法来确定变量是列表,字典还是其他?我回来的对象可能是任何一种类型,我需要能够分辨出两者之间的区别。

嗯,这是一个不同的问题,不要使用type-use isinstance

def foo(obj):
    """given a string with items separated by spaces, 
    or a list or tuple, 
    do something sensible
    """
    if isinstance(obj, str):
        obj = obj.split()  # split the string itself, not the str type
    return _foo_handles_only_lists_or_tuples(obj)

这涵盖了您的用户通过子类来做一些聪明或明智的事情的情况str-根据Liskov Substitution的原理,您希望能够在不破坏代码的情况下使用子类实例-并isinstance支持这一点。

使用抽象

甚至更好的是,您可能会从collections或寻找特定的抽象基类numbers

from collections.abc import Iterable  # the ABCs were removed from plain collections in Python 3.10
from numbers import Number

def bar(obj):
    """does something sensible with an iterable of numbers, 
    or just one number
    """
    if isinstance(obj, Number): # make it a 1-tuple
        obj = (obj,)
    if not isinstance(obj, Iterable):
        raise TypeError('obj must be either a number or iterable of numbers')
    return _bar_sensible_with_iterable(obj)

或者只是不明确地进行类型检查

或者,也许最重要的是,使用鸭式输入,而不要显式地检查代码。鸭式打字以更高的雅致和更少的冗长性支持Liskov Substitution。

def baz(obj):
    """given an obj, a dict (or anything with an .items method) 
    do something sensible with each key-value pair
    """
    for key, value in obj.items():
        _baz_something_sensible(key, value)

结论

  • 使用type真正得到一个实例的类。
  • 使用isinstance显式检查实际的子类或注册的抽象。
  • 只是避免在有意义的地方进行类型检查。

Determine the type of a Python object

Determine the type of an object with type

>>> obj = object()
>>> type(obj)
<class 'object'>

Although it works, avoid double underscore attributes like __class__ – they’re not semantically public, and, while perhaps not in this case, the builtin functions usually have better behavior.

>>> obj.__class__ # avoid this!
<class 'object'>

type checking

Is there a simple way to determine if a variable is a list, dictionary, or something else? I am getting an object back that may be either type and I need to be able to tell the difference.

Well that’s a different question, don’t use type – use isinstance:

def foo(obj):
    """given a string with items separated by spaces, 
    or a list or tuple, 
    do something sensible
    """
    if isinstance(obj, str):
        obj = obj.split()  # split the string itself, not the str type
    return _foo_handles_only_lists_or_tuples(obj)

This covers the case where your user might be doing something clever or sensible by subclassing str – according to the principle of Liskov Substitution, you want to be able to use subclass instances without breaking your code – and isinstance supports this.

Use Abstractions

Even better, you might look for a specific Abstract Base Class from collections or numbers:

from collections.abc import Iterable  # the ABCs were removed from plain collections in Python 3.10
from numbers import Number

def bar(obj):
    """does something sensible with an iterable of numbers, 
    or just one number
    """
    if isinstance(obj, Number): # make it a 1-tuple
        obj = (obj,)
    if not isinstance(obj, Iterable):
        raise TypeError('obj must be either a number or iterable of numbers')
    return _bar_sensible_with_iterable(obj)

Or Just Don’t explicitly Type-check

Or, perhaps best of all, use duck-typing, and don’t explicitly type-check your code. Duck-typing supports Liskov Substitution with more elegance and less verbosity.

def baz(obj):
    """given an obj, a dict (or anything with an .items method) 
    do something sensible with each key-value pair
    """
    for key, value in obj.items():
        _baz_something_sensible(key, value)

Conclusion

  • Use type to actually get an instance’s class.
  • Use isinstance to explicitly check for actual subclasses or registered abstractions.
  • And just avoid type-checking where it makes sense.
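
The helper functions in the snippets above are left undefined, so they cannot be run as written. Here is a self-contained sketch of the abstract-base-class approach. The function name total and its summing behaviour are assumptions chosen only to give the example something concrete to do.

```python
from collections.abc import Iterable
from numbers import Number

def total(obj):
    """Sum an iterable of numbers, or return a single number unchanged."""
    if isinstance(obj, Number):      # wrap a lone number in a 1-tuple
        obj = (obj,)
    if not isinstance(obj, Iterable):
        raise TypeError('obj must be either a number or iterable of numbers')
    return sum(obj)
```

Lists, tuples, sets and generators all pass the Iterable check without being named individually.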

回答 5

您可以使用type()isinstance()

>>> type([]) is list
True

警告您可以list通过在当前作用域中分配相同名称的变量来破坏文件或其他任何类型。

>>> the_d = {}
>>> t = lambda x: "aight" if type(x) is dict else "NOPE"
>>> t(the_d)
'aight'
>>> dict = "dude."
>>> t(the_d)
'NOPE'

在上方,我们看到dict将其重新分配给字符串,因此进行了测试:

type({}) is dict

…失败。

要解决此问题并type()谨慎使用:

>>> import __builtin__
>>> the_d = {}
>>> type({}) is dict
True
>>> dict =""
>>> type({}) is dict
False
>>> type({}) is __builtin__.dict
True

You can use type() or isinstance().

>>> type([]) is list
True

Be warned that you can clobber list or any other type by assigning a variable in the current scope of the same name.

>>> the_d = {}
>>> t = lambda x: "aight" if type(x) is dict else "NOPE"
>>> t(the_d)
'aight'
>>> dict = "dude."
>>> t(the_d)
'NOPE'

Above we see that dict gets reassigned to a string, therefore the test:

type({}) is dict

…fails.

To get around this and use type() more cautiously (the __builtin__ module shown here is Python 2; in Python 3 it is named builtins):

>>> import __builtin__
>>> the_d = {}
>>> type({}) is dict
True
>>> dict =""
>>> type({}) is dict
False
>>> type({}) is __builtin__.dict
True

回答 6

尽管问题已经很老了,但我偶然发现了这个问题,同时自己找到了正确的方法,并且我认为仍然需要澄清一下,至少对于Python 2.x(没有检查Python 3,但是由于经典类出现了问题,在这样的版本上消失了,可能没有关系)。

在这里,我试图回答标题的问题:如何确定任意对象的类型?在许多评论和答案中,关于使用或不使用isinstance的其他建议也可以,但是我没有解决这些问题。

type()方法的主要问题是,它不适用于旧式实例

class One:
    pass

class Two:
    pass


o = One()
t = Two()

o_type = type(o)
t_type = type(t)

print "Are o and t instances of the same class?", o_type is t_type

执行此代码片段将生成:

Are o and t instances of the same class? True

我认为这不是大多数人所期望的。

这种__class__方法最接近正确性,但是在一种关键情况下不起作用:当传入的对象是旧式(而不是实例!)时,因为这些对象缺少此类属性。

这是我能想到的最小的代码片段,以一致的方式满足了此类合法问题:

#!/usr/bin/env python
from types import ClassType
#we adopt the null object pattern in the (unlikely) case
#that __class__ is None for some strange reason
_NO_CLASS=object()
def get_object_type(obj):
    obj_type = getattr(obj, "__class__", _NO_CLASS)
    if obj_type is not _NO_CLASS:
        return obj_type
    # AFAIK the only situation where this happens is an old-style class
    obj_type = type(obj)
    if obj_type is not ClassType:
        raise ValueError("Could not determine object '{}' type.".format(obj_type))
    return obj_type

While the question is pretty old, I stumbled across it while looking for a proper way myself, and I think it still needs clarifying, at least for Python 2.x (I did not check on Python 3, but since the issue arises with classic classes, which are gone in that version, it probably doesn’t matter).

Here I’m trying to answer the title’s question: how can I determine the type of an arbitrary object? Other suggestions about using or not using isinstance are fine in many comments and answers, but I’m not addressing those concerns.

The main issue with the type() approach is that it doesn’t work properly with old-style instances:

class One:
    pass

class Two:
    pass


o = One()
t = Two()

o_type = type(o)
t_type = type(t)

print "Are o and t instances of the same class?", o_type is t_type

Executing this snippet would yield:

Are o and t instances of the same class? True

Which, I argue, is not what most people would expect.

The __class__ approach is the closest to correctness, but it won’t work in one crucial case: when the passed-in object is an old-style class (not an instance!), since those objects lack such an attribute.

This is the smallest snippet of code I could think of that answers this legitimate question in a consistent fashion:

#!/usr/bin/env python
from types import ClassType
#we adopt the null object pattern in the (unlikely) case
#that __class__ is None for some strange reason
_NO_CLASS=object()
def get_object_type(obj):
    obj_type = getattr(obj, "__class__", _NO_CLASS)
    if obj_type is not _NO_CLASS:
        return obj_type
    # AFAIK the only situation where this happens is an old-style class
    obj_type = type(obj)
    if obj_type is not ClassType:
        raise ValueError("Could not determine object '{}' type.".format(obj_type))
    return obj_type

回答 7

小心使用isinstance

isinstance(True, bool)
True
>>> isinstance(True, int)
True

但是输入

type(True) == bool
True
>>> type(True) == int
False

Be careful using isinstance:

>>> isinstance(True, bool)
True
>>> isinstance(True, int)
True

but type:

>>> type(True) == bool
True
>>> type(True) == int
False
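
The behaviour above follows from bool being a subclass of int. If you want to keep using isinstance and still tell the two apart, one possible approach is to test for bool first. The function classify below is a hypothetical helper written for this illustration.

```python
def classify(value):
    """Label a value as 'bool', 'int' or 'other'."""
    # bool is a subclass of int, so it has to be tested first.
    if isinstance(value, bool):
        return 'bool'
    if isinstance(value, int):
        return 'int'
    return 'other'
```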

回答 8

除了前面的答案外,值得一提的是collections.abc它的存在还包含一些补充鸭类的抽象基类(ABC)。

例如,与其明确地检查某物是否为列表,不如:

isinstance(my_obj, list)

如果您只想查看自己拥有的对象是否允许获取物品,可以使用collections.abc.Sequence

from collections.abc import Sequence
isinstance(my_obj, Sequence) 

如果您对允许获取,设置删除项目(即可序列)的对象非常感兴趣,则可以选择collections.abc.MutableSequence

许多其它的ABC被定义在那里,Mapping对于可以使用的地图,对象IterableCallable,等等。有关这些文件的完整列表,请参见的文档collections.abc

As an aside to the previous answers, it’s worth mentioning the existence of collections.abc which contains several abstract base classes (ABCs) that complement duck-typing.

For example, instead of explicitly checking if something is a list with:

isinstance(my_obj, list)

you could, if you’re only interested in seeing if the object you have allows getting items, use collections.abc.Sequence:

from collections.abc import Sequence
isinstance(my_obj, Sequence) 

If you’re strictly interested in objects that allow getting, setting and deleting items (i.e., mutable sequences), you’d opt for collections.abc.MutableSequence.

Many other ABCs are defined there, Mapping for objects that can be used as maps, Iterable, Callable, et cetera. A full list of all these can be seen in the documentation for collections.abc.
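
A short sketch of how a few built-in types answer these ABC checks. The dictionary name checks is just a container for this illustration.

```python
from collections.abc import Mapping, MutableSequence, Sequence

checks = {
    'list is Sequence': isinstance([1, 2], Sequence),
    'tuple is Sequence': isinstance((1, 2), Sequence),
    # Tuples are immutable, so they are not MutableSequence instances.
    'tuple is MutableSequence': isinstance((1, 2), MutableSequence),
    # A dict supports indexing by key but is a Mapping, not a Sequence.
    'dict is Sequence': isinstance({'a': 1}, Sequence),
    'dict is Mapping': isinstance({'a': 1}, Mapping),
    # Strings count as sequences too, which is easy to overlook.
    'str is Sequence': isinstance('abc', Sequence),
}
```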


回答 9

通常,您可以从具有类名称的对象中提取字符串,

str_class = object.__class__.__name__

并进行比较

if str_class == 'dict':
    # blablabla..
elif str_class == 'customclass':
    # blebleble..

In general, you can extract the class name of an object (called obj here) as a string,

str_class = obj.__class__.__name__

and use it for comparison:

if str_class == 'dict':
    pass  # blablabla..
elif str_class == 'customclass':
    pass  # blebleble..

回答 10

在许多实际情况下,而不是使用typeisinstance也可以使用@functools.singledispatch,这是用来定义的通用功能功能实现用于不同类型的同一操作的多个函数构成)。

换句话说,当您具有如下代码时,您将希望使用它:

def do_something(arg):
    if isinstance(arg, int):
        ... # some code specific to processing integers
    if isinstance(arg, str):
        ... # some code specific to processing strings
    if isinstance(arg, list):
        ... # some code specific to processing lists
    ...  # etc

这是一个如何工作的小例子:

from functools import singledispatch


@singledispatch
def say_type(arg):
    raise NotImplementedError(f"I don't work with {type(arg)}")


@say_type.register
def _(arg: int):
    print(f"{arg} is an integer")


@say_type.register
def _(arg: bool):
    print(f"{arg} is a boolean")
>>> say_type(0)
0 is an integer
>>> say_type(False)
False is a boolean
>>> say_type(dict())
# long error traceback ending with:
NotImplementedError: I don't work with <class 'dict'>

另外,我们可以使用抽象类一次覆盖几种类型:

from collections.abc import Sequence


@say_type.register
def _(arg: Sequence):
    print(f"{arg} is a sequence!")
>>> say_type([0, 1, 2])
[0, 1, 2] is a sequence!
>>> say_type((1, 2, 3))
(1, 2, 3) is a sequence!

In many practical cases, instead of using type or isinstance you can also use @functools.singledispatch, which is used to define generic functions (functions composed of multiple functions implementing the same operation for different types).

In other words, you would want to use it when you have code like the following:

def do_something(arg):
    if isinstance(arg, int):
        ... # some code specific to processing integers
    if isinstance(arg, str):
        ... # some code specific to processing strings
    if isinstance(arg, list):
        ... # some code specific to processing lists
    ...  # etc

Here is a small example of how it works:

from functools import singledispatch


@singledispatch
def say_type(arg):
    raise NotImplementedError(f"I don't work with {type(arg)}")


@say_type.register
def _(arg: int):
    print(f"{arg} is an integer")


@say_type.register
def _(arg: bool):
    print(f"{arg} is a boolean")

>>> say_type(0)
0 is an integer
>>> say_type(False)
False is a boolean
>>> say_type(dict())
# long error traceback ending with:
NotImplementedError: I don't work with <class 'dict'>

Additionally, we can use abstract classes to cover several types at once:

from collections.abc import Sequence


@say_type.register
def _(arg: Sequence):
    print(f"{arg} is a sequence!")

>>> say_type([0, 1, 2])
[0, 1, 2] is a sequence!
>>> say_type((1, 2, 3))
(1, 2, 3) is a sequence!
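
Registering by type annotation, as above, needs Python 3.7 or later. A variant that passes the type explicitly to register() also works on older Python 3 releases. The sketch below uses a hypothetical function type_label that returns a string instead of printing, so the results are easy to check.

```python
from functools import singledispatch

@singledispatch
def type_label(arg):
    # Fallback for any type without a registered implementation.
    return "unknown"

@type_label.register(int)
def _(arg):
    return "integer"

@type_label.register(bool)
def _(arg):
    # bool is a subclass of int; the more specific registration wins.
    return "boolean"

@type_label.register(list)
@type_label.register(tuple)
def _(arg):
    # register() returns the function, so decorators can be stacked.
    return "sequence"
```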

回答 11

type()是比更好的解决方案isinstance(),特别是在booleans

TrueFalse只是关键字,平均10Python编写的。从而,

isinstance(True, int)

isinstance(False, int)

都回来了True。两个布尔值都是整数的实例。type()但是,它更聪明:

type(True) == int

返回False

type() is a better solution than isinstance(), particularly for booleans:

True and False are just keywords that mean 1 and 0 in python. Thus,

isinstance(True, int)

and

isinstance(False, int)

both return True, because bool is a subclass of int and so both booleans are instances of it. type(), however, compares the exact type:

type(True) == int

returns False.


如何从Python字典中删除键?

问题:如何从Python字典中删除键?

从字典中删除键时,我使用:

if 'key' in my_dict:
    del my_dict['key']

有没有一种方法可以做到这一点?

When deleting a key from a dictionary, I use:

if 'key' in my_dict:
    del my_dict['key']

Is there a one line way of doing this?


回答 0

要删除键而不管它是否在字典中,请使用以下两个参数的形式dict.pop()

my_dict.pop('key', None)

my_dict[key]如果key字典中存在,则返回,None否则返回。如果第二个参数未指定(即my_dict.pop('key'))并且key不存在,KeyError则引发a。

要删除肯定存在的密钥,您还可以使用

del my_dict['key']

KeyError如果密钥不在字典中,则将引发a 。

To delete a key regardless of whether it is in the dictionary, use the two-argument form of dict.pop():

my_dict.pop('key', None)

This will return my_dict[key] if key exists in the dictionary, and None otherwise. If the second parameter is not specified (i.e., my_dict.pop('key')) and key does not exist, a KeyError is raised.

To delete a key that is guaranteed to exist, you can also use

del my_dict['key']

This will raise a KeyError if the key is not in the dictionary.
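
A small sketch of the three cases described above, using sample data. The variable names removed, missing and raised are illustrative only.

```python
my_dict = {'key': 1, 'other': 2}

removed = my_dict.pop('key', None)    # key present: returns its value
missing = my_dict.pop('key', None)    # key now absent: returns the default

try:
    my_dict.pop('key')                # no default given: raises KeyError
    raised = False
except KeyError:
    raised = True
```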


回答 1

专门回答“是否有一种统一的方法?”

if 'key' in my_dict: del my_dict['key']

…嗯,你问过 ;-)

你应该考虑,虽然,从删除对象的这种方式dict不是原子 -它是可能的,'key'可能是在my_dict该过程中if的语句,但是可以删除之前del被执行,在这种情况下del将失败,KeyError。鉴于此,最安全的使用dict.pop方式是

try:
    del my_dict['key']
except KeyError:
    pass

当然,这绝对不是单线的。

Specifically to answer “is there a one line way of doing this?”

if 'key' in my_dict: del my_dict['key']

…well, you asked ;-)

You should consider, though, that this way of deleting an object from a dict is not atomic—it is possible that 'key' may be in my_dict during the if statement, but may be deleted before del is executed, in which case del will fail with a KeyError. Given this, it would be safest to either use dict.pop or something along the lines of

try:
    del my_dict['key']
except KeyError:
    pass

which, of course, is definitely not a one-liner.
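
One more compact spelling of the same try/except, which the answer above does not mention, is contextlib.suppress from the standard library (Python 3.4 and later). A minimal sketch:

```python
from contextlib import suppress

my_dict = {'key': 1}

with suppress(KeyError):
    del my_dict['key']      # removes the key

with suppress(KeyError):
    del my_dict['key']      # key already gone; the KeyError is swallowed
```

It behaves like the try/except block: the deletion is attempted directly and a missing key is ignored.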


回答 2

我花了一些时间弄清楚究竟my_dict.pop("key", None)在做什么。因此,我将其添加为答案以节省其他Google搜索时间:

pop(key[, default])

如果key在字典中,请删除它并返回其值,否则返回default。如果未提供默认值并且字典中没有KeyError则引发a。

文献资料

It took me some time to figure out what exactly my_dict.pop("key", None) is doing. So I’ll add this as an answer to save others Googling time:

pop(key[, default])

If key is in the dictionary, remove it and return its value, else return default. If default is not given and key is not in the dictionary, a KeyError is raised.

Documentation


回答 3

del my_dict[key]my_dict.pop(key)在键存在时从字典中删除键要快一些

>>> import timeit
>>> setup = "d = {i: i for i in range(100000)}"

>>> timeit.timeit("del d[3]", setup=setup, number=1)
1.79e-06
>>> timeit.timeit("d.pop(3)", setup=setup, number=1)
2.09e-06
>>> timeit.timeit("d2 = {key: val for key, val in d.items() if key != 3}", setup=setup, number=1)
0.00786

但是,当密钥不存在时,它会if key in my_dict: del my_dict[key]比稍快一点my_dict.pop(key, None)。两者都至少比快三倍deltry/ except语句:

>>> timeit.timeit("if 'missing key' in d: del d['missing key']", setup=setup)
0.0229
>>> timeit.timeit("d.pop('missing key', None)", setup=setup)
0.0426
>>> try_except = """
... try:
...     del d['missing key']
... except KeyError:
...     pass
... """
>>> timeit.timeit(try_except, setup=setup)
0.133

del my_dict[key] is slightly faster than my_dict.pop(key) for removing a key from a dictionary when the key exists

>>> import timeit
>>> setup = "d = {i: i for i in range(100000)}"

>>> timeit.timeit("del d[3]", setup=setup, number=1)
1.79e-06
>>> timeit.timeit("d.pop(3)", setup=setup, number=1)
2.09e-06
>>> timeit.timeit("d2 = {key: val for key, val in d.items() if key != 3}", setup=setup, number=1)
0.00786

But when the key doesn’t exist, if key in my_dict: del my_dict[key] is slightly faster than my_dict.pop(key, None). Both are at least three times faster than del in a try/except statement:

>>> timeit.timeit("if 'missing key' in d: del d['missing key']", setup=setup)
0.0229
>>> timeit.timeit("d.pop('missing key', None)", setup=setup)
0.0426
>>> try_except = """
... try:
...     del d['missing key']
... except KeyError:
...     pass
... """
>>> timeit.timeit(try_except, setup=setup)
0.133

回答 4

如果您需要在一行代码中从字典中删除很多键,我认为使用map()非常简洁且Python可读:

myDict = {'a':1,'b':2,'c':3,'d':4}
map(myDict.pop, ['a','c']) # The list of keys to remove
>>> myDict
{'b': 2, 'd': 4}

并且,如果您需要在弹出字典中没有的值的地方捕获错误,请在map()中使用lambda,如下所示:

map(lambda x: myDict.pop(x,None), ['a', 'c', 'e'])
[1, 3, None] # pop returns
>>> myDict
{'b': 2, 'd': 4}

或中的python3,您必须改为使用列表推导:

[myDict.pop(x, None) for x in ['a', 'c', 'e']]

有用。即使myDict没有“ e”键,“ e”也不会引起错误。

If you need to remove a lot of keys from a dictionary in one line of code, I think using map() is quite succinct and readable (this relies on Python 2, where map() runs eagerly):

myDict = {'a':1,'b':2,'c':3,'d':4}
map(myDict.pop, ['a','c']) # The list of keys to remove
>>> myDict
{'b': 2, 'd': 4}

And if you need to catch errors where you pop a value that isn’t in the dictionary, use lambda inside map() like this:

map(lambda x: myDict.pop(x,None), ['a', 'c', 'e'])
[1, 3, None] # pop returns
>>> myDict
{'b': 2, 'd': 4}

In Python 3, map() returns a lazy iterator, so nothing is popped until it is consumed; use a list comprehension instead:

[myDict.pop(x, None) for x in ['a', 'c', 'e']]

It works. And ‘e’ did not cause an error, even though myDict did not have an ‘e’ key.
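
To illustrate the Python 3 behaviour of map() and a plainer alternative, here is a short sketch with sample data. A simple for loop is a swapped-in technique rather than the answer’s own, and it avoids building a throwaway list of return values.

```python
my_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

lazy = map(my_dict.pop, ['a', 'c'])
size_before = len(my_dict)      # map is lazy in Python 3: nothing popped yet
popped = list(lazy)             # consuming the iterator performs the pops

# A plain loop removes several keys and tolerates missing ones ('e').
for key in ['b', 'e']:
    my_dict.pop(key, None)
```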


回答 5

您可以使用字典理解来创建新字典,并删除该键:

>>> my_dict = {k: v for k, v in my_dict.items() if k != 'key'}

您可以按条件删除。如果key不存在,则没有错误。

You can use a dictionary comprehension to create a new dictionary with that key removed:

>>> my_dict = {k: v for k, v in my_dict.items() if k != 'key'}

You can delete by conditions. No error if key doesn’t exist.
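
A brief sketch of deleting by condition with sample data. The names cleaned, without_a and unchanged are made up for this example. Each comprehension builds a new dictionary and leaves my_dict as it was.

```python
my_dict = {'a': 1, 'b': None, 'c': 3, 'd': None}

# Drop every entry whose value is None.
cleaned = {k: v for k, v in my_dict.items() if v is not None}

# Drop a single key.
without_a = {k: v for k, v in my_dict.items() if k != 'a'}

# Filtering on a key that does not exist raises no error.
unchanged = {k: v for k, v in my_dict.items() if k != 'zzz'}
```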


回答 6

使用“ del”关键字:

del dict[key]

Using the “del” keyword:

del my_dict[key]

回答 7

我们可以通过以下几种方法从Python字典中删除键。

使用del关键字;这几乎与您所采用的方法相同-

 myDict = {'one': 100, 'two': 200, 'three': 300 }
 print(myDict)  # {'one': 100, 'two': 200, 'three': 300}
 if myDict.get('one') : del myDict['one']
 print(myDict)  # {'two': 200, 'three': 300}

要么

我们可以像下面这样:

但是请记住,在此过程中,它实际上不会从字典中删除任何键,而不会从该字典中排除特定的键。另外,我观察到它返回的字典与的顺序不同myDict

myDict = {'one': 100, 'two': 200, 'three': 300, 'four': 400, 'five': 500}
{key:value for key, value in myDict.items() if key != 'one'}

如果我们在外壳中运行它,它将执行类似的操作{'five': 500, 'four': 400, 'three': 300, 'two': 200}-请注意,它与的顺序不同myDict。再次,如果我们尝试打印myDict,那么我们可以看到所有键,包括通过这种方法从字典中排除的键。但是,我们可以通过将以下语句分配给变量来创建新字典:

var = {key:value for key, value in myDict.items() if key != 'one'}

现在,如果我们尝试打印它,它将遵循父命令:

print(var) # {'two': 200, 'three': 300, 'four': 400, 'five': 500}

要么

使用pop()方法。

myDict = {'one': 100, 'two': 200, 'three': 300}
print(myDict)

if myDict.get('one') : myDict.pop('one')
print(myDict)  # {'two': 200, 'three': 300}

del和之间的区别在于pop,使用pop()方法,我们实际上可以根据需要存储键的值,如下所示:

myDict = {'one': 100, 'two': 200, 'three': 300}
if myDict.get('one') : var = myDict.pop('one')
print(myDict) # {'two': 200, 'three': 300}
print(var)    # 100


We can delete a key from a Python dictionary with any of the following approaches.

Using the del keyword; it’s almost the same approach as yours, though:

myDict = {'one': 100, 'two': 200, 'three': 300}
print(myDict)  # {'one': 100, 'two': 200, 'three': 300}
if 'one' in myDict: del myDict['one']  # test membership; .get() would skip falsy values such as 0
print(myDict)  # {'two': 200, 'three': 300}

Or

We can also do the following.

Keep in mind, though, that this does not actually delete any key from the dictionary; it builds a new dictionary that excludes the given key. In addition, I observed that the returned dictionary was not ordered the same as myDict.

myDict = {'one': 100, 'two': 200, 'three': 300, 'four': 400, 'five': 500}
{key:value for key, value in myDict.items() if key != 'one'}

If we run it in the shell, it prints something like {'five': 500, 'four': 400, 'three': 300, 'two': 200}; notice that the order differs from myDict. If we then print myDict, we still see all the keys, including the one we excluded with this approach. However, we can make a new dictionary by assigning the comprehension to a variable:

var = {key:value for key, value in myDict.items() if key != 'one'}

Now if we print it, it follows the original order:

print(var) # {'two': 200, 'three': 300, 'four': 400, 'five': 500}

Or

Using the pop() method.

myDict = {'one': 100, 'two': 200, 'three': 300}
print(myDict)

if 'one' in myDict: myDict.pop('one')  # test membership; .get() would skip falsy values such as 0
print(myDict)  # {'two': 200, 'three': 300}

The difference between del and pop is that, with the pop() method, we can store the key’s value if needed, like the following:

myDict = {'one': 100, 'two': 200, 'three': 300}
if 'one' in myDict: var = myDict.pop('one')
print(myDict) # {'two': 200, 'three': 300}
print(var)    # 100



回答 8

如果您想要非常冗长,可以使用异常处理:

try: 
    del dict[key]

except KeyError: pass

但是,pop()如果键不存在,这比方法要慢。

my_dict.pop('key', None)

几个键无关紧要,但是如果重复执行此操作,则后一种方法是更好的选择。

最快的方法是这样的:

if 'key' in dict: 
    del myDict['key']

但是此方法很危险,因为如果'key'在两行之间将其删除,KeyError则会引发a。

You can use exception handling if you want to be very verbose:

try:
    del my_dict['key']
except KeyError:
    pass

However, when the key doesn't exist, this is slower than the pop() method:

my_dict.pop('key', None)

It won’t matter for a few keys, but if you’re doing this repeatedly, then the latter method is a better bet.

The fastest approach is this:

if 'key' in my_dict:
    del my_dict['key']

But this method is dangerous: if 'key' is removed between the two lines (for example, by another thread), a KeyError will be raised.
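
To compare the three options side by side, here is a minimal sketch that wraps each one in a small helper. The helper names and the sample dictionary are made up for this illustration; they are not part of the original answer.

```python
def remove_with_try(d, key):
    """Delete key from d, ignoring a missing key (EAFP style)."""
    try:
        del d[key]
    except KeyError:
        pass


def remove_with_pop(d, key):
    """Delete key from d; the None default suppresses KeyError."""
    d.pop(key, None)


def remove_with_check(d, key):
    """Delete key from d after a membership test (two steps, not atomic)."""
    if key in d:
        del d[key]


sample = {'a': 1, 'b': 2}
for remove in (remove_with_try, remove_with_pop, remove_with_check):
    copy_of_sample = dict(sample)      # work on a copy so each helper starts fresh
    remove(copy_of_sample, 'a')        # an existing key is removed
    remove(copy_of_sample, 'missing')  # a missing key is silently ignored
    print(remove.__name__, copy_of_sample)
```

All three leave the dictionary in the same state; they differ only in speed and in how they behave if the key disappears between the check and the delete.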


回答 9

我更喜欢不变的版本

foo = {
    1:1,
    2:2,
    3:3
}
removeKeys = [1,2]
def woKeys(dct, keyIter):
    return {
        k:v
        for k,v in dct.items() if k not in keyIter
    }

>>> print(woKeys(foo, removeKeys))
{3: 3}
>>> print(foo)
{1: 1, 2: 2, 3: 3}

I prefer the immutable version

foo = {
    1:1,
    2:2,
    3:3
}
removeKeys = [1,2]
def woKeys(dct, keyIter):
    return {
        k:v
        for k,v in dct.items() if k not in keyIter
    }

>>> print(woKeys(foo, removeKeys))
{3: 3}
>>> print(foo)
{1: 1, 2: 2, 3: 3}

回答 10

另一种方法是通过使用items()+ dict理解

items()结合dict理解也可以帮助我们完成键-值对删除的任务,但是它具有不适合就地使用dict的缺点。实际上,如果创建了一个新字典,除了我们不希望包含的密钥之外。

test_dict = {"sai" : 22, "kiran" : 21, "vinod" : 21, "sangam" : 21} 

# Printing dictionary before removal 
print ("dictionary before performing remove is : " + str(test_dict)) 

# Using items() + dict comprehension to remove a dict. pair 
# removes  vinod
new_dict = {key:val for key, val in test_dict.items() if key != 'vinod'} 

# Printing dictionary after removal 
print ("dictionary after remove is : " + str(new_dict)) 

输出:

dictionary before performing remove is : {'sai': 22, 'kiran': 21, 'vinod': 21, 'sangam': 21}
dictionary after remove is : {'sai': 22, 'kiran': 21, 'sangam': 21}

Another way is by Using items() + dict comprehension

items() coupled with a dict comprehension can also accomplish key-value pair deletion, but it has the drawback of not being an in-place technique: a new dict is actually created, containing everything except the key we don't wish to include.

test_dict = {"sai" : 22, "kiran" : 21, "vinod" : 21, "sangam" : 21} 

# Printing dictionary before removal 
print ("dictionary before performing remove is : " + str(test_dict)) 

# Using items() + dict comprehension to remove a dict. pair 
# removes  vinod
new_dict = {key:val for key, val in test_dict.items() if key != 'vinod'} 

# Printing dictionary after removal 
print ("dictionary after remove is : " + str(new_dict)) 

Output:

dictionary before performing remove is : {'sai': 22, 'kiran': 21, 'vinod': 21, 'sangam': 21}
dictionary after remove is : {'sai': 22, 'kiran': 21, 'sangam': 21}

回答 11

单键过滤

  • 如果my_dict中存在“ key”,则返回“ key”并将其从my_dict中删除
  • 如果my_dict中不存在“键”,则返回None

这将改变my_dict(可变)

my_dict.pop('key', None)

按键上有多个过滤器

生成一个新的字典(不可变的)

dic1 = {
    "x":1,
    "y": 2,
    "z": 3
}

def func1(item):
    return  item[0]!= "x" and item[0] != "y"

print(
    dict(
        filter(
            lambda item: item[0] != "x" and item[0] != "y", 
            dic1.items()
            )
    )
)

Single filter on key

  • returns the value stored under 'key' and removes the key from my_dict if 'key' exists in my_dict
  • returns None if 'key' doesn't exist in my_dict

This changes my_dict in place (the dict is mutated):

my_dict.pop('key', None)

Multiple filters on keys

This generates a new dict and leaves the original unchanged:

dic1 = {
    "x":1,
    "y": 2,
    "z": 3
}

def func1(item):
    # item is a (key, value) pair; keep it unless the key is "x" or "y"
    return item[0] != "x" and item[0] != "y"

print(dict(filter(func1, dic1.items())))  # {'z': 3}
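
When the list of keys to drop grows, chaining `!=` tests gets unwieldy. A possible generalisation is sketched below; `without_keys` is a hypothetical helper name, not something from the answer above.

```python
def without_keys(mapping, keys_to_drop):
    """Return a new dict with every item whose key is not in keys_to_drop."""
    keys_to_drop = set(keys_to_drop)  # set membership tests are O(1) on average
    return {k: v for k, v in mapping.items() if k not in keys_to_drop}


dic1 = {"x": 1, "y": 2, "z": 3}
filtered = without_keys(dic1, ["x", "y"])
print(filtered)  # {'z': 3}
print(dic1)      # {'x': 1, 'y': 2, 'z': 3}; the original is unchanged
```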

从字典中删除元素

问题:从字典中删除元素

有没有办法从Python的字典中删除项目?

另外,如何从字典中删除项目以返回副本(即不修改原始内容)?

Is there a way to delete an item from a dictionary in Python?

Additionally, how can I delete an item from a dictionary to return a copy (i.e., not modifying the original)?


回答 0

del语句删除一个元素:

del d[key]

但是,这会使现有字典发生变化,因此对于引用同一实例的其他任何人,字典的内容都会更改。要返回词典,请复制该词典:

def removekey(d, key):
    r = dict(d)
    del r[key]
    return r

dict()构造使得浅拷贝。要进行深拷贝,请参阅copy模块


请注意,为每个字典del/ assignment / etc 复制一份。意味着您要从恒定时间变为线性时间,并且还要使用线性空间。对于小命令,这不是问题。但是,如果您打算复制大量大型字典,则可能需要不同的数据结构,例如HAMT(如本答案所述)。

The del statement removes an element:

del d[key]

However, this mutates the existing dictionary so the contents of the dictionary changes for anybody else who has a reference to the same instance. To return a new dictionary, make a copy of the dictionary:

def removekey(d, key):
    r = dict(d)
    del r[key]
    return r

The dict() constructor makes a shallow copy. To make a deep copy, see the copy module.


Note that making a copy for every dict del/assignment/etc. means you’re going from constant time to linear time, and also using linear space. For small dicts, this is not a problem. But if you’re planning to make lots of copies of large dicts, you probably want a different data structure, like a HAMT (as described in this answer).


回答 1

pop 使字典变异。

 >>> lol = {"hello": "gdbye"}
 >>> lol.pop("hello")
     'gdbye'
 >>> lol
     {}

如果您想保留原件,则可以将其复印。

pop mutates the dictionary.

 >>> lol = {"hello": "gdbye"}
 >>> lol.pop("hello")
     'gdbye'
 >>> lol
     {}

If you want to keep the original you could just copy it.
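
A minimal sketch of "copy it first, then pop", assuming a shallow copy is sufficient (the variable names and the extra key are illustrative only):

```python
original = {"hello": "gdbye", "foo": "bar"}

# dict.copy() makes a shallow copy, so popping from it leaves `original` alone.
trimmed = original.copy()
removed_value = trimmed.pop("hello")

print(original)       # {'hello': 'gdbye', 'foo': 'bar'}
print(trimmed)        # {'foo': 'bar'}
print(removed_value)  # gdbye
```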


回答 2

我认为您的解决方案是最好的方法。但是,如果您需要其他解决方案,则可以使用旧字典中的键来创建新字典,而无需包括指定的键,如下所示:

>>> a
{0: 'zero', 1: 'one', 2: 'two', 3: 'three'}
>>> {i:a[i] for i in a if i!=0}
{1: 'one', 2: 'two', 3: 'three'}

I think your solution is the best way to do it. But if you want another solution, you can create a new dictionary from the keys of the old dictionary, leaving out the specified key, like this:

>>> a
{0: 'zero', 1: 'one', 2: 'two', 3: 'three'}
>>> {i:a[i] for i in a if i!=0}
{1: 'one', 2: 'two', 3: 'three'}

回答 3

del语句是你在找什么。如果您有一个名为foo的字典,其键名为“ bar”,则可以从foo中删除“ bar”,如下所示:

del foo['bar']

请注意,这将永久修改正在操作的词典。如果要保留原始词典,则必须事先创建一个副本:

>>> foo = {'bar': 'baz'}
>>> fu = dict(foo)
>>> del foo['bar']
>>> print foo
{}
>>> print fu
{'bar': 'baz'}

dict调用将进行浅表复制。如果要深拷贝,请使用copy.deepcopy

为了方便起见,您可以使用以下方法复制和粘贴:

def minus_key(key, dictionary):
    shallow_copy = dict(dictionary)
    del shallow_copy[key]
    return shallow_copy

The del statement is what you’re looking for. If you have a dictionary named foo with a key called ‘bar’, you can delete ‘bar’ from foo like this:

del foo['bar']

Note that this permanently modifies the dictionary being operated on. If you want to keep the original dictionary, you’ll have to create a copy beforehand:

>>> foo = {'bar': 'baz'}
>>> fu = dict(foo)
>>> del foo['bar']
>>> print(foo)
{}
>>> print(fu)
{'bar': 'baz'}

The dict call makes a shallow copy. If you want a deep copy, use copy.deepcopy.

Here’s a method you can copy & paste, for your convenience:

def minus_key(key, dictionary):
    shallow_copy = dict(dictionary)
    del shallow_copy[key]
    return shallow_copy

回答 4

有很多不错的答案,但我想强调一件事。

您可以同时使用dict.pop()方法和更通用的del语句从字典中删除项目。它们都变异了原始词典,因此您需要进行复制(请参见下面的详细信息)。

KeyError如果您要提供给他们的密钥在词典中不存在,则这两个都将引发一个:

key_to_remove = "c"
d = {"a": 1, "b": 2}
del d[key_to_remove]  # Raises `KeyError: 'c'`

key_to_remove = "c"
d = {"a": 1, "b": 2}
d.pop(key_to_remove)  # Raises `KeyError: 'c'`

您必须注意以下事项:

通过捕获异常:

key_to_remove = "c"
d = {"a": 1, "b": 2}
try:
    del d[key_to_remove]
except KeyError as ex:
    print("No such key: %s" % ex)

key_to_remove = "c"
d = {"a": 1, "b": 2}
try:
    d.pop(key_to_remove)
except KeyError as ex:
    print("No such key: %s" % ex)

通过执行检查:

key_to_remove = "c"
d = {"a": 1, "b": 2}
if key_to_remove in d:
    del d[key_to_remove]

key_to_remove = "c"
d = {"a": 1, "b": 2}
if key_to_remove in d:
    d.pop(key_to_remove)

但是pop()还有一种更简洁的方法-提供默认的返回值:

key_to_remove = "c"
d = {"a": 1, "b": 2}
d.pop(key_to_remove, None)  # No `KeyError` here

除非您pop()用来获取要删除的键的值,否则可以提供任何必要的信息None。虽然可能由于with函数本身具有复杂性而导致开销,所以delwith incheck的使用快一些pop()。通常情况并非如此,因此pop()使用默认值就足够了。


对于主要问题,您必须复制字典,以保存原始字典,并在不删除密钥的情况下新建一个字典。

这里的其他一些人建议使用进行完整(较深)的副本copy.deepcopy(),这可能是一个过大的杀伤力,而使用copy.copy()或则dict.copy()可能是“正常”(较浅)的副本,可能就足够了。字典保留对对象的引用作为键的值。因此,当您从字典中删除键时,该引用将被删除,而不是被引用的对象。如果内存中没有其他引用,则垃圾回收器随后可以自动删除该对象本身。与浅拷贝相比,进行深拷贝需要更多的计算,因此,通过进行深拷贝,浪费内存并为GC提供更多工作,它会降低代码性能,有时浅拷贝就足够了。

但是,如果您将可变对象作为字典值,并计划以后在不带键的情况下在返回的字典中对其进行修改,则必须进行深拷贝。

使用浅拷贝:

def get_dict_wo_key(dictionary, key):
    """Returns a **shallow** copy of the dictionary without a key."""
    _dict = dictionary.copy()
    _dict.pop(key, None)
    return _dict


d = {"a": [1, 2, 3], "b": 2, "c": 3}
key_to_remove = "c"

new_d = get_dict_wo_key(d, key_to_remove)
print(d)  # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3], "b": 2}
new_d["a"].append(100)
print(d)  # {"a": [1, 2, 3, 100], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3, 100], "b": 2}
new_d["b"] = 2222
print(d)  # {"a": [1, 2, 3, 100], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3, 100], "b": 2222}

使用深拷贝:

from copy import deepcopy


def get_dict_wo_key(dictionary, key):
    """Returns a **deep** copy of the dictionary without a key."""
    _dict = deepcopy(dictionary)
    _dict.pop(key, None)
    return _dict


d = {"a": [1, 2, 3], "b": 2, "c": 3}
key_to_remove = "c"

new_d = get_dict_wo_key(d, key_to_remove)
print(d)  # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3], "b": 2}
new_d["a"].append(100)
print(d)  # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3, 100], "b": 2}
new_d["b"] = 2222
print(d)  # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3, 100], "b": 2222}

There’re a lot of nice answers, but I want to emphasize one thing.

You can use both dict.pop() method and a more generic del statement to remove items from a dictionary. They both mutate the original dictionary, so you need to make a copy (see details below).

And both of them will raise a KeyError if the key you’re providing to them is not present in the dictionary:

key_to_remove = "c"
d = {"a": 1, "b": 2}
del d[key_to_remove]  # Raises `KeyError: 'c'`

and

key_to_remove = "c"
d = {"a": 1, "b": 2}
d.pop(key_to_remove)  # Raises `KeyError: 'c'`

You have to take care of this:

by capturing the exception:

key_to_remove = "c"
d = {"a": 1, "b": 2}
try:
    del d[key_to_remove]
except KeyError as ex:
    print("No such key: %s" % ex)

and

key_to_remove = "c"
d = {"a": 1, "b": 2}
try:
    d.pop(key_to_remove)
except KeyError as ex:
    print("No such key: %s" % ex)

by performing a check:

key_to_remove = "c"
d = {"a": 1, "b": 2}
if key_to_remove in d:
    del d[key_to_remove]

and

key_to_remove = "c"
d = {"a": 1, "b": 2}
if key_to_remove in d:
    d.pop(key_to_remove)

but with pop() there’s also a much more concise way – provide the default return value:

key_to_remove = "c"
d = {"a": 1, "b": 2}
d.pop(key_to_remove, None)  # No `KeyError` here

Unless you use pop() to get the value of the key being removed, you may pass any default, not necessarily None. Using del with an in check may be slightly faster, because pop() is a method call with some overhead of its own. That difference rarely matters, so pop() with a default value is usually good enough.
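
If you do need the popped value and None is itself a legitimate value in your dictionary, one common pattern is to pass a private sentinel object as the default instead. The sketch below is only an illustration; `_MISSING` and `pop_or_report` are made-up names.

```python
_MISSING = object()  # unique sentinel; cannot collide with any stored value


def pop_or_report(d, key):
    """Remove key from d and report whether it was actually present."""
    value = d.pop(key, _MISSING)
    if value is _MISSING:
        return False, None
    return True, value


d = {"a": None, "b": 2}
print(pop_or_report(d, "a"))  # (True, None): the key existed with value None
print(pop_or_report(d, "c"))  # (False, None): the key was absent
print(d)                      # {'b': 2}
```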


As for the main question, you’ll have to make a copy of your dictionary, to save the original dictionary and have a new one without the key being removed.

Some other people here suggest making a full (deep) copy with copy.deepcopy(), which might be overkill; a "normal" (shallow) copy, using copy.copy() or dict.copy(), might be enough. The dictionary keeps a reference to the object as the value for a key, so when you remove a key from a dictionary, the reference is removed, not the object being referenced. The object itself may be reclaimed later automatically by the garbage collector if no other references to it remain in memory. Making a deep copy requires more computation than a shallow copy, so it slows the code down, uses more memory, and gives the GC more work; often a shallow copy is enough.

However, if you have mutable objects as dictionary values and plan to modify them later in the returned dictionary without the key, you have to make a deep copy.

With shallow copy:

def get_dict_wo_key(dictionary, key):
    """Returns a **shallow** copy of the dictionary without a key."""
    _dict = dictionary.copy()
    _dict.pop(key, None)
    return _dict


d = {"a": [1, 2, 3], "b": 2, "c": 3}
key_to_remove = "c"

new_d = get_dict_wo_key(d, key_to_remove)
print(d)  # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3], "b": 2}
new_d["a"].append(100)
print(d)  # {"a": [1, 2, 3, 100], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3, 100], "b": 2}
new_d["b"] = 2222
print(d)  # {"a": [1, 2, 3, 100], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3, 100], "b": 2222}

With deep copy:

from copy import deepcopy


def get_dict_wo_key(dictionary, key):
    """Returns a **deep** copy of the dictionary without a key."""
    _dict = deepcopy(dictionary)
    _dict.pop(key, None)
    return _dict


d = {"a": [1, 2, 3], "b": 2, "c": 3}
key_to_remove = "c"

new_d = get_dict_wo_key(d, key_to_remove)
print(d)  # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3], "b": 2}
new_d["a"].append(100)
print(d)  # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3, 100], "b": 2}
new_d["b"] = 2222
print(d)  # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d)  # {"a": [1, 2, 3, 100], "b": 2222}

回答 5

…如何从字典中删除项目以返回副本(即不修改原始内容)?

A dict是用于此的错误数据结构。

当然,复制dict并从复制中弹出是可行的,利用理解力构建新dict也是如此,但是所有复制都需要时间-您已经用线性时间操作替换了恒定时间操作。并且所有这些活着的副本立刻占据了空间-每个副本的线性空间。

其他数据结构(例如哈希数组映射尝试)也正是针对这种用例而设计的:添加或删除元素会以对数时间返回一个副本并将其大部分存储与原始共享1个

当然也有一些缺点。性能是对数而不是常数(尽管基数较大,通常为32-128)。而且,尽管您可以使非变异API与相同dict,但“变异” API显然是不同的。而且,最重要的是,Python不附带HAMT电池。2

pyrsistent库是基于HAMT的dict-replacement(以及各种其他类型)的Python相当可靠的实现。它甚至还有一个漂亮的Evolutioner API,用于尽可能平滑地将现有的变异代码移植到持久代码中。但是,如果您想明确地表示要返回副本而不是进行变异,则可以像这样使用它:

>>> from pyrsistent import m
>>> d1 = m(a=1, b=2)
>>> d2 = d1.set('c', 3)
>>> d3 = d1.remove('a')
>>> d1
pmap({'a': 1, 'b': 2})
>>> d2
pmap({'c': 3, 'a': 1, 'b': 2})
>>> d3
pmap({'b': 2})

d3 = d1.remove('a')正是这个问题所要的。

如果您有可变的数据结构(例如)dictlist嵌入到中pmap,则仍然会出现别名问题-您只能通过将pmaps和pvectors 嵌入所有位置来实现不可变,以解决此问题。


1. HAMT在Scala,Clojure和Haskell等语言中也很流行,因为它们在无锁编程和软件事务存储中的表现非常好,但是在Python中它们都不重要。

2.事实上,在一个STDLIB HAMT,在执行中使用contextvars较早撤消的PEP解释了原因。但这是库的隐藏实现细节,而不是公共集合类型。

… how can I delete an item from a dictionary to return a copy (i.e., not modifying the original)?

A dict is the wrong data structure to use for this.

Sure, copying the dict and popping from the copy works, and so does building a new dict with a comprehension, but all that copying takes time—you’ve replaced a constant-time operation with a linear-time one. And all those copies alive at once take space—linear space per copy.

Other data structures, like hash array mapped tries, are designed for exactly this kind of use case: adding or removing an element returns a copy in logarithmic time, sharing most of its storage with the original.1

Of course there are some downsides. Performance is logarithmic rather than constant (although with a large base, usually 32-128). And, while you can make the non-mutating API identical to dict, the “mutating” API is obviously different. And, most of all, there’s no HAMT batteries included with Python.2

The pyrsistent library is a pretty solid implementation of HAMT-based dict-replacements (and various other types) for Python. It even has a nifty evolver API for porting existing mutating code to persistent code as smoothly as possible. But if you want to be explicit about returning copies rather than mutating, you just use it like this:

>>> from pyrsistent import m
>>> d1 = m(a=1, b=2)
>>> d2 = d1.set('c', 3)
>>> d3 = d1.remove('a')
>>> d1
pmap({'a': 1, 'b': 2})
>>> d2
pmap({'c': 3, 'a': 1, 'b': 2})
>>> d3
pmap({'b': 2})

That d3 = d1.remove('a') is exactly what the question is asking for.

If you’ve got mutable data structures like dict and list embedded in the pmap, you’ll still have aliasing issues—you can only fix that by going immutable all the way down, embedding pmaps and pvectors.


1. HAMTs have also become popular in languages like Scala, Clojure, Haskell because they play very nicely with lock-free programming and software transactional memory, but neither of those is very relevant in Python.

2. In fact, there is an HAMT in the stdlib, used in the implementation of contextvars. The earlier withdrawn PEP explains why. But this is a hidden implementation detail of the library, not a public collection type.


回答 6

d = {1: 2, '2': 3, 5: 7}
del d[5]
print('d =', d)

结果: d = {1: 2, '2': 3}

d = {1: 2, '2': 3, 5: 7}
del d[5]
print('d =', d)

Result: d = {1: 2, '2': 3}


回答 7

只需调用del d [‘key’]。

但是,在生产中,始终最好检查d中是否存在“密钥”。

if 'key' in d:
    del d['key']

Simply call del d['key'].

However, in production, it is always good practice to check whether 'key' exists in d first:

if 'key' in d:
    del d['key']

回答 8

不,除了

def dictMinus(dct, val):
   copy = dct.copy()
   del copy[val]
   return copy

但是,通常仅创建略有变化的字典的副本可能不是一个好主意,因为这将导致相对较大的内存需求。通常最好记录旧字典(如果需要的话),然后对其进行修改。

No, there is no other way than

def dictMinus(dct, val):
   copy = dct.copy()
   del copy[val]
   return copy

However, creating copies of only slightly altered dictionaries is often not a good idea, because it results in comparatively large memory demands. It is usually better to log the old dictionary (if that is even necessary) and then modify it in place.


回答 9

# mutate/remove with a default
ret_val = body.pop('key', 5)
# no mutation with a default
ret_val = body.get('key', 5)

回答 10

>>> def delete_key(dict, key):
...     del dict[key]
...     return dict
... 
>>> test_dict = {'one': 1, 'two' : 2}
>>> print delete_key(test_dict, 'two')
{'one': 1}
>>>

这不会进行任何错误处理,它假定键在字典中,您可能需要先检查一下,raise如果没有

>>> def delete_key(d, key):
...     del d[key]
...     return d
...
>>> test_dict = {'one': 1, 'two' : 2}
>>> print(delete_key(test_dict, 'two'))
{'one': 1}
>>>

This doesn't do any error handling: it assumes the key is in the dict. You might want to check for that first and raise if it's not.
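
One way to add that check is sketched below. Note that `del` already raises a plain `KeyError` for a missing key, so the explicit check mainly buys a more descriptive message; the message wording here is just an example.

```python
def delete_key(d, key):
    """Remove key from d in place and return d; raise KeyError with context if absent."""
    if key not in d:
        raise KeyError("cannot delete %r: key not found" % (key,))
    del d[key]
    return d


test_dict = {'one': 1, 'two': 2}
print(delete_key(test_dict, 'two'))  # {'one': 1}

try:
    delete_key(test_dict, 'three')
except KeyError as err:
    print("KeyError:", err)
```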


回答 11

这里是一种顶层设计方法:

def eraseElement(d,k):
    if isinstance(d, dict):
        if k in d:
            d.pop(k)
            print(d)
        else:
            print("Cannot find matching key")
    else:
        print("Not able to delete")


exp = {'A':34, 'B':55, 'C':87}
eraseElement(exp, 'C')

我正在将字典和想要的键传递到函数中,验证它是否是字典,并且键是否还可以,如果两者都存在,则从字典中删除值并打印出剩余的值。

输出: {'B': 55, 'A': 34}

希望有帮助!

Here is a top-level design approach:

def eraseElement(d,k):
    if isinstance(d, dict):
        if k in d:
            d.pop(k)
            print(d)
        else:
            print("Cannot find matching key")
    else:
        print("Not able to delete")


exp = {'A':34, 'B':55, 'C':87}
eraseElement(exp, 'C')

I pass the dictionary and the key I want to remove into my function, which checks that the first argument is a dictionary and that the key is present; if both checks pass, it removes the key from the dictionary and prints what is left.

Output: {'B': 55, 'A': 34}

Hope that helps!


回答 12

下面的代码段绝对会帮助您,我在每一行中添加了注释,这将有助于您理解代码。

def execute():
   dic = {'a':1,'b':2}
   dic2 = remove_key_from_dict(dic, 'b')  
   print(dic2)            # {'a': 1}
   print(dic)             # {'a': 1, 'b': 2}

def remove_key_from_dict(dictionary_to_use, key_to_delete):
   copy_of_dict = dict(dictionary_to_use)     # creating clone/copy of the dictionary
   if key_to_delete in copy_of_dict :         # checking given key is present in the dictionary
       del copy_of_dict [key_to_delete]       # deleting the key from the dictionary 
   return copy_of_dict                        # returning the final dictionary

或者你也可以使用dict.pop()

d = {"a": 1, "b": 2}

res = d.pop("c")  # Raises `KeyError: 'c'`
print (res)       # this line will not execute

或更好的方法是

res = d.pop("c", "key not found")
print (res)   # key not found
print (d)     # {"a": 1, "b": 2}

res = d.pop("b", "key not found")
print (res)   # 2
print (d)     # {"a": 1}

The code snippet below should help; I have added a comment on each line to help you understand the code.

def execute():
   dic = {'a':1,'b':2}
   dic2 = remove_key_from_dict(dic, 'b')
   print(dic2)            # {'a': 1}
   print(dic)             # {'a': 1, 'b': 2}

def remove_key_from_dict(dictionary_to_use, key_to_delete):
   copy_of_dict = dict(dictionary_to_use)     # creating clone/copy of the dictionary
   if key_to_delete in copy_of_dict :         # checking given key is present in the dictionary
       del copy_of_dict [key_to_delete]       # deleting the key from the dictionary 
   return copy_of_dict                        # returning the final dictionary

or you can also use dict.pop()

d = {"a": 1, "b": 2}

res = d.pop("c")  # Raises `KeyError: 'c'`
print (res)       # this line will not execute

or the better approach is

res = d.pop("c", "key not found")
print (res)   # key not found
print (d)     # {"a": 1, "b": 2}

res = d.pop("b", "key not found")
print (res)   # 2
print (d)     # {"a": 1}

回答 13

这是使用列表理解的另一个变体:

original_d = {'a': None, 'b': 'Some'}
d = dict((k, v) for k, v in original_d.items() if v)
# result should be {'b': 'Some'}

该方法基于本文的答案:一种 有效的方法来从字典中删除带有空字符串的键

Here's another variation, using a generator expression passed to dict():

original_d = {'a': None, 'b': 'Some'}
d = dict((k, v) for k, v in original_d.items() if v)
# result should be {'b': 'Some'}

The approach is based on an answer from this post: Efficient way to remove keys with empty strings from a dict
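
One caveat with filtering on `if v`: it drops every falsy value, not only None. The sketch below contrasts the two filters, using a sample dict extended with a 0 and an empty string purely for illustration.

```python
original_d = {'a': None, 'b': 'Some', 'c': 0, 'd': ''}

# Truthiness filter: drops None, but also 0 and the empty string.
truthy_only = {k: v for k, v in original_d.items() if v}

# Explicit filter: drops only the keys whose value is None.
not_none = {k: v for k, v in original_d.items() if v is not None}

print(truthy_only)  # {'b': 'Some'}
print(not_none)     # {'b': 'Some', 'c': 0, 'd': ''}
```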


回答 14

species = {'HI': {'1': (1215.671, 0.41600000000000004),
                  '10': (919.351, 0.0012),
                  '1025': (1025.722, 0.0791),
                  '11': (918.129, 0.0009199999999999999),
                  '12': (917.181, 0.000723),
                  '1215': (1215.671, 0.41600000000000004),
                  '13': (916.429, 0.0005769999999999999),
                  '14': (915.824, 0.000468),
                  '15': (915.329, 0.00038500000000000003)},
           'CII': {'1036': (1036.3367, 0.11900000000000001), '1334': (1334.532, 0.129)}}

以下代码将复制字典species并删除不在目录中的项目trans_HI

trans_HI=['1025','1215']
for transition in species['HI'].copy().keys():
    if transition not in trans_HI:
        species['HI'].pop(transition)

species = {'HI': {'1': (1215.671, 0.41600000000000004),
                  '10': (919.351, 0.0012),
                  '1025': (1025.722, 0.0791),
                  '11': (918.129, 0.0009199999999999999),
                  '12': (917.181, 0.000723),
                  '1215': (1215.671, 0.41600000000000004),
                  '13': (916.429, 0.0005769999999999999),
                  '14': (915.824, 0.000468),
                  '15': (915.329, 0.00038500000000000003)},
           'CII': {'1036': (1036.3367, 0.11900000000000001), '1334': (1334.532, 0.129)}}

The following code iterates over a copy of species['HI'] and removes, in place, the items whose keys are not in trans_HI:

trans_HI=['1025','1215']
for transition in species['HI'].copy().keys():
    if transition not in trans_HI:
        species['HI'].pop(transition)
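
If you would rather not mutate species at all, a dict comprehension can build the filtered inner dictionary directly. The sketch below uses a trimmed-down, rounded version of the data above just to keep the example short; `kept_HI` is a made-up name.

```python
# Shortened sample data (assumption: a subset of the entries, values rounded).
species = {
    'HI': {'1025': (1025.722, 0.0791), '1215': (1215.671, 0.416), '13': (916.429, 0.000577)},
    'CII': {'1036': (1036.3367, 0.119), '1334': (1334.532, 0.129)},
}
trans_HI = {'1025', '1215'}

# Build a new inner dict rather than popping while looping over a copy.
kept_HI = {name: line for name, line in species['HI'].items() if name in trans_HI}

print(kept_HI)
print(len(species['HI']))  # 3: the original inner dict still has every entry
```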

创建具有列表理解的字典

问题:创建具有列表理解的字典

我喜欢Python列表理解语法。

它也可以用来创建字典吗?例如,通过遍历键和值对:

mydict = {(k,v) for (k,v) in blah blah blah}  # doesn't work

I like the Python list comprehension syntax.

Can it be used to create dictionaries too? For example, by iterating over pairs of keys and values:

mydict = {(k,v) for (k,v) in blah blah blah}  # doesn't work

回答 0

从Python 2.7和3开始,您应该只使用dict comprehension语法

{key: value for (key, value) in iterable}

在Python 2.6和更早版本中,dict内置函数可以接收键/值对的迭代,因此您可以将其传递给列表推导或生成器表达式。例如:

dict((key, func(key)) for key in keys)

但是,如果您已经具有可迭代的键和/或值,则根本不需要使用任何理解-最简单dict的方法是直接调用内置函数:

# consumed from any iterable yielding pairs of keys/vals
dict(pairs)

# "zipped" from two separate iterables of keys/vals
dict(zip(list_of_keys, list_of_values))

From Python 2.7 and 3 onwards, you should just use the dict comprehension syntax:

{key: value for (key, value) in iterable}

In Python 2.6 and earlier, the dict built-in can receive an iterable of key/value pairs, so you can pass it a list comprehension or generator expression. For example:

dict((key, func(key)) for key in keys)

However, if you already have iterable(s) of keys and/or values, you needn't use a comprehension at all; it's simplest to call the dict built-in directly:

# consumed from any iterable yielding pairs of keys/vals
dict(pairs)

# "zipped" from two separate iterables of keys/vals
dict(zip(list_of_keys, list_of_values))
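
One thing to watch with the zip form: zip stops at the shorter input, so a length mismatch silently drops items. A small guard, sketched here with the hypothetical helper name `dict_from_lists`, makes the mismatch explicit.

```python
def dict_from_lists(list_of_keys, list_of_values):
    """Build a dict from two parallel lists, refusing mismatched lengths."""
    if len(list_of_keys) != len(list_of_values):
        raise ValueError("keys and values must have the same length")
    return dict(zip(list_of_keys, list_of_values))


print(dict_from_lists(['a', 'b'], [1, 2]))  # {'a': 1, 'b': 2}
print(dict(zip(['a', 'b', 'c'], [1, 2])))   # {'a': 1, 'b': 2}; 'c' is silently dropped
```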

回答 1

在Python 3和Python 2.7+中,字典理解如下所示:

d = {k:v for k, v in iterable}

对于Python 2.6或更早版本,请参见fortran的答案

In Python 3 and Python 2.7+, dictionary comprehensions look like the below:

d = {k:v for k, v in iterable}

For Python 2.6 or earlier, see fortran’s answer.


回答 2

实际上,如果它已经包含某种映射,则您甚至不需要遍历可迭代对象,dict构造函数会为您轻松地做到这一点:

>>> ts = [(1, 2), (3, 4), (5, 6)]
>>> dict(ts)
{1: 2, 3: 4, 5: 6}
>>> gen = ((i, i+1) for i in range(1, 6, 2))
>>> gen
<generator object <genexpr> at 0xb7201c5c>
>>> dict(gen)
{1: 2, 3: 4, 5: 6}

In fact, you don't even need to iterate over the iterable if it already yields key/value pairs; the dict constructor will graciously do it for you:

>>> ts = [(1, 2), (3, 4), (5, 6)]
>>> dict(ts)
{1: 2, 3: 4, 5: 6}
>>> gen = ((i, i+1) for i in range(1, 6, 2))
>>> gen
<generator object <genexpr> at 0xb7201c5c>
>>> dict(gen)
{1: 2, 3: 4, 5: 6}

回答 3

在Python 2.7中,它类似于:

>>> list1, list2 = ['a', 'b', 'c'], [1,2,3]
>>> dict( zip( list1, list2))
{'a': 1, 'c': 3, 'b': 2}

压缩他们

In Python 2.7, it goes like:

>>> list1, list2 = ['a', 'b', 'c'], [1,2,3]
>>> dict( zip( list1, list2))
{'a': 1, 'c': 3, 'b': 2}

Zip them!


回答 4

在Python中创建具有列表理解的字典

我喜欢Python列表理解语法。

它也可以用来创建字典吗?例如,通过遍历键和值对:

mydict = {(k,v) for (k,v) in blah blah blah}

您正在寻找“ dict comprehension”一词-实际上是:

mydict = {k: v for k, v in iterable}

假设blah blah blah是两个元组的迭代-您是如此亲密。让我们创建一些类似的“ blah”:

blahs = [('blah0', 'blah'), ('blah1', 'blah'), ('blah2', 'blah'), ('blah3', 'blah')]

Dict理解语法:

现在,这里的语法是映射部分。使它成为dict理解而不是set理解(这是您的伪代码近似值)的是冒号,:如下所示:

mydict = {k: v for k, v in blahs}

而且我们看到它起作用了,并且应该保留Python 3.7的插入顺序:

>>> mydict
{'blah0': 'blah', 'blah1': 'blah', 'blah2': 'blah', 'blah3': 'blah'}

在Python 2和3.6以下版本中,不能保证顺序:

>>> mydict
{'blah0': 'blah', 'blah1': 'blah', 'blah3': 'blah', 'blah2': 'blah'}

添加过滤器:

所有的理解都具有一个映射组件和一个过滤组件,您可以为它们提供任意表达式。

因此,您可以在末尾添加一个过滤器部分:

>>> mydict = {k: v for k, v in blahs if not int(k[-1]) % 2}
>>> mydict
{'blah0': 'blah', 'blah2': 'blah'}

在这里,我们只是测试最后符是否可被2整除以在映射键和值之前过滤掉数据。

Create a dictionary with list comprehension in Python

I like the Python list comprehension syntax.

Can it be used to create dictionaries too? For example, by iterating over pairs of keys and values:

mydict = {(k,v) for (k,v) in blah blah blah}

You’re looking for the phrase “dict comprehension” – it’s actually:

mydict = {k: v for k, v in iterable}

Assuming blah blah blah is an iterable of two-tuples – you’re so close. Let’s create some “blahs” like that:

blahs = [('blah0', 'blah'), ('blah1', 'blah'), ('blah2', 'blah'), ('blah3', 'blah')]

Dict comprehension syntax:

Now the syntax here is the mapping part. What makes this a dict comprehension instead of a set comprehension (which is what your pseudo-code approximates) is the colon, : like below:

mydict = {k: v for k, v in blahs}

And we see that it worked, and should retain insertion order as-of Python 3.7:

>>> mydict
{'blah0': 'blah', 'blah1': 'blah', 'blah2': 'blah', 'blah3': 'blah'}

In Python 2 and up to 3.6, order was not guaranteed:

>>> mydict
{'blah0': 'blah', 'blah1': 'blah', 'blah3': 'blah', 'blah2': 'blah'}

Adding a Filter:

All comprehensions feature a mapping component and a filtering component that you can provide with arbitrary expressions.

So you can add a filter part to the end:

>>> mydict = {k: v for k, v in blahs if not int(k[-1]) % 2}
>>> mydict
{'blah0': 'blah', 'blah2': 'blah'}

Here we test whether the last character of each key, read as a digit, is divisible by 2, which filters the data before the keys and values are mapped.
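
To make the distinction between the mapping part and the filter part concrete, here is a short sketch that reuses the blahs list from above; the variable names `evens_upper` and `labelled` are invented for this example.

```python
blahs = [('blah0', 'blah'), ('blah1', 'blah'), ('blah2', 'blah'), ('blah3', 'blah')]

# Mapping part: upper-case the value. Filter part: keep keys ending in an even digit.
evens_upper = {k: v.upper() for k, v in blahs if int(k[-1]) % 2 == 0}

# A conditional expression in the mapping part keeps every key but varies the value.
labelled = {k: ('even' if int(k[-1]) % 2 == 0 else 'odd') for k, _ in blahs}

print(evens_upper)  # {'blah0': 'BLAH', 'blah2': 'BLAH'}
print(labelled)
```

The trailing `if` removes items; an `if ... else` inside the value expression only changes what each key maps to.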


回答 5

Python版本<2.7 (RIP,2010年7月3日至2019年12月31日),请执行以下操作:

d = dict((i,True) for i in [1,2,3])

Python版本> = 2.7,请执行以下操作:

d = {i: True for i in [1,2,3]}

Python version < 2.7(RIP, 3 July 2010 – 31 December 2019), do the below:

d = dict((i,True) for i in [1,2,3])

Python version >= 2.7, do the below:

d = {i: True for i in [1,2,3]}

回答 6

如果要遍历键key_list列表和值列表,请添加到@fortran的答案中value_list

d = dict((key, value) for (key, value) in zip(key_list, value_list))

要么

d = {key: value for (key, value) in zip(key_list, value_list)}

To add onto @fortran’s answer, if you want to iterate over a list of keys key_list as well as a list of values value_list:

d = dict((key, value) for (key, value) in zip(key_list, value_list))

or

d = {key: value for (key, value) in zip(key_list, value_list)}
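
Be careful with the braces form: it is the colon between key and value that makes it a dict comprehension. Braces around a bare tuple build a set of tuples instead, as this small check with made-up sample lists shows.

```python
key_list = ['a', 'b', 'c']
value_list = [1, 2, 3]

as_dict = {key: value for key, value in zip(key_list, value_list)}
as_set = {(key, value) for key, value in zip(key_list, value_list)}

print(type(as_dict).__name__, as_dict)  # dict {'a': 1, 'b': 2, 'c': 3}
print(type(as_set).__name__)            # set
```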

回答 7

这是使用dict理解创建字典的另一个示例:

我在这里要做的是在每对字典中创建一个字母字典。是英文字母及其在英文字母中的对应位置

>>> import string
>>> dict1 = {value: (int(key) + 1) for key, value in 
enumerate(list(string.ascii_lowercase))}
>>> dict1
{'a': 1, 'c': 3, 'b': 2, 'e': 5, 'd': 4, 'g': 7, 'f': 6, 'i': 9, 'h': 8, 
'k': 11, 'j': 10, 'm': 13, 'l': 12, 'o': 15, 'n': 14, 'q': 17, 'p': 16, 's': 
19, 'r': 18, 'u': 21, 't': 20, 'w': 23, 'v': 22, 'y': 25, 'x': 24, 'z': 26}
>>> 

请注意,此处使用枚举可获取列表中的字母及其索引,并交换字母和索引以生成字典的键值对

希望它对您的字典组合有个好主意,并鼓励您更多地使用它来使代码紧凑

Here is another example of dictionary creation using dict comprehension:

What I am trying to do here is create an alphabet dictionary in which each pair consists of an English letter and its corresponding position in the English alphabet.

>>> import string
>>> dict1 = {value: (int(key) + 1) for key, value in
...          enumerate(list(string.ascii_lowercase))}
>>> dict1
{'a': 1, 'c': 3, 'b': 2, 'e': 5, 'd': 4, 'g': 7, 'f': 6, 'i': 9, 'h': 8, 
'k': 11, 'j': 10, 'm': 13, 'l': 12, 'o': 15, 'n': 14, 'q': 17, 'p': 16, 's': 
19, 'r': 18, 'u': 21, 't': 20, 'w': 23, 'v': 22, 'y': 25, 'x': 24, 'z': 26}
>>> 

Notice the use of enumerate here to get each letter together with its index in the list; the letters and indices are then swapped to generate the key-value pairs for the dictionary.

I hope this gives you a good idea of dict comprehensions and encourages you to use them more often to make your code compact.
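
As a slightly more compact variant of the same idea, enumerate accepts a start argument, which removes the need for the "+ 1" and the list() call. The name `positions` is just for this sketch.

```python
import string

# start=1 makes enumerate count from 1, so no "+ 1" is needed.
positions = {letter: index for index, letter in enumerate(string.ascii_lowercase, start=1)}

print(positions['a'], positions['z'])  # 1 26
```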


回答 8

尝试这个,

def get_dic_from_two_lists(keys, values):
    return { keys[i] : values[i] for i in range(len(keys)) }

假设我们有两个清单,国家首都

country = ['India', 'Pakistan', 'China']
capital = ['New Delhi', 'Islamabad', 'Beijing']

然后从两个列表中创建字典:

print get_dic_from_two_lists(country, capital)

输出是这样的,

{'Pakistan': 'Islamabad', 'China': 'Beijing', 'India': 'New Delhi'}

Try this,

def get_dic_from_two_lists(keys, values):
    return { keys[i] : values[i] for i in range(len(keys)) }

Assume we have two lists country and capital

country = ['India', 'Pakistan', 'China']
capital = ['New Delhi', 'Islamabad', 'Beijing']

Then create dictionary from the two lists:

print(get_dic_from_two_lists(country, capital))

The output is like this,

{'Pakistan': 'Islamabad', 'China': 'Beijing', 'India': 'New Delhi'}

回答 9

>>> {k: v**3 for (k, v) in zip(string.ascii_lowercase, range(26))}

Python支持dict理解,它允许您使用类似的简洁语法在运行时表示字典的创建。

字典理解采用{key:(key,value)inerable中的值}的形式。该语法是在Python 3中引入的,并且一直移植到Python 2.7,因此,无论安装了哪个版本的Python,您都应该能够使用它。

一个典型的例子是获取两个列表并创建一个字典,其中第一个列表中每个位置的项成为键,而第二个列表中相应位置的项变为值。

此推导中使用的zip函数返回一个元组的迭代器,其中元组中的每个元素均取自每个输入可迭代对象中的相同位置。在上面的示例中,返回的迭代器包含元组(“ a”,1),(“ b”,2)等。

输出:

{‘i’:512,’e’:64,’o’:2744,’h’:343,’l’:1331,’s’:5832,’b’:1,’w’:10648,’ c’:8,’x’:12167,’y’:13824,’t’:6859,’p’:3375,’d’:27,’j’:729,’a’:0,’z’ :15625,’f’:125,’q’:4096,’u’:8000,’n’:2197,’m’:1728,’r’:4913,’k’:1000,’g’:216 ,’v’:9261}

>>> import string
>>> {k: v**3 for (k, v) in zip(string.ascii_lowercase, range(26))}

Python supports dict comprehensions, which allow you to express the creation of dictionaries at runtime using a similarly concise syntax.

A dictionary comprehension takes the form {key: value for (key, value) in iterable}. This syntax was introduced in Python 3 and backported as far as Python 2.7, so you should be able to use it regardless of which version of Python you have installed.

A canonical example is taking two lists and creating a dictionary where the item at each position in the first list becomes a key and the item at the corresponding position in the second list becomes the value.

The zip function used inside this comprehension returns an iterator of tuples, where each element in the tuple is taken from the same position in each of the input iterables. In the example above, the returned iterator contains the tuples ('a', 0), ('b', 1), etc.

Output:

{'i': 512, 'e': 64, 'o': 2744, 'h': 343, 'l': 1331, 's': 5832, 'b': 1, 'w': 10648, 'c': 8, 'x': 12167, 'y': 13824, 't': 6859, 'p': 3375, 'd': 27, 'j': 729, 'a': 0, 'z': 15625, 'f': 125, 'q': 4096, 'u': 8000, 'n': 2197, 'm': 1728, 'r': 4913, 'k': 1000, 'g': 216, 'v': 9261}
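
As a rough sketch of the canonical two-list case described above, on a smaller input (the names letters, numbers and cubes are made up for illustration and do not come from the answer):

```python
# Hypothetical input lists; any two equal-length sequences would do.
letters = ['a', 'b', 'c']
numbers = [0, 1, 2]

# zip pairs the items by position; the comprehension cubes each value.
cubes = {k: v**3 for k, v in zip(letters, numbers)}
print(cubes)  # {'a': 0, 'b': 1, 'c': 8}
```

The item at each position in letters becomes a key, and the cube of the item at the same position in numbers becomes its value.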


Answer 10

This code uses a dict comprehension to build a dictionary from several lists of values; the result can be passed to pd.DataFrame().

# Multiple lists
model = ['A', 'B', 'C', 'D']
launched = [1983, 1984, 1984, 1984]
discontinued = [1986, 1985, 1984, 1986]

# Dictionary built with a dict comprehension
keys = ['model', 'launched', 'discontinued']
vals = [model, launched, discontinued]
data = {key: vals[n] for n, key in enumerate(keys)}

enumerate supplies the index n, which is used to look up the list in vals that belongs to each key.


Answer 11

Just to throw in another example. Imagine you have the following list:

nums = [4,2,2,1,3]

and you want to turn it into a dict where the key is the index and value is the element in the list. You can do so with the following line of code:

{index:nums[index] for index in range(0,len(nums))}

Answer 12

You can create a new dict for each pair and merge it with the previous dict:

reduce(lambda p, q: {**p, **{q[0]: q[1]}}, bla bla bla, {})

Obviously, this approach requires reduce from functools; bla bla bla stands in for whatever iterable of (key, value) pairs you have.
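
A minimal runnable sketch of the reduce idea, assuming the placeholder is a list of (key, value) tuples (the names pairs and merged are hypothetical, chosen only for this example):

```python
from functools import reduce

# Hypothetical stand-in for the iterable of (key, value) pairs.
pairs = [('name', 'Monty'), ('age', 42), ('food', 'spam')]

# Each step builds a new dict from the accumulated one plus the next pair.
merged = reduce(lambda acc, pair: {**acc, pair[0]: pair[1]}, pairs, {})
print(merged)  # {'name': 'Monty', 'age': 42, 'food': 'spam'}
```

Because every step copies the accumulated dict, this is likely to be slower than dict(pairs) on long inputs; it is probably best read as a demonstration rather than a recommendation.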


Convert two lists into a dictionary

Question: Convert two lists into a dictionary

Imagine that you have:

keys = ['name', 'age', 'food']
values = ['Monty', 42, 'spam']

What is the simplest way to produce the following dictionary?

a_dict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}

Answer 0

Like this:

>>> keys = ['a', 'b', 'c']
>>> values = [1, 2, 3]
>>> dictionary = dict(zip(keys, values))
>>> print(dictionary)
{'a': 1, 'b': 2, 'c': 3}

Voila :-) The pairwise dict constructor and zip function are awesomely useful: https://docs.python.org/3/library/functions.html#func-dict


Answer 1

Imagine that you have:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

What is the simplest way to produce the following dictionary ?

dict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}

Most performant, dict constructor with zip

new_dict = dict(zip(keys, values))

In Python 3, zip now returns a lazy iterator, and this is now the most performant approach.

dict(zip(keys, values)) does require the one-time global lookup each for dict and zip, but it doesn’t form any unnecessary intermediate data-structures or have to deal with local lookups in function application.

Runner-up, dict comprehension:

A close runner-up to using the dict constructor is to use the native syntax of a dict comprehension (not a list comprehension, as others have mistakenly put it):

new_dict = {k: v for k, v in zip(keys, values)}

Choose this when you need to map or filter based on the keys or value.
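
For instance, a small sketch of mapping and filtering inside the comprehension (the particular transformation, upper-casing keys and keeping only string values, is just an assumed example):

```python
keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

# Map: upper-case every key. Filter: keep only pairs whose value is a string.
filtered = {k.upper(): v for k, v in zip(keys, values) if isinstance(v, str)}
print(filtered)  # {'NAME': 'Monty', 'FOOD': 'spam'}
```

With the plain dict(zip(keys, values)) call there is no place to put the condition or the key transformation, which is why the comprehension is the better fit here.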

In Python 2, zip returns a list. To avoid creating an unnecessary list, use izip instead (aliasing it to zip reduces the code changes needed when you move to Python 3).

from itertools import izip as zip

So that is still (2.7):

new_dict = {k: v for k, v in zip(keys, values)}

Python 2, ideal for <= 2.6

izip from itertools becomes zip in Python 3. izip is better than zip for Python 2 (because it avoids the unnecessary list creation), and ideal for 2.6 or below:

from itertools import izip
new_dict = dict(izip(keys, values))

Result for all cases:

In all cases:

>>> new_dict
{'age': 42, 'name': 'Monty', 'food': 'spam'}

Explanation:

If we look at the help on dict we see that it takes a variety of forms of arguments:


>>> help(dict)

class dict(object)
 |  dict() -> new empty dictionary
 |  dict(mapping) -> new dictionary initialized from a mapping object's
 |      (key, value) pairs
 |  dict(iterable) -> new dictionary initialized as if via:
 |      d = {}
 |      for k, v in iterable:
 |          d[k] = v
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs
 |      in the keyword argument list.  For example:  dict(one=1, two=2)

The optimal approach is to use an iterable while avoiding creating unnecessary data structures. In Python 2, zip creates an unnecessary list:

>>> zip(keys, values)
[('name', 'Monty'), ('age', 42), ('food', 'spam')]

In Python 3, the equivalent would be:

>>> list(zip(keys, values))
[('name', 'Monty'), ('age', 42), ('food', 'spam')]

and Python 3’s zip merely creates an iterable object:

>>> zip(keys, values)
<zip object at 0x7f0e2ad029c8>

Since we want to avoid creating unnecessary data structures, we usually want to avoid Python 2’s zip (since it creates an unnecessary list).

Less performant alternatives:

This is a generator expression being passed to the dict constructor:

generator_expression = ((k, v) for k, v in zip(keys, values))
dict(generator_expression)

or equivalently:

dict((k, v) for k, v in zip(keys, values))

And this is a list comprehension being passed to the dict constructor:

dict([(k, v) for k, v in zip(keys, values)])

In the first two cases, an extra layer of non-operative (thus unnecessary) computation is placed over the zip iterable, and in the case of the list comprehension, an extra list is unnecessarily created. I would expect all of them to be less performant, and certainly not more-so.

Performance review:

In 64 bit Python 3.8.2 provided by Nix, on Ubuntu 16.04, ordered from fastest to slowest:

>>> min(timeit.repeat(lambda: dict(zip(keys, values))))
0.6695233230129816
>>> min(timeit.repeat(lambda: {k: v for k, v in zip(keys, values)}))
0.6941362579818815
>>> min(timeit.repeat(lambda: {keys[i]: values[i] for i in range(len(keys))}))
0.8782548159942962
>>> 
>>> min(timeit.repeat(lambda: dict([(k, v) for k, v in zip(keys, values)])))
1.077607496001292
>>> min(timeit.repeat(lambda: dict((k, v) for k, v in zip(keys, values))))
1.1840861019445583

dict(zip(keys, values)) wins even with small sets of keys and values, but for larger sets, the differences in performance will become greater.

A commenter said:

min seems like a bad way to compare performance. Surely mean and/or max would be much more useful indicators for real usage.

We use min because these algorithms are deterministic. We want to know the performance of the algorithms under the best conditions possible.

If the operating system hangs for any reason, it has nothing to do with what we’re trying to compare, so we need to exclude those kinds of results from our analysis.

If we used mean, those kinds of events would skew our results greatly, and if we used max we would only get the most extreme result – the one most likely affected by such an event.
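
To see the three statistics side by side, here is a small sketch (the repeat and number counts are arbitrary choices, and the timings will differ on every machine):

```python
import statistics
import timeit

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

# Five repeats of 10,000 calls each; both counts are arbitrary.
timings = timeit.repeat(lambda: dict(zip(keys, values)), repeat=5, number=10_000)

best = min(timings)                 # the least disturbed run
average = statistics.mean(timings)  # pulled upward by any noisy run
worst = max(timings)                # the most disturbed run
print(best, average, worst)
```

On a busy machine the gap between best and worst can be large, even though the code being timed is identical in every repeat.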

A commenter also says:

In python 3.6.8, using mean values, the dict comprehension is indeed still faster, by about 30% for these small lists. For larger lists (10k random numbers), the dict call is about 10% faster.

I presume we mean dict(zip(... with 10k random numbers. That does sound like a fairly unusual use case. It does make sense that the most direct calls would dominate in large datasets, and I wouldn’t be surprised if OS hangs are dominating given how long it would take to run that test, further skewing your numbers. And if you use mean or max I would consider your results meaningless.

Let’s use a more realistic size on our top examples:

import numpy
import timeit
l1 = list(numpy.random.random(100))
l2 = list(numpy.random.random(100))

And we see here that dict(zip(... does indeed run faster for larger datasets by about 20%.

>>> min(timeit.repeat(lambda: {k: v for k, v in zip(l1, l2)}))
9.698965263989521
>>> min(timeit.repeat(lambda: dict(zip(l1, l2))))
7.9965161079890095

Answer 2

Try this:

>>> import itertools
>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> adict = dict(itertools.izip(keys,values))
>>> adict
{'food': 'spam', 'age': 42, 'name': 'Monty'}

In Python 2, it’s also more economical in memory consumption compared to zip.


Answer 3

>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> dict(zip(keys, values))
{'food': 'spam', 'age': 42, 'name': 'Monty'}

Answer 4

You can also use dictionary comprehensions in Python ≥ 2.7:

>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> {k: v for k, v in zip(keys, values)}
{'food': 'spam', 'age': 42, 'name': 'Monty'}

Answer 5

A more natural way is to use a dictionary comprehension:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')
new_dict = {keys[i]: values[i] for i in range(len(keys))}  # avoid naming it dict, which would shadow the built-in

Answer 6

If you need to transform keys or values before creating a dictionary then a generator expression could be used. Example:

>>> adict = dict((str(k), v) for k, v in zip(['a', 1, 'b'], [2, 'c', 3])) 

Take a look at Code Like a Pythonista: Idiomatic Python.


Answer 7

With Python 3.x, you can use a dict comprehension:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

dic = {k:v for k,v in zip(keys, values)}

print(dic)

Here is another example of a dict comprehension:

>>> print({i: chr(65+i) for i in range(4)})
{0: 'A', 1: 'B', 2: 'C', 3: 'D'}

Answer 8

For those who need simple code and aren’t familiar with zip:

List1 = ['This', 'is', 'a', 'list']
List2 = ['Put', 'this', 'into', 'dictionary']

This can be done by one line of code:

d = {List1[n]: List2[n] for n in range(len(List1))}

Answer 9

  • 2018-04-18

The best solution is still:

In [92]: keys = ('name', 'age', 'food')
...: values = ('Monty', 42, 'spam')
...: 

In [93]: dt = dict(zip(keys, values))
In [94]: dt
Out[94]: {'age': 42, 'food': 'spam', 'name': 'Monty'}

Transpose it:

    lst = [('name', 'Monty'), ('age', 42), ('food', 'spam')]
    keys, values = zip(*lst)
    In [101]: keys
    Out[101]: ('name', 'age', 'food')
    In [102]: values
    Out[102]: ('Monty', 42, 'spam')

Answer 10

You can use the code below:

dict(zip(['name', 'age', 'food'], ['Monty', 42, 'spam']))

But make sure that the lists have the same length. If they do not, zip truncates the longer one.
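
A short sketch of that truncation, together with one possible alternative from the standard library (the shortened values list is an assumption made to trigger the behaviour):

```python
from itertools import zip_longest

keys = ['name', 'age', 'food']
values = ['Monty', 42]  # deliberately one item short

# zip stops at the shortest input, so 'food' is silently dropped.
truncated = dict(zip(keys, values))

# zip_longest pads the shorter input with fillvalue instead.
padded = dict(zip_longest(keys, values, fillvalue=None))

print(truncated)  # {'name': 'Monty', 'age': 42}
print(padded)     # {'name': 'Monty', 'age': 42, 'food': None}
```

If a length mismatch should count as an error rather than be padded, Python 3.10 and later also accept zip(keys, values, strict=True), which raises ValueError.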


Answer 11

I had this doubt while trying to solve a graph-related problem. I needed to define an empty adjacency list and initialize every node with an empty list, and I wondered whether a zip operation would be fast enough to be worth using instead of simple key-value assignment. After all, the time factor is often an important tie-breaker. So I ran timeit on both approaches.

import timeit
def dictionary_creation(n_nodes):
    dummy_dict = dict()
    for node in range(n_nodes):
        dummy_dict[node] = []
    return dummy_dict


def dictionary_creation_1(n_nodes):
    keys = list(range(n_nodes))
    values = [[] for i in range(n_nodes)]
    graph = dict(zip(keys, values))
    return graph


def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

n_nodes = 10_000_000  # size used for the timings reported below
iteration = wrapper(dictionary_creation, n_nodes)
shorthand = wrapper(dictionary_creation_1, n_nodes)

# Time 1 to 7 repetitions of each approach.
for trials in range(1, 8):
    print(f'Iteration: {timeit.timeit(iteration, number=trials)}\nShorthand: {timeit.timeit(shorthand, number=trials)}')

For n_nodes = 10,000,000 I get,

Iteration: 2.825081646999024 Shorthand: 3.535717916001886

Iteration: 5.051560923002398 Shorthand: 6.255070794999483

Iteration: 6.52859034499852 Shorthand: 8.221581164998497

Iteration: 8.683652416999394 Shorthand: 12.599181543999293

Iteration: 11.587241565001023 Shorthand: 15.27298851100204

Iteration: 14.816342867001367 Shorthand: 17.162912737003353

Iteration: 16.645022411001264 Shorthand: 19.976680120998935

You can clearly see that, after a certain point, the iteration approach with n repetitions takes less time than the shorthand approach with n-1 repetitions.


Answer 12

Here is also an example of adding list values to your dictionary:

list1 = ["Name", "Surname", "Age"]
list2 = [["Cyd", "JEDD", "JESS"], ["DEY", "AUDIJE", "PONGARON"], [21, 32, 47]]
dic = dict(zip(list1, list2))
print(dic)

Always make sure that your keys (list1) are the first argument to zip.

{'Name': ['Cyd', 'JEDD', 'JESS'], 'Surname': ['DEY', 'AUDIJE', 'PONGARON'], 'Age': [21, 32, 47]}

Answer 13

Solution as a dictionary comprehension with enumerate:

new_dict = {item: values[index] for index, item in enumerate(keys)}

Solution as a for loop with enumerate:

new_dict = {}  # named new_dict so the built-in dict is not shadowed
for index, item in enumerate(keys):
    new_dict[item] = values[index]

Answer 14

You may also try with one list which is a combination of two lists ;)

a = [1,2,3,4]
n = [5,6,7,8]

x = []
for i in a,n:
    x.append(i)

print(dict(zip(x[0], x[1])))

Answer 15

A method without the zip function (note that it empties l2 as it goes):

l1 = [1,2,3,4,5]
l2 = ['a','b','c','d','e']
d1 = {}
for l1_ in l1:
    for l2_ in l2:
        d1[l1_] = l2_
        l2.remove(l2_)
        break  

print (d1)


{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}

Iterating over dictionaries using 'for' loops

Question: Iterating over dictionaries using 'for' loops

I am a bit puzzled by the following code:

d = {'x': 1, 'y': 2, 'z': 3} 
for key in d:
    print key, 'corresponds to', d[key]

What I don’t understand is the key portion. How does Python recognize that it needs only to read the key from the dictionary? Is key a special word in Python? Or is it simply a variable?


Answer 0

key is just a variable name.

for key in d:

will simply loop over the keys in the dictionary, rather than the keys and values. To loop over both key and value you can use the following:

For Python 3.x:

for key, value in d.items():

For Python 2.x:

for key, value in d.iteritems():

To test for yourself, change the word key to poop.

In Python 3.x, iteritems() was replaced with simply items(), which returns a set-like view backed by the dict, like iteritems() but even better. This is also available in 2.7 as viewitems().

The operation items() will work for both 2 and 3, but in 2 it will return a list of the dictionary’s (key, value) pairs, which will not reflect changes to the dict that happen after the items() call. If you want the 2.x behavior in 3.x, you can call list(d.items()).
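
A small sketch of that difference in Python 3 (the variable names view and snapshot are just illustrative):

```python
d = {'x': 1, 'y': 2}

view = d.items()            # live view backed by the dict
snapshot = list(d.items())  # independent copy, like Python 2's items()

d['z'] = 3  # change the dict after both objects were created

print(list(view))  # [('x', 1), ('y', 2), ('z', 3)]
print(snapshot)    # [('x', 1), ('y', 2)]
```

The view picks up the new key because it reads from the dict each time it is iterated, while the list was filled once and never updated.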


Answer 1

It’s not that key is a special word, but that dictionaries implement the iterator protocol. You could do this in your class, e.g. see this question for how to build class iterators.

In the case of dictionaries, it’s implemented at the C level. The details are available in PEP 234. In particular, the section titled “Dictionary Iterators”:

  • Dictionaries implement a tp_iter slot that returns an efficient iterator that iterates over the keys of the dictionary. […] This means that we can write

    for k in dict: ...
    

    which is equivalent to, but much faster than

    for k in dict.keys(): ...
    

    as long as the restriction on modifications to the dictionary (either by the loop or by another thread) are not violated.

  • Add methods to dictionaries that return different kinds of iterators explicitly:

    for key in dict.iterkeys(): ...
    
    for value in dict.itervalues(): ...
    
    for key, value in dict.iteritems(): ...
    

    This means that for x in dict is shorthand for for x in dict.iterkeys().

In Python 3, dict.iterkeys(), dict.itervalues() and dict.iteritems() are no longer supported. Use dict.keys(), dict.values() and dict.items() instead.
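
As a sketch of doing the same in your own class, here is a hypothetical Inventory container (not taken from the linked question) that yields its keys when iterated, much as a dict does:

```python
class Inventory:
    """Hypothetical container that iterates over its keys, as dict does."""

    def __init__(self, **stock):
        self._stock = dict(stock)

    def __iter__(self):
        # Delegate to the underlying dict's own key iterator.
        return iter(self._stock)

    def __getitem__(self, name):
        return self._stock[name]


inventory = Inventory(apples=3, pears=5)
names = [name for name in inventory]
print(names)  # ['apples', 'pears']
```

Any object whose __iter__ returns an iterator can be used in a for loop, a comprehension, or a call such as list(), which is all that "implementing the iterator protocol" requires here.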


Answer 2

Iterating over a dict iterates through its keys in no particular order, as you can see here:

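Edit: This is no longer the case as of Python 3.6, where dicts preserve insertion order; the behaviour has been guaranteed by the language since Python 3.7. The output below comes from an older version.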

>>> d = {'x': 1, 'y': 2, 'z': 3} 
>>> list(d)
['y', 'x', 'z']
>>> d.keys()
['y', 'x', 'z']

For your example, it is a better idea to use dict.items():

>>> d.items()
[('y', 2), ('x', 1), ('z', 3)]

This gives you a list of tuples. When you loop over them like this, each tuple is unpacked into k and v automatically:

for k,v in d.items():
    print(k, 'corresponds to', v)

Using k and v as variable names when looping over a dict is quite common if the body of the loop is only a few lines. For more complicated loops it may be a good idea to use more descriptive names:

for letter, number in d.items():
    print(letter, 'corresponds to', number)

It’s a good idea to get into the habit of using format strings:

for letter, number in d.items():
    print('{0} corresponds to {1}'.format(letter, number))

Answer 3

key is simply a variable.

For Python2.X:

d = {'x': 1, 'y': 2, 'z': 3} 
for my_var in d:
    print my_var, 'corresponds to', d[my_var]

… or better,

d = {'x': 1, 'y': 2, 'z': 3} 
for the_key, the_value in d.iteritems():
    print the_key, 'corresponds to', the_value

For Python3.X:

d = {'x': 1, 'y': 2, 'z': 3} 
for the_key, the_value in d.items():
    print(the_key, 'corresponds to', the_value)

Answer 4

When you iterate through dictionaries using the for .. in ..-syntax, it always iterates over the keys (the values are accessible using dictionary[key]).

To iterate over key-value pairs, in Python 2 use for k,v in s.iteritems(), and in Python 3 for k,v in s.items().


Answer 5

This is a very common looping idiom. in is an operator. For when to use for key in dict and when it must be for key in dict.keys() see David Goodger’s Idiomatic Python article (archived copy).


Answer 6

Iterating over dictionaries using ‘for’ loops

d = {'x': 1, 'y': 2, 'z': 3} 
for key in d:
    ...

How does Python recognize that it needs only to read the key from the dictionary? Is key a special word in Python? Or is it simply a variable?

It’s not just for loops. The important word here is “iterating”.

A dictionary is a mapping of keys to values:

d = {'x': 1, 'y': 2, 'z': 3} 

Any time we iterate over it, we iterate over the keys. The variable name key is only intended to be descriptive – and it is quite apt for the purpose.

This happens in a list comprehension:

>>> [k for k in d]
['x', 'y', 'z']

It happens when we pass the dictionary to list (or any other collection type object):

>>> list(d)
['x', 'y', 'z']

The way Python iterates is, in a context where it needs to, it calls the __iter__ method of the object (in this case the dictionary) which returns an iterator (in this case, a keyiterator object):

>>> d.__iter__()
<dict_keyiterator object at 0x7fb1747bee08>

We shouldn’t use these special methods ourselves, instead, use the respective builtin function to call it, iter:

>>> key_iterator = iter(d)
>>> key_iterator
<dict_keyiterator object at 0x7fb172fa9188>

Iterators have a __next__ method – but we call it with the builtin function, next:

>>> next(key_iterator)
'x'
>>> next(key_iterator)
'y'
>>> next(key_iterator)
'z'
>>> next(key_iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

When an iterator is exhausted, it raises StopIteration. This is how Python knows to exit a for loop, or a list comprehension, or a generator expression, or any other iterative context. Once an iterator raises StopIteration it will always raise it – if you want to iterate again, you need a new one.

>>> list(key_iterator)
[]
>>> new_key_iterator = iter(d)
>>> list(new_key_iterator)
['x', 'y', 'z']
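
Putting iter, next and StopIteration together, a for loop over a dict behaves roughly like the following sketch (a simplified model, not how the interpreter is literally written):

```python
d = {'x': 1, 'y': 2, 'z': 3}

seen = []
iterator = iter(d)            # what `for key in d` does first
while True:
    try:
        key = next(iterator)  # fetch the next key
    except StopIteration:     # iterator exhausted: leave the loop
        break
    seen.append((key, d[key]))

print(seen)  # [('x', 1), ('y', 2), ('z', 3)]
```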

Returning to dicts

We’ve seen dicts iterating in many contexts. What we’ve seen is that any time we iterate over a dict, we get the keys. Back to the original example:

d = {'x': 1, 'y': 2, 'z': 3} 
for key in d:

If we change the variable name, we still get the keys. Let’s try it:

>>> for each_key in d:
...     print(each_key, '=>', d[each_key])
... 
x => 1
y => 2
z => 3

If we want to iterate over the values, we need to use the .values method of dicts, or for both together, .items:

>>> list(d.values())
[1, 2, 3]
>>> list(d.items())
[('x', 1), ('y', 2), ('z', 3)]

In the example given, it would be more efficient to iterate over the items like this:

for a_key, corresponding_value in d.items():
    print(a_key, corresponding_value)

But for academic purposes, the question’s example is just fine.


Answer 7

I have a use case where I have to iterate through the dict to get the key, value pair, also the index indicating where I am. This is how I do it:

d = {'x': 1, 'y': 2, 'z': 3} 
for i, (key, value) in enumerate(d.items()):
   print(i, key, value)

Note that the parentheses around key, value are important. Without them, you get a ValueError: “not enough values to unpack”.
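
The following sketch shows both forms side by side, catching the error so that it can be inspected (the names rows and message are illustrative only):

```python
d = {'x': 1, 'y': 2, 'z': 3}

# With parentheses, each (index, (key, value)) pair unpacks cleanly.
rows = [(i, key, value) for i, (key, value) in enumerate(d.items())]

# Without them, Python tries to unpack a 2-tuple into three names.
try:
    for i, key, value in enumerate(d.items()):
        pass
    message = None
except ValueError as exc:
    message = str(exc)

print(rows)     # [(0, 'x', 1), (1, 'y', 2), (2, 'z', 3)]
print(message)  # not enough values to unpack (expected 3, got 2)
```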


Answer 8

You can check the implementation of CPython’s dict type on GitHub. This is the signature of the function that implements the dict iterator:

_PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey,
             PyObject **pvalue, Py_hash_t *phash)

CPython dictobject.c


Answer 9

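To iterate over keys, it is slower but safer to use my_dict.keys() in Python 2, where it returns a copy of the keys as a list (in Python 3, use list(my_dict) for the same effect). If you tried to do something like this: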

for key in my_dict:
    my_dict[key+"-1"] = my_dict[key]-1

it would raise a RuntimeError because the dictionary changes size while you are iterating over it. If you are absolutely set on reducing time, use the for key in my_dict way, but you have been warned ;).
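
A sketch of both behaviours in Python 3, using two small made-up dicts (broken and safe) so that the failing loop does not affect the working one:

```python
broken = {'a': 1, 'b': 2}
try:
    # Adding keys while iterating over the dict itself fails.
    for key in broken:
        broken[key + "-1"] = broken[key] - 1
    message = None
except RuntimeError as exc:
    message = str(exc)

safe = {'a': 1, 'b': 2}
for key in list(safe):  # list() takes a snapshot of the keys first
    safe[key + "-1"] = safe[key] - 1

print(message)  # dictionary changed size during iteration
print(safe)     # {'a': 1, 'b': 2, 'a-1': 0, 'b-1': 1}
```

The snapshot costs one extra list of keys, which is the price paid for being able to modify the dict inside the loop.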


Answer 10

This prints the items sorted by value, in ascending order:

d = {'x': 3, 'y': 1, 'z': 2}
def by_value(item):
    return item[1]

for key, value in sorted(d.items(), key=by_value):
    print(key, '->', value)
