标签归档:dictionary

用字典重新映射熊猫列中的值

问题:用字典重新映射熊猫列中的值

我有一本字典,看起来像这样: di = {1: "A", 2: "B"}

我想将其应用于类似于以下内容的数据框的“ col1”列:

     col1   col2
0       w      a
1       1      2
2       2    NaN

要得到:

     col1   col2
0       w      a
1       A      2
2       B    NaN

我怎样才能最好地做到这一点?由于某种原因,与此相关的谷歌搜索术语仅向我显示了有关如何根据字典创建列的链接,反之亦然:-/

I have a dictionary which looks like this: di = {1: "A", 2: "B"}

I would like to apply it to the “col1” column of a dataframe similar to:

     col1   col2
0       w      a
1       1      2
2       2    NaN

to get:

     col1   col2
0       w      a
1       A      2
2       B    NaN

How can I best do this? For some reason googling terms relating to this only shows me links about how to make columns from dicts and vice-versa :-/


回答 0

您可以使用.replace。例如:

>>> df = pd.DataFrame({'col2': {0: 'a', 1: 2, 2: np.nan}, 'col1': {0: 'w', 1: 1, 2: 2}})
>>> di = {1: "A", 2: "B"}
>>> df
  col1 col2
0    w    a
1    1    2
2    2  NaN
>>> df.replace({"col1": di})
  col1 col2
0    w    a
1    A    2
2    B  NaN

或直接在上Series,即df["col1"].replace(di, inplace=True)

You can use .replace. For example:

>>> df = pd.DataFrame({'col2': {0: 'a', 1: 2, 2: np.nan}, 'col1': {0: 'w', 1: 1, 2: 2}})
>>> di = {1: "A", 2: "B"}
>>> df
  col1 col2
0    w    a
1    1    2
2    2  NaN
>>> df.replace({"col1": di})
  col1 col2
0    w    a
1    A    2
2    B  NaN

or directly on the Series, i.e. df["col1"].replace(di, inplace=True).


回答 1

map 可以比 replace

如果您的字典有多个键,使用map速度可能比快得多replace。此方法有两种版本,具体取决于字典是否详尽地映射所有可能的值(以及是否要让不匹配项保留其值或将其转换为NaN):

详尽的映射

在这种情况下,表格非常简单:

df['col1'].map(di)       # note: if the dictionary does not exhaustively map all
                         # entries then non-matched entries are changed to NaNs

尽管map最常用函数作为参数,但也可以选择字典或系列: Pandas.series.map的文档

非穷举映射

如果您有一个非详尽的映射,并且希望保留现有变量用于非匹配,则可以添加fillna

df['col1'].map(di).fillna(df['col1'])

如@jpp的答案在这里: 通过字典有效地替换熊猫系列中的值

基准测试

在pandas 0.23.1版中使用以下数据:

di = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E", 6: "F", 7: "G", 8: "H" }
df = pd.DataFrame({ 'col1': np.random.choice( range(1,9), 100000 ) })

并进行测试时%timeit,它的map速度大约比速度快10倍replace

请注意,您的加速map会随数据而变化。最大的提速似乎是使用大词典和详尽的替换方法。有关更广泛的基准测试和讨论,请参见@jpp答案(上面链接)。

map can be much faster than replace

If your dictionary has more than a couple of keys, using map can be much faster than replace. There are two versions of this approach, depending on whether your dictionary exhaustively maps all possible values (and also whether you want non-matches to keep their values or be converted to NaNs):

Exhaustive Mapping

In this case, the form is very simple:

df['col1'].map(di)       # note: if the dictionary does not exhaustively map all
                         # entries then non-matched entries are changed to NaNs

Although map most commonly takes a function as its argument, it can alternatively take a dictionary or series: Documentation for Pandas.series.map

Non-Exhaustive Mapping

If you have a non-exhaustive mapping and wish to retain the existing variables for non-matches, you can add fillna:

df['col1'].map(di).fillna(df['col1'])

as in @jpp’s answer here: Replace values in a pandas series via dictionary efficiently

Benchmarks

Using the following data with pandas version 0.23.1:

di = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E", 6: "F", 7: "G", 8: "H" }
df = pd.DataFrame({ 'col1': np.random.choice( range(1,9), 100000 ) })

and testing with %timeit, it appears that map is approximately 10x faster than replace.

Note that your speedup with map will vary with your data. The largest speedup appears to be with large dictionaries and exhaustive replaces. See @jpp answer (linked above) for more extensive benchmarks and discussion.


回答 2

您的问题有点含糊。至少有三种解释:

  1. 中的键di引用索引值
  2. 中的键是didf['col1']
  3. 中的键di指的是索引位置(不是OP的问题,而是为了娱乐而抛出的。)

以下是每种情况的解决方案。


情况1: 如果的键di旨在引用索引值,则可以使用以下update方法:

df['col1'].update(pd.Series(di))

例如,

import pandas as pd
import numpy as np

df = pd.DataFrame({'col1':['w', 10, 20],
                   'col2': ['a', 30, np.nan]},
                  index=[1,2,0])
#   col1 col2
# 1    w    a
# 2   10   30
# 0   20  NaN

di = {0: "A", 2: "B"}

# The value at the 0-index is mapped to 'A', the value at the 2-index is mapped to 'B'
df['col1'].update(pd.Series(di))
print(df)

Yield

  col1 col2
1    w    a
2    B   30
0    A  NaN

我已经修改了您原始帖子中的值,因此操作更清晰update。注意输入中的键如何di与索引值关联。索引值的顺序(即索引位置)无关紧要。


情况2: 如果其中的键di引用df['col1']值,则@DanAllan和@DSM显示如何通过以下方法实现此目的replace

import pandas as pd
import numpy as np

df = pd.DataFrame({'col1':['w', 10, 20],
                   'col2': ['a', 30, np.nan]},
                  index=[1,2,0])
print(df)
#   col1 col2
# 1    w    a
# 2   10   30
# 0   20  NaN

di = {10: "A", 20: "B"}

# The values 10 and 20 are replaced by 'A' and 'B'
df['col1'].replace(di, inplace=True)
print(df)

Yield

  col1 col2
1    w    a
2    A   30
0    B  NaN

注意如何在这种情况下,在键di改为匹配df['col1']


情况3: 如果其中的键di引用了索引位置,则可以使用

df['col1'].put(di.keys(), di.values())

以来

df = pd.DataFrame({'col1':['w', 10, 20],
                   'col2': ['a', 30, np.nan]},
                  index=[1,2,0])
di = {0: "A", 2: "B"}

# The values at the 0 and 2 index locations are replaced by 'A' and 'B'
df['col1'].put(di.keys(), di.values())
print(df)

Yield

  col1 col2
1    A    a
2   10   30
0    B  NaN

在这里,第一行和第三行被更改了,因为其中的键di02,使用Python基于0的索引对其进行索引,它们指向第一位置和第三位置。

There is a bit of ambiguity in your question. There are at least three two interpretations:

  1. the keys in di refer to index values
  2. the keys in di refer to df['col1'] values
  3. the keys in di refer to index locations (not the OP’s question, but thrown in for fun.)

Below is a solution for each case.


Case 1: If the keys of di are meant to refer to index values, then you could use the update method:

df['col1'].update(pd.Series(di))

For example,

import pandas as pd
import numpy as np

df = pd.DataFrame({'col1':['w', 10, 20],
                   'col2': ['a', 30, np.nan]},
                  index=[1,2,0])
#   col1 col2
# 1    w    a
# 2   10   30
# 0   20  NaN

di = {0: "A", 2: "B"}

# The value at the 0-index is mapped to 'A', the value at the 2-index is mapped to 'B'
df['col1'].update(pd.Series(di))
print(df)

yields

  col1 col2
1    w    a
2    B   30
0    A  NaN

I’ve modified the values from your original post so it is clearer what update is doing. Note how the keys in di are associated with index values. The order of the index values — that is, the index locations — does not matter.


Case 2: If the keys in di refer to df['col1'] values, then @DanAllan and @DSM show how to achieve this with replace:

import pandas as pd
import numpy as np

df = pd.DataFrame({'col1':['w', 10, 20],
                   'col2': ['a', 30, np.nan]},
                  index=[1,2,0])
print(df)
#   col1 col2
# 1    w    a
# 2   10   30
# 0   20  NaN

di = {10: "A", 20: "B"}

# The values 10 and 20 are replaced by 'A' and 'B'
df['col1'].replace(di, inplace=True)
print(df)

yields

  col1 col2
1    w    a
2    A   30
0    B  NaN

Note how in this case the keys in di were changed to match values in df['col1'].


Case 3: If the keys in di refer to index locations, then you could use

df['col1'].put(di.keys(), di.values())

since

df = pd.DataFrame({'col1':['w', 10, 20],
                   'col2': ['a', 30, np.nan]},
                  index=[1,2,0])
di = {0: "A", 2: "B"}

# The values at the 0 and 2 index locations are replaced by 'A' and 'B'
df['col1'].put(di.keys(), di.values())
print(df)

yields

  col1 col2
1    A    a
2   10   30
0    B  NaN

Here, the first and third rows were altered, because the keys in di are 0 and 2, which with Python’s 0-based indexing refer to the first and third locations.


回答 3

如果您有多个列要在数据数据帧中重新映射,则添加到此问题:

def remap(data,dict_labels):
    """
    This function take in a dictionnary of labels : dict_labels 
    and replace the values (previously labelencode) into the string.

    ex: dict_labels = {{'col1':{1:'A',2:'B'}}

    """
    for field,values in dict_labels.items():
        print("I am remapping %s"%field)
        data.replace({field:values},inplace=True)
    print("DONE")

    return data

希望它对某人有用。

干杯

Adding to this question if you ever have more than one columns to remap in a data dataframe:

def remap(data,dict_labels):
    """
    This function take in a dictionnary of labels : dict_labels 
    and replace the values (previously labelencode) into the string.

    ex: dict_labels = {{'col1':{1:'A',2:'B'}}

    """
    for field,values in dict_labels.items():
        print("I am remapping %s"%field)
        data.replace({field:values},inplace=True)
    print("DONE")

    return data

Hope it can be useful to someone.

Cheers


回答 4

DSM已经接受了答案,但是编码似乎并不适合所有人。这是与当前版本的熊猫一起使用的版本(截至8/2018为0.23.4):

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 2, 3, 1],
            'col2': ['negative', 'positive', 'neutral', 'neutral', 'positive']})

conversion_dict = {'negative': -1, 'neutral': 0, 'positive': 1}
df['converted_column'] = df['col2'].replace(conversion_dict)

print(df.head())

您会看到它看起来像:

   col1      col2  converted_column
0     1  negative                -1
1     2  positive                 1
2     2   neutral                 0
3     3   neutral                 0
4     1  positive                 1

pandas.DataFrame.replace的文档在这里

DSM has the accepted answer, but the coding doesn’t seem to work for everyone. Here is one that works with the current version of pandas (0.23.4 as of 8/2018):

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 2, 3, 1],
            'col2': ['negative', 'positive', 'neutral', 'neutral', 'positive']})

conversion_dict = {'negative': -1, 'neutral': 0, 'positive': 1}
df['converted_column'] = df['col2'].replace(conversion_dict)

print(df.head())

You’ll see it looks like:

   col1      col2  converted_column
0     1  negative                -1
1     2  positive                 1
2     2   neutral                 0
3     3   neutral                 0
4     1  positive                 1

The docs for pandas.DataFrame.replace are here.


回答 5

或做apply

df['col1'].apply(lambda x: {1: "A", 2: "B"}.get(x,x))

演示:

>>> df['col1']=df['col1'].apply(lambda x: {1: "A", 2: "B"}.get(x,x))
>>> df
  col1 col2
0    w    a
1    1    2
2    2  NaN
>>> 

Or do apply:

df['col1'].apply(lambda x: {1: "A", 2: "B"}.get(x,x))

Demo:

>>> df['col1']=df['col1'].apply(lambda x: {1: "A", 2: "B"}.get(x,x))
>>> df
  col1 col2
0    w    a
1    1    2
2    2  NaN
>>> 

回答 6

给定map的速度比替换(@JohnE的解决方案)要快,因此在打算将特定值映射到的非穷举映射时,您NaN需要格外小心。在这种情况下,正确的方法需要在您mask使用Series时执行.fillna,否则撤消到的映射NaN

import pandas as pd
import numpy as np

d = {'m': 'Male', 'f': 'Female', 'missing': np.NaN}
df = pd.DataFrame({'gender': ['m', 'f', 'missing', 'Male', 'U']})

keep_nan = [k for k,v in d.items() if pd.isnull(v)]
s = df['gender']

df['mapped'] = s.map(d).fillna(s.mask(s.isin(keep_nan)))

    gender  mapped
0        m    Male
1        f  Female
2  missing     NaN
3     Male    Male
4        U       U

Given map is faster than replace (@JohnE’s solution) you need to be careful with Non-Exhaustive mappings where you intend to map specific values to NaN. The proper method in this case requires that you mask the Series when you .fillna, else you undo the mapping to NaN.

import pandas as pd
import numpy as np

d = {'m': 'Male', 'f': 'Female', 'missing': np.NaN}
df = pd.DataFrame({'gender': ['m', 'f', 'missing', 'Male', 'U']})

keep_nan = [k for k,v in d.items() if pd.isnull(v)]
s = df['gender']

df['mapped'] = s.map(d).fillna(s.mask(s.isin(keep_nan)))

    gender  mapped
0        m    Male
1        f  Female
2  missing     NaN
3     Male    Male
4        U       U

回答 7

一个很好的完整解决方案,可以保留您的类标签的地图:

labels = features['col1'].unique()
labels_dict = dict(zip(labels, range(len(labels))))
features = features.replace({"col1": labels_dict})

这样,您可以随时从labels_dict引用原始类标签。

A nice complete solution that keeps a map of your class labels:

labels = features['col1'].unique()
labels_dict = dict(zip(labels, range(len(labels))))
features = features.replace({"col1": labels_dict})

This way, you can at any point refer to the original class label from labels_dict.


回答 8

作为对Nico Coallier(适用于多列)和U10-Forward(使用应用方式的方法)的建议的扩展,并将其概括为一个单一的行:

df.loc[:,['col1','col2']].transform(lambda x: x.map(lambda x: {1: "A", 2: "B"}.get(x,x))

.transform()每个列按顺序处理。.apply()与之相反,将DataFrame中聚集的列传递给该列。

因此,您可以应用Series方法map()

最后,由于U10,我发现了此行为,您可以在.get()表达式中使用整个Series。除非我误解了它的行为,并且它按顺序而不是按位处理序列。您在映射字典中未提及的值
.get(x,x)帐户,否则该.map()方法将被视为Nan

As an extension to what have been proposed by Nico Coallier (apply to multiple columns) and U10-Forward(using apply style of methods), and summarising it into a one-liner I propose:

df.loc[:,['col1','col2']].transform(lambda x: x.map(lambda x: {1: "A", 2: "B"}.get(x,x))

The .transform() processes each column as a series. Contrary to .apply()which passes the columns aggregated in a DataFrame.

Consequently you can apply the Series method map().

Finally, and I discovered this behaviour thanks to U10, you can use the whole Series in the .get() expression. Unless I have misunderstood its behaviour and it processes sequentially the series instead of bitwisely.
The .get(x,x)accounts for the values you did not mention in your mapping dictionary which would be considered as Nan otherwise by the .map() method


回答 9

一种更本地的熊猫方法是应用如下替换函数:

def multiple_replace(dict, text):
  # Create a regular expression  from the dictionary keys
  regex = re.compile("(%s)" % "|".join(map(re.escape, dict.keys())))

  # For each match, look-up corresponding value in dictionary
  return regex.sub(lambda mo: dict[mo.string[mo.start():mo.end()]], text) 

定义函数后,可以将其应用于数据框。

di = {1: "A", 2: "B"}
df['col1'] = df.apply(lambda row: multiple_replace(di, row['col1']), axis=1)

A more native pandas approach is to apply a replace function as below:

def multiple_replace(dict, text):
  # Create a regular expression  from the dictionary keys
  regex = re.compile("(%s)" % "|".join(map(re.escape, dict.keys())))

  # For each match, look-up corresponding value in dictionary
  return regex.sub(lambda mo: dict[mo.string[mo.start():mo.end()]], text) 

Once you defined the function, you can apply it to your dataframe.

di = {1: "A", 2: "B"}
df['col1'] = df.apply(lambda row: multiple_replace(di, row['col1']), axis=1)

从Python字典对象中提取键/值对的子集?

问题:从Python字典对象中提取键/值对的子集?

我有一个大的字典对象,其中有几个键值对(约16个),但我只对其中的3个感兴趣。什么是最好的方式(最短/最有效/最优雅)?

我所知道的是:

bigdict = {'a':1,'b':2,....,'z':26} 
subdict = {'l':bigdict['l'], 'm':bigdict['m'], 'n':bigdict['n']}

我相信还有比这更优雅的方法。有想法吗?

I have a big dictionary object that has several key value pairs (about 16), but I am only interested in 3 of them. What is the best way (shortest/efficient/most elegant) to achieve that?

The best I know is:

bigdict = {'a':1,'b':2,....,'z':26} 
subdict = {'l':bigdict['l'], 'm':bigdict['m'], 'n':bigdict['n']}

I am sure there is a more elegant way than this. Ideas?


回答 0

您可以尝试:

dict((k, bigdict[k]) for k in ('l', 'm', 'n'))

…或在 Python 3Python 2.7或更高版本(感谢FábioDiniz指出它也适用于2.7)

{k: bigdict[k] for k in ('l', 'm', 'n')}

更新:正如HåvardS指出的那样,我假设您知道密钥将在字典中- 如果您无法做出此假设,请参阅他的答案。另外,正如timbo在评论中指出的那样,如果您希望将缺少的bigdict键映射到None,则可以执行以下操作:

{k: bigdict.get(k, None) for k in ('l', 'm', 'n')}

如果您使用的是Python 3,并且希望原始字典中实际存在的新字典中的键,则可以使用事实来查看对象实现一些set操作:

{k: bigdict[k] for k in bigdict.keys() & {'l', 'm', 'n'}}

You could try:

dict((k, bigdict[k]) for k in ('l', 'm', 'n'))

… or in Python 3 Python versions 2.7 or later (thanks to Fábio Diniz for pointing that out that it works in 2.7 too):

{k: bigdict[k] for k in ('l', 'm', 'n')}

Update: As Håvard S points out, I’m assuming that you know the keys are going to be in the dictionary – see his answer if you aren’t able to make that assumption. Alternatively, as timbo points out in the comments, if you want a key that’s missing in bigdict to map to None, you can do:

{k: bigdict.get(k, None) for k in ('l', 'm', 'n')}

If you’re using Python 3, and you only want keys in the new dict that actually exist in the original one, you can use the fact to view objects implement some set operations:

{k: bigdict[k] for k in bigdict.keys() & {'l', 'm', 'n'}}

回答 1

短一点,至少:

wanted_keys = ['l', 'm', 'n'] # The keys you want
dict((k, bigdict[k]) for k in wanted_keys if k in bigdict)

A bit shorter, at least:

wanted_keys = ['l', 'm', 'n'] # The keys you want
dict((k, bigdict[k]) for k in wanted_keys if k in bigdict)

回答 2

interesting_keys = ('l', 'm', 'n')
subdict = {x: bigdict[x] for x in interesting_keys if x in bigdict}
interesting_keys = ('l', 'm', 'n')
subdict = {x: bigdict[x] for x in interesting_keys if x in bigdict}

回答 3

所有提到的方法的速度比较:

Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Jan 29 2016, 14:26:21) [MSC v.1500 64 bit (AMD64)] on win32
In[2]: import numpy.random as nprnd
keys = nprnd.randint(1000, size=10000)
bigdict = dict([(_, nprnd.rand()) for _ in range(1000)])

%timeit {key:bigdict[key] for key in keys}
%timeit dict((key, bigdict[key]) for key in keys)
%timeit dict(map(lambda k: (k, bigdict[k]), keys))
%timeit dict(filter(lambda i:i[0] in keys, bigdict.items()))
%timeit {key:value for key, value in bigdict.items() if key in keys}
100 loops, best of 3: 3.09 ms per loop
100 loops, best of 3: 3.72 ms per loop
100 loops, best of 3: 6.63 ms per loop
10 loops, best of 3: 20.3 ms per loop
100 loops, best of 3: 20.6 ms per loop

不出所料:词典理解是最好的选择。

A bit of speed comparison for all mentioned methods:

UPDATED on 2020.07.13 (thx to @user3780389): ONLY for keys from bigdict.

 IPython 5.5.0 -- An enhanced Interactive Python.
Python 2.7.18 (default, Aug  8 2019, 00:00:00) 
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux2
import numpy.random as nprnd
  ...: keys = nprnd.randint(100000, size=10000)
  ...: bigdict = dict([(_, nprnd.rand()) for _ in range(100000)])
  ...: 
  ...: %timeit {key:bigdict[key] for key in keys}
  ...: %timeit dict((key, bigdict[key]) for key in keys)
  ...: %timeit dict(map(lambda k: (k, bigdict[k]), keys))
  ...: %timeit {key:bigdict[key] for key in set(keys) & set(bigdict.keys())}
  ...: %timeit dict(filter(lambda i:i[0] in keys, bigdict.items()))
  ...: %timeit {key:value for key, value in bigdict.items() if key in keys}
100 loops, best of 3: 2.36 ms per loop
100 loops, best of 3: 2.87 ms per loop
100 loops, best of 3: 3.65 ms per loop
100 loops, best of 3: 7.14 ms per loop
1 loop, best of 3: 577 ms per loop
1 loop, best of 3: 563 ms per loop

As it was expected: dictionary comprehensions are the best option.


回答 4

该答案使用与所选答案相似的字典理解,但除了缺失项外,不会。

python 2版本:

{k:v for k, v in bigDict.iteritems() if k in ('l', 'm', 'n')}

python 3版本:

{k:v for k, v in bigDict.items() if k in ('l', 'm', 'n')}

This answer uses a dictionary comprehension similar to the selected answer, but will not except on a missing item.

python 2 version:

{k:v for k, v in bigDict.iteritems() if k in ('l', 'm', 'n')}

python 3 version:

{k:v for k, v in bigDict.items() if k in ('l', 'm', 'n')}

回答 5

也许:

subdict=dict([(x,bigdict[x]) for x in ['l', 'm', 'n']])

Python 3甚至支持以下内容:

subdict={a:bigdict[a] for a in ['l','m','n']}

请注意,您可以按照以下步骤检查字典中是否存在:

subdict=dict([(x,bigdict[x]) for x in ['l', 'm', 'n'] if x in bigdict])

分别 对于python 3

subdict={a:bigdict[a] for a in ['l','m','n'] if a in bigdict}

Maybe:

subdict=dict([(x,bigdict[x]) for x in ['l', 'm', 'n']])

Python 3 even supports the following:

subdict={a:bigdict[a] for a in ['l','m','n']}

Note that you can check for existence in dictionary as follows:

subdict=dict([(x,bigdict[x]) for x in ['l', 'm', 'n'] if x in bigdict])

resp. for python 3

subdict={a:bigdict[a] for a in ['l','m','n'] if a in bigdict}

回答 6

好的,这让我有些困扰,所以谢谢Jayesh提出这个问题。

上面的答案似乎是一个很好的解决方案,但是如果您在整个代码中都使用此解决方案,则包装功能恕我直言是有意义的。另外,这里有两种可能的用例:一种在用例中,您在乎所有关键字是否都在原始词典中。还有一个你不知道的地方 平等地对待这两者将是很好的。

因此,对于我的两分钱价值,我建议编写一个字典的子类,例如

class my_dict(dict):
    def subdict(self, keywords, fragile=False):
        d = {}
        for k in keywords:
            try:
                d[k] = self[k]
            except KeyError:
                if fragile:
                    raise
        return d

现在,您可以使用

orig_dict.subdict(keywords)

用法示例:

#
## our keywords are letters of the alphabet
keywords = 'abcdefghijklmnopqrstuvwxyz'
#
## our dictionary maps letters to their index
d = my_dict([(k,i) for i,k in enumerate(keywords)])
print('Original dictionary:\n%r\n\n' % (d,))
#
## constructing a sub-dictionary with good keywords
oddkeywords = keywords[::2]
subd = d.subdict(oddkeywords)
print('Dictionary from odd numbered keys:\n%r\n\n' % (subd,))
#
## constructing a sub-dictionary with mixture of good and bad keywords
somebadkeywords = keywords[1::2] + 'A'
try:
    subd2 = d.subdict(somebadkeywords)
    print("We shouldn't see this message")
except KeyError:
    print("subd2 construction fails:")
    print("\toriginal dictionary doesn't contain some keys\n\n")
#
## Trying again with fragile set to false
try:
    subd3 = d.subdict(somebadkeywords, fragile=False)
    print('Dictionary constructed using some bad keys:\n%r\n\n' % (subd3,))
except KeyError:
    print("We shouldn't see this message")

如果运行以上所有代码,则应该看到(类似)以下输出(对不起格式):

原始字典:
{‘a’:0,’c’:2,’b’:1,’e’:4,’d’:3,’g’:6,’f’:5,’i’: 8,’h’:7,’k’:10,’j’:9,’m’:12,’l’:11,’o’:14,’n’:13,’q’:16, ‘p’:15,’s’:18,’r’:17,’u’:20,’t’:19,’w’:22,’v’:21,’y’:24,’x ‘:23,’z’:25}

奇数键字典:
{‘a’:0,’c’:2,’e’:4,’g’:6,’i’:8,’k’:10,’m’:12,’ o’:14,’q’:16,’s’:18,’u’:20,’w’:22,’y’:24}

subd2构造失败:
原始词典不包含某些键

使用一些错误键构造的字典:
{‘b’:1,’d’:3,’f’:5,’h’:7,’j’:9,’l’:11,’n’:13, ‘p’:15,’r’:17,’t’:19,’v’:21,’x’:23,’z’:25}

Okay, this is something that has bothered me a few times, so thank you Jayesh for asking it.

The answers above seem like as good a solution as any, but if you are using this all over your code, it makes sense to wrap the functionality IMHO. Also, there are two possible use cases here: one where you care about whether all keywords are in the original dictionary. and one where you don’t. It would be nice to treat both equally.

So, for my two-penneth worth, I suggest writing a sub-class of dictionary, e.g.

class my_dict(dict):
    def subdict(self, keywords, fragile=False):
        d = {}
        for k in keywords:
            try:
                d[k] = self[k]
            except KeyError:
                if fragile:
                    raise
        return d

Now you can pull out a sub-dictionary with

orig_dict.subdict(keywords)

Usage examples:

#
## our keywords are letters of the alphabet
keywords = 'abcdefghijklmnopqrstuvwxyz'
#
## our dictionary maps letters to their index
d = my_dict([(k,i) for i,k in enumerate(keywords)])
print('Original dictionary:\n%r\n\n' % (d,))
#
## constructing a sub-dictionary with good keywords
oddkeywords = keywords[::2]
subd = d.subdict(oddkeywords)
print('Dictionary from odd numbered keys:\n%r\n\n' % (subd,))
#
## constructing a sub-dictionary with mixture of good and bad keywords
somebadkeywords = keywords[1::2] + 'A'
try:
    subd2 = d.subdict(somebadkeywords)
    print("We shouldn't see this message")
except KeyError:
    print("subd2 construction fails:")
    print("\toriginal dictionary doesn't contain some keys\n\n")
#
## Trying again with fragile set to false
try:
    subd3 = d.subdict(somebadkeywords, fragile=False)
    print('Dictionary constructed using some bad keys:\n%r\n\n' % (subd3,))
except KeyError:
    print("We shouldn't see this message")

If you run all the above code, you should see (something like) the following output (sorry for the formatting):

Original dictionary:
{‘a’: 0, ‘c’: 2, ‘b’: 1, ‘e’: 4, ‘d’: 3, ‘g’: 6, ‘f’: 5, ‘i’: 8, ‘h’: 7, ‘k’: 10, ‘j’: 9, ‘m’: 12, ‘l’: 11, ‘o’: 14, ‘n’: 13, ‘q’: 16, ‘p’: 15, ‘s’: 18, ‘r’: 17, ‘u’: 20, ‘t’: 19, ‘w’: 22, ‘v’: 21, ‘y’: 24, ‘x’: 23, ‘z’: 25}

Dictionary from odd numbered keys:
{‘a’: 0, ‘c’: 2, ‘e’: 4, ‘g’: 6, ‘i’: 8, ‘k’: 10, ‘m’: 12, ‘o’: 14, ‘q’: 16, ‘s’: 18, ‘u’: 20, ‘w’: 22, ‘y’: 24}

subd2 construction fails:
original dictionary doesn’t contain some keys

Dictionary constructed using some bad keys:
{‘b’: 1, ‘d’: 3, ‘f’: 5, ‘h’: 7, ‘j’: 9, ‘l’: 11, ‘n’: 13, ‘p’: 15, ‘r’: 17, ‘t’: 19, ‘v’: 21, ‘x’: 23, ‘z’: 25}


回答 7

您还可以使用map(无论如何,这都是非常有用的功能):

sd = dict(map(lambda k: (k, l.get(k, None)), l))

例:

large_dictionary = {'a1':123, 'a2':45, 'a3':344}
list_of_keys = ['a1', 'a3']
small_dictionary = dict(map(lambda key: (key, large_dictionary.get(key, None)), list_of_keys))

PS:我.get(key, None)从以前的答案中借来了:)

You can also use map (which is a very useful function to get to know anyway):

sd = dict(map(lambda k: (k, l.get(k, None)), l))

Example:

large_dictionary = {'a1':123, 'a2':45, 'a3':344}
list_of_keys = ['a1', 'a3']
small_dictionary = dict(map(lambda key: (key, large_dictionary.get(key, None)), list_of_keys))

PS: I borrowed the .get(key, None) from a previous answer :)


回答 8

还有一个(我更喜欢马克·朗伊尔的回答)

di = {'a':1,'b':2,'c':3}
req = ['a','c','w']
dict([i for i in di.iteritems() if i[0] in di and i[0] in req])

Yet another one (I prefer Mark Longair’s answer)

di = {'a':1,'b':2,'c':3}
req = ['a','c','w']
dict([i for i in di.iteritems() if i[0] in di and i[0] in req])

回答 9

from operator import itemgetter
from typing import List, Dict, Union


def subdict(d: Union[Dict, List], columns: List[str]) -> Union[Dict, List[Dict]]:
    """Return a dict or list of dicts with subset of 
    columns from the d argument.
    """
    getter = itemgetter(*columns)

    if isinstance(d, list):
        result = []
        for subset in map(getter, d):
            record = dict(zip(columns, subset))
            result.append(record)
        return result
    elif isinstance(d, dict):
        return dict(zip(columns, getter(d)))

    raise ValueError('Unsupported type for `d`')

使用实例

# pure dict

d = dict(a=1, b=2, c=3)
print(subdict(d, ['a', 'c']))

>>> In [5]: {'a': 1, 'c': 3}
# list of dicts

d = [
    dict(a=1, b=2, c=3),
    dict(a=2, b=4, c=6),
    dict(a=4, b=8, c=12),
]

print(subdict(d, ['a', 'c']))

>>> In [5]: [{'a': 1, 'c': 3}, {'a': 2, 'c': 6}, {'a': 4, 'c': 12}]

solution

from operator import itemgetter
from typing import List, Dict, Union


def subdict(d: Union[Dict, List], columns: List[str]) -> Union[Dict, List[Dict]]:
    """Return a dict or list of dicts with subset of 
    columns from the d argument.
    """
    getter = itemgetter(*columns)

    if isinstance(d, list):
        result = []
        for subset in map(getter, d):
            record = dict(zip(columns, subset))
            result.append(record)
        return result
    elif isinstance(d, dict):
        return dict(zip(columns, getter(d)))

    raise ValueError('Unsupported type for `d`')

examples of use

# pure dict

d = dict(a=1, b=2, c=3)
print(subdict(d, ['a', 'c']))

>>> In [5]: {'a': 1, 'c': 3}
# list of dicts

d = [
    dict(a=1, b=2, c=3),
    dict(a=2, b=4, c=6),
    dict(a=4, b=8, c=12),
]

print(subdict(d, ['a', 'c']))

>>> In [5]: [{'a': 1, 'c': 3}, {'a': 2, 'c': 6}, {'a': 4, 'c': 12}]

获取与字典中的最小值对应的键

问题:获取与字典中的最小值对应的键

如果我有Python字典,如何获得包含最小值的条目的键?

我正在考虑与该min()功能有关的事情…

给定输入:

{320:1, 321:0, 322:3}

它将返回321

If I have a Python dictionary, how do I get the key to the entry which contains the minimum value?

I was thinking about something to do with the min() function…

Given the input:

{320:1, 321:0, 322:3}

It would return 321.


回答 0

最好:min(d, key=d.get)-没有理由插入无用的lambda间接层或提取项目或密钥!

Best: min(d, key=d.get) — no reason to interpose a useless lambda indirection layer or extract items or keys!


回答 1

这实际上是提供OP所需解决方案的答案:

>>> d = {320:1, 321:0, 322:3}
>>> d.items()
[(320, 1), (321, 0), (322, 3)]
>>> # find the minimum by comparing the second element of each tuple
>>> min(d.items(), key=lambda x: x[1]) 
(321, 0)

d.iteritems()但是,对于较大的词典,使用将更为有效。

Here’s an answer that actually gives the solution the OP asked for:

>>> d = {320:1, 321:0, 322:3}
>>> d.items()
[(320, 1), (321, 0), (322, 3)]
>>> # find the minimum by comparing the second element of each tuple
>>> min(d.items(), key=lambda x: x[1]) 
(321, 0)

Using d.iteritems() will be more efficient for larger dictionaries, however.


回答 2

对于具有相等最小值的多个键,可以使用列表理解:

d = {320:1, 321:0, 322:3, 323:0}

minval = min(d.values())
res = [k for k, v in d.items() if v==minval]

[321, 323]

等效功能版本:

res = list(filter(lambda x: d[x]==minval, d))

For multiple keys which have equal lowest value, you can use a list comprehension:

d = {320:1, 321:0, 322:3, 323:0}

minval = min(d.values())
res = [k for k, v in d.items() if v==minval]

[321, 323]

An equivalent functional version:

res = list(filter(lambda x: d[x]==minval, d))

回答 3

min(d.items(), key=lambda x: x[1])[0]

min(d.items(), key=lambda x: x[1])[0]


回答 4

>>> d = {320:1, 321:0, 322:3}
>>> min(d, key=lambda k: d[k]) 
321
>>> d = {320:1, 321:0, 322:3}
>>> min(d, key=lambda k: d[k]) 
321

回答 5

对于您有多个最小键并希望保持简单的情况

def minimums(some_dict):
    positions = [] # output variable
    min_value = float("inf")
    for k, v in some_dict.items():
        if v == min_value:
            positions.append(k)
        if v < min_value:
            min_value = v
            positions = [] # output variable
            positions.append(k)

    return positions

minimums({'a':1, 'b':2, 'c':-1, 'd':0, 'e':-1})

['e', 'c']

For the case where you have multiple minimal keys and want to keep it simple

def minimums(some_dict):
    positions = [] # output variable
    min_value = float("inf")
    for k, v in some_dict.items():
        if v == min_value:
            positions.append(k)
        if v < min_value:
            min_value = v
            positions = [] # output variable
            positions.append(k)

    return positions

minimums({'a':1, 'b':2, 'c':-1, 'd':0, 'e':-1})

['e', 'c']

回答 6

如果您不确定是否没有多个最小值,我建议:

d = {320:1, 321:0, 322:3, 323:0}
print ', '.join(str(key) for min_value in (min(d.values()),) for key in d if d[key]==min_value)

"""Output:
321, 323
"""

If you are not sure that you have not multiple minimum values, I would suggest:

d = {320:1, 321:0, 322:3, 323:0}
print ', '.join(str(key) for min_value in (min(d.values()),) for key in d if d[key]==min_value)

"""Output:
321, 323
"""

回答 7

编辑:这是OP 关于最小密钥而不是最小答案的原始问题的答案。


您可以使用keys函数获取字典的键,并且正确使用min来查找该列表的最小值。

Edit: this is an answer to the OP’s original question about the minimal key, not the minimal answer.


You can get the keys of the dict using the keys function, and you’re right about using min to find the minimum of that list.


回答 8

解决具有相同最小值的多个键的另一种方法:

>>> dd = {320:1, 321:0, 322:3, 323:0}
>>>
>>> from itertools import groupby
>>> from operator import itemgetter
>>>
>>> print [v for k,v in groupby(sorted((v,k) for k,v in dd.iteritems()), key=itemgetter(0)).next()[1]]
[321, 323]

Another approach to addressing the issue of multiple keys with the same min value:

>>> dd = {320:1, 321:0, 322:3, 323:0}
>>>
>>> from itertools import groupby
>>> from operator import itemgetter
>>>
>>> print [v for k,v in groupby(sorted((v,k) for k,v in dd.iteritems()), key=itemgetter(0)).next()[1]]
[321, 323]

回答 9

使用min与迭代器(对于Python 3使用items代替iteritems); 代替lambda使用itemgetterfrom运算符,它比lambda更快。

from operator import itemgetter
min_key, _ = min(d.iteritems(), key=itemgetter(1))

Use min with an iterator (for python 3 use items instead of iteritems); instead of lambda use the itemgetter from operator, which is faster than lambda.

from operator import itemgetter
min_key, _ = min(d.iteritems(), key=itemgetter(1))

回答 10

d={}
d[320]=1
d[321]=0
d[322]=3
value = min(d.values())
for k in d.keys(): 
    if d[k] == value:
        print k,d[k]
d={}
d[320]=1
d[321]=0
d[322]=3
value = min(d.values())
for k in d.keys(): 
    if d[k] == value:
        print k,d[k]

回答 11

我比较了以下三个选项的执行情况:

    import random, datetime

myDict = {}
for i in range( 10000000 ):
    myDict[ i ] = random.randint( 0, 10000000 )



# OPTION 1

start = datetime.datetime.now()

sorted = []
for i in myDict:
    sorted.append( ( i, myDict[ i ] ) )
sorted.sort( key = lambda x: x[1] )
print( sorted[0][0] )

end = datetime.datetime.now()
print( end - start )



# OPTION 2

start = datetime.datetime.now()

myDict_values = list( myDict.values() )
myDict_keys = list( myDict.keys() )
min_value = min( myDict_values )
print( myDict_keys[ myDict_values.index( min_value ) ] )

end = datetime.datetime.now()
print( end - start )



# OPTION 3

start = datetime.datetime.now()

print( min( myDict, key=myDict.get ) )

end = datetime.datetime.now()
print( end - start )

样本输出:

#option 1
236230
0:00:14.136808

#option 2
236230
0:00:00.458026

#option 3
236230
0:00:00.824048

I compared how the following three options perform:

    import random, datetime

myDict = {}
for i in range( 10000000 ):
    myDict[ i ] = random.randint( 0, 10000000 )



# OPTION 1

start = datetime.datetime.now()

sorted = []
for i in myDict:
    sorted.append( ( i, myDict[ i ] ) )
sorted.sort( key = lambda x: x[1] )
print( sorted[0][0] )

end = datetime.datetime.now()
print( end - start )



# OPTION 2

start = datetime.datetime.now()

myDict_values = list( myDict.values() )
myDict_keys = list( myDict.keys() )
min_value = min( myDict_values )
print( myDict_keys[ myDict_values.index( min_value ) ] )

end = datetime.datetime.now()
print( end - start )



# OPTION 3

start = datetime.datetime.now()

print( min( myDict, key=myDict.get ) )

end = datetime.datetime.now()
print( end - start )

Sample output:

#option 1
236230
0:00:14.136808

#option 2
236230
0:00:00.458026

#option 3
236230
0:00:00.824048

回答 12

要创建可排序的类,您必须重写6个特殊函数,以便min()函数可以调用它

这些方法的__lt__ , __le__, __gt__, __ge__, __eq__ , __ne__顺序是小于,小于或等于,大于,大于或等于,等于,不等于。例如,您应该实现__lt__如下:

def __lt__(self, other):
  return self.comparable_value < other.comparable_value

那么您可以使用min函数,如下所示:

minValue = min(yourList, key=(lambda k: yourList[k]))

这对我有用。

to create an orderable class you have to override 6 special functions, so that it would be called by the min() function

these methods are__lt__ , __le__, __gt__, __ge__, __eq__ , __ne__ in order they are less than, less than or equal, greater than, greater than or equal, equal, not equal. for example you should implement __lt__ as follows:

def __lt__(self, other):
  return self.comparable_value < other.comparable_value

then you can use the min function as follows:

minValue = min(yourList, key=(lambda k: yourList[k]))

this worked for me.


回答 13

min(zip(d.values(), d.keys()))[1]

使用zip函数创建包含值和键的元组的迭代器。然后用min函数包装它,min函数根据第一个键取最小值。这将返回一个包含(值,键)对的元组。索引[1]用于获取对应的密钥

min(zip(d.values(), d.keys()))[1]

Use the zip function to create an iterator of tuples containing values and keys. Then wrap it with a min function which takes the minimum based on the first key. This returns a tuple containing (value, key) pair. The index of [1] is used to get the corresponding key


回答 14

# python 
d={320:1, 321:0, 322:3}
reduce(lambda x,y: x if d[x]<=d[y] else y, d.iterkeys())
  321
# python 
d={320:1, 321:0, 322:3}
reduce(lambda x,y: x if d[x]<=d[y] else y, d.iterkeys())
  321

回答 15

这是你想要的?

d = dict()
d[15.0]='fifteen'
d[14.0]='fourteen'
d[14.5]='fourteenandhalf'

print d[min(d.keys())]

打印“十四”

Is this what you are looking for?

d = dict()
d[15.0]='fifteen'
d[14.0]='fourteen'
d[14.5]='fourteenandhalf'

print d[min(d.keys())]

Prints ‘fourteen’


访问像属性一样的字典键?

问题:访问像属性一样的字典键?

我发现访问dict键obj.foo而不是更为方便obj['foo'],因此我编写了以下代码段:

class AttributeDict(dict):
    def __getattr__(self, attr):
        return self[attr]
    def __setattr__(self, attr, value):
        self[attr] = value

但是,我认为一定有某些原因导致Python无法立即提供此功能。以这种方式访问​​字典键的注意事项和陷阱是什么?

I find it more convenient to access dict keys as obj.foo instead of obj['foo'], so I wrote this snippet:

class AttributeDict(dict):
    def __getattr__(self, attr):
        return self[attr]
    def __setattr__(self, attr, value):
        self[attr] = value

However, I assume that there must be some reason that Python doesn’t provide this functionality out of the box. What would be the caveats and pitfalls of accessing dict keys in this manner?


回答 0

最好的方法是:

class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self

一些优点:

  • 它实际上有效!
  • 没有字典类方法被遮盖(例如,.keys()工作就很好。除非-当然-您为其分配了一些值,请参见下文)
  • 属性和项目始终保持同步
  • 尝试访问不存在的键作为属性正确引发,AttributeError而不是KeyError

缺点:

  • 如果这样的方法被传入的数据覆盖,它们.keys()无法正常工作
  • 在Python <2.7.4 / Python3 <3.2.3中导致内存泄漏
  • 皮林特(Pylint)E1123(unexpected-keyword-arg)E1103(maybe-no-member)
  • 对于初学者来说,这似乎是纯魔术。

简短说明

  • 所有python对象在内部将其属性存储在名为的字典中__dict__
  • 不需要内部字典__dict__必须是“仅是简单的字典”,因此我们可以将dict()内部字典的任何子类分配给它。
  • 在我们的例子中,我们只需分配要AttrDict()实例化的实例(就像在中一样__init__)。
  • 通过调用super()__init__()方法,我们可以确保它(已经)的行为与字典完全相同,因为该函数将调用所有字典实例化代码。

Python无法立即提供此功能的原因之一

如“ cons”列表中所述,这将存储键的命名空间(可能来自任意和/或不受信任的数据!)与内置dict方法属性的命名空间结合在一起。例如:

d = AttrDict()
d.update({'items':["jacket", "necktie", "trousers"]})
for k, v in d.items():    # TypeError: 'list' object is not callable
    print "Never reached!"

The best way to do this is:

class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self

Some pros:

  • It actually works!
  • No dictionary class methods are shadowed (e.g. .keys() work just fine. Unless – of course – you assign some value to them, see below)
  • Attributes and items are always in sync
  • Trying to access non-existent key as an attribute correctly raises AttributeError instead of KeyError
  • Supports [Tab] autocompletion (e.g. in jupyter & ipython)

Cons:

  • Methods like .keys() will not work just fine if they get overwritten by incoming data
  • Each AttrDict instance actually stores 2 dictionaries, one inherited and another one in __dict__
  • Causes a memory leak in Python < 2.7.4 / Python3 < 3.2.3
  • Pylint goes bananas with E1123(unexpected-keyword-arg) and E1103(maybe-no-member)
  • For the uninitiated it seems like pure magic.

A short explanation on how this works

  • All python objects internally store their attributes in a dictionary that is named __dict__.
  • There is no requirement that the internal dictionary __dict__ would need to be “just a plain dict”, so we can assign any subclass of dict() to the internal dictionary.
  • In our case we simply assign the AttrDict() instance we are instantiating (as we are in __init__).
  • By calling super()‘s __init__() method we made sure that it (already) behaves exactly like a dictionary, since that function calls all the dictionary instantiation code.

One reason why Python doesn’t provide this functionality out of the box

As noted in the “cons” list, this combines the namespace of stored keys (which may come from arbitrary and/or untrusted data!) with the namespace of builtin dict method attributes. For example:

d = AttrDict()
d.update({'items':["jacket", "necktie", "trousers"]})
for k, v in d.items():    # TypeError: 'list' object is not callable
    print "Never reached!"

Update – 2020

Since this question was asked almost ten years ago, quite a bit has changed in Python itself since then.

While this approach is still valid for some cases, e.g. legacy projects stuck to older versions of Python and cases where you really need to handle dictionaries with very dynamic string keys – I think that in general the dataclasses introduced in Python 3.7 are the obvious/correct solution to vast majority of the use cases of AttrDict.


回答 1

如果使用数组表示法,则可以将所有合法字符串字符作为键的一部分。例如,obj['!#$%^&*()_']

You can have all legal string characters as part of the key if you use array notation. For example, obj['!#$%^&*()_']


回答 2

另一个SO问题中,有一个很好的实现示例,可以简化您的现有代码。怎么样:

class AttributeDict(dict): 
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__

更简洁,不留任何余地额外的克鲁夫特进入你__getattr____setattr__功能的未来。

From This other SO question there’s a great implementation example that simplifies your existing code. How about:

class AttributeDict(dict):
    __slots__ = () 
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__

Much more concise and doesn’t leave any room for extra cruft getting into your __getattr__ and __setattr__ functions in the future.


回答 3

我在哪里回答所问的问题

为什么Python不提供开箱即用的功能?

我怀疑这与PythonZen有关:“应该有一种-最好只有一种-显而易见的方法。” 这将创建两种显而易见的方式来访问字典中的值:obj['key']obj.key

注意事项和陷阱

这些可能包括代码不够清晰和混乱。也就是说,以下内容可能会使以后打算维护您代码的其他人感到困惑,甚至如果您暂时不使用它,也可能会使您感到困惑。再次,来自禅宗:“可读性很重要!”

>>> KEY = 'spam'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1

如果d被实例化 KEY被定义或被 d[KEY]分配为远离d.spam使用的地方,则它很容易导致对正在执行的操作感到困惑,因为这不是常用的习惯用法。我知道这可能会使我感到困惑。

另外,如果您KEY按如下方式更改值(但未更改d.spam),则您将获得:

>>> KEY = 'foo'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: 'C' object has no attribute 'spam'

海事组织,不值得付出努力。

其他项目

正如其他人指出的那样,您可以使用任何可哈希对象(不仅仅是字符串)作为dict键。例如,

>>> d = {(2, 3): True,}
>>> assert d[(2, 3)] is True
>>> 

是合法的,但是

>>> C = type('C', (object,), {(2, 3): True})
>>> d = C()
>>> assert d.(2, 3) is True
  File "<stdin>", line 1
  d.(2, 3)
    ^
SyntaxError: invalid syntax
>>> getattr(d, (2, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: getattr(): attribute name must be string
>>> 

不是。这使您可以访问字典键的所有可打印字符或其他可哈希对象的范围,而访问对象属性时则没有这些范围。这使诸如缓存对象元类之类的魔术成为可能,例如Python Cookbook(第9章)中的配方。

我在其中编辑

我更喜欢的美学spam.eggsspam['eggs'](我认为它看起来更清洁),我真的开始渴望这个功能时,我遇到了namedtuple。但是能够执行以下操作的便利性胜过它。

>>> KEYS = 'spam eggs ham'
>>> VALS = [1, 2, 3]
>>> d = {k: v for k, v in zip(KEYS.split(' '), VALS)}
>>> assert d == {'spam': 1, 'eggs': 2, 'ham': 3}
>>>

这是一个简单的示例,但是我经常发现自己在不同情况下使用dict而不是使用obj.key符号(即,当我需要从XML文件读取首选项时)。在其他情况下,出于美学原因,我倾向于实例化动态类并在其上添加一些属性,我将继续使用dict来保持一致性,以增强可读性。

我确信OP早就解决了这个问题,使他满意,但是如果他仍然想要此功能,那么我建议他从pypi下载提供该功能的软件包之一:

  • 是我更熟悉的一种。的子类dict,因此您具有所有功能。
  • AttrDict看起来也很不错,但是我并不熟悉它,也没有像 Bunch那样详细地浏览源代码。
  • Addict会得到积极维护,并提供类似attr的访问权限。
  • 如Rotareti的评论所述,Bunch已过时,但有一个名为Munch的活动叉子。

但是,为了提高代码的可读性,我强烈建议他不要混合使用自己的符号样式。如果他喜欢这种表示法,那么他应该简单地实例化一个动态对象,为其添加所需的属性,然后将其命名为day:

>>> C = type('C', (object,), {})
>>> d = C()
>>> d.spam = 1
>>> d.eggs = 2
>>> d.ham = 3
>>> assert d.__dict__ == {'spam': 1, 'eggs': 2, 'ham': 3}


我在其中更新,以在评论中回答后续问题

在下面的评论中,Elmo问:

如果您想更深入一点怎么办?(指type(…))

尽管我从未使用过这种用例(再次dict,为了保持一致性,我倾向于使用nested ),但是以下代码可以工作:

>>> C = type('C', (object,), {})
>>> d = C()
>>> for x in 'spam eggs ham'.split():
...     setattr(d, x, C())
...     i = 1
...     for y in 'one two three'.split():
...         setattr(getattr(d, x), y, i)
...         i += 1
...
>>> assert d.spam.__dict__ == {'one': 1, 'two': 2, 'three': 3}

Wherein I Answer the Question That Was Asked

Why doesn’t Python offer it out of the box?

I suspect that it has to do with the Zen of Python: “There should be one — and preferably only one — obvious way to do it.” This would create two obvious ways to access values from dictionaries: obj['key'] and obj.key.

Caveats and Pitfalls

These include possible lack of clarity and confusion in the code. i.e., the following could be confusing to someone else who is going in to maintain your code at a later date, or even to you, if you’re not going back into it for awhile. Again, from Zen: “Readability counts!”

>>> KEY = 'spam'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1

If d is instantiated or KEY is defined or d[KEY] is assigned far away from where d.spam is being used, it can easily lead to confusion about what’s being done, since this isn’t a commonly-used idiom. I know it would have the potential to confuse me.

Additonally, if you change the value of KEY as follows (but miss changing d.spam), you now get:

>>> KEY = 'foo'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: 'C' object has no attribute 'spam'

IMO, not worth the effort.

Other Items

As others have noted, you can use any hashable object (not just a string) as a dict key. For example,

>>> d = {(2, 3): True,}
>>> assert d[(2, 3)] is True
>>> 

is legal, but

>>> C = type('C', (object,), {(2, 3): True})
>>> d = C()
>>> assert d.(2, 3) is True
  File "<stdin>", line 1
  d.(2, 3)
    ^
SyntaxError: invalid syntax
>>> getattr(d, (2, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: getattr(): attribute name must be string
>>> 

is not. This gives you access to the entire range of printable characters or other hashable objects for your dictionary keys, which you do not have when accessing an object attribute. This makes possible such magic as a cached object metaclass, like the recipe from the Python Cookbook (Ch. 9).

Wherein I Editorialize

I prefer the aesthetics of spam.eggs over spam['eggs'] (I think it looks cleaner), and I really started craving this functionality when I met the namedtuple. But the convenience of being able to do the following trumps it.

>>> KEYS = 'spam eggs ham'
>>> VALS = [1, 2, 3]
>>> d = {k: v for k, v in zip(KEYS.split(' '), VALS)}
>>> assert d == {'spam': 1, 'eggs': 2, 'ham': 3}
>>>

This is a simple example, but I frequently find myself using dicts in different situations than I’d use obj.key notation (i.e., when I need to read prefs in from an XML file). In other cases, where I’m tempted to instantiate a dynamic class and slap some attributes on it for aesthetic reasons, I continue to use a dict for consistency in order to enhance readability.

I’m sure the OP has long-since resolved this to his satisfaction, but if he still wants this functionality, then I suggest he download one of the packages from pypi that provides it:

  • Bunch is the one I’m more familiar with. Subclass of dict, so you have all that functionality.
  • AttrDict also looks like it’s also pretty good, but I’m not as familiar with it and haven’t looked through the source in as much detail as I have Bunch.
  • Addict Is actively maintained and provides attr-like access and more.
  • As noted in the comments by Rotareti, Bunch has been deprecated, but there is an active fork called Munch.

However, in order to improve readability of his code I strongly recommend that he not mix his notation styles. If he prefers this notation then he should simply instantiate a dynamic object, add his desired attributes to it, and call it a day:

>>> C = type('C', (object,), {})
>>> d = C()
>>> d.spam = 1
>>> d.eggs = 2
>>> d.ham = 3
>>> assert d.__dict__ == {'spam': 1, 'eggs': 2, 'ham': 3}


Wherein I Update, to Answer a Follow-Up Question in the Comments

In the comments (below), Elmo asks:

What if you want to go one deeper? ( referring to type(…) )

While I’ve never used this use case (again, I tend to use nested dict, for consistency), the following code works:

>>> C = type('C', (object,), {})
>>> d = C()
>>> for x in 'spam eggs ham'.split():
...     setattr(d, x, C())
...     i = 1
...     for y in 'one two three'.split():
...         setattr(getattr(d, x), y, i)
...         i += 1
...
>>> assert d.spam.__dict__ == {'one': 1, 'two': 2, 'three': 3}

回答 4

注意:由于某些原因,像这样的类似乎破坏了多处理程序包。在找到该SO之前,我花了一段时间来解决这个错误: 在python多处理中查找异常

Caveat emptor: For some reasons classes like this seem to break the multiprocessing package. I just struggled with this bug for awhile before finding this SO: Finding exception in python multiprocessing


回答 5

您可以从标准库中提取一个方便的容器类:

from argparse import Namespace

避免必须复制代码位。没有标准的字典访问权限,但是如果您真的想要的话,很容易找回。argparse中的代码很简单,

class Namespace(_AttributeHolder):
    """Simple object for storing attributes.

    Implements equality by attribute names and values, and provides a simple
    string representation.
    """

    def __init__(self, **kwargs):
        for name in kwargs:
            setattr(self, name, kwargs[name])

    __hash__ = None

    def __eq__(self, other):
        return vars(self) == vars(other)

    def __ne__(self, other):
        return not (self == other)

    def __contains__(self, key):
        return key in self.__dict__

You can pull a convenient container class from the standard library:

from argparse import Namespace

to avoid having to copy around code bits. No standard dictionary access, but easy to get one back if you really want it. The code in argparse is simple,

class Namespace(_AttributeHolder):
    """Simple object for storing attributes.

    Implements equality by attribute names and values, and provides a simple
    string representation.
    """

    def __init__(self, **kwargs):
        for name in kwargs:
            setattr(self, name, kwargs[name])

    __hash__ = None

    def __eq__(self, other):
        return vars(self) == vars(other)

    def __ne__(self, other):
        return not (self == other)

    def __contains__(self, key):
        return key in self.__dict__

回答 6

如果您想要一个作为方法的键,例如__eq__或,该__getattr__怎么办?

而且,您将无法输入不以字母开头的条目,因此无法0343853用作键。

如果不想使用字符串怎么办?

What if you wanted a key which was a method, such as __eq__ or __getattr__?

And you wouldn’t be able to have an entry that didn’t start with a letter, so using 0343853 as a key is out.

And what if you didn’t want to use a string?


回答 7

元组可以使用dict键。您将如何访问构造中的元组?

同样,namedtuple是一种方便的结构,可以通过属性访问提供值。

tuples can be used dict keys. How would you access tuple in your construct?

Also, namedtuple is a convenient structure which can provide values via the attribute access.


回答 8

怎么样Prodict我写来统治它们的小Python类:)

另外,您将获得自动代码完成递归对象实例化自动类型转换

您可以完全按照要求进行:

p = Prodict()
p.foo = 1
p.bar = "baz"

示例1:类型提示

class Country(Prodict):
    name: str
    population: int

turkey = Country()
turkey.name = 'Turkey'
turkey.population = 79814871

自动编码完成

示例2:自动类型转换

germany = Country(name='Germany', population='82175700', flag_colors=['black', 'red', 'yellow'])

print(germany.population)  # 82175700
print(type(germany.population))  # <class 'int'>

print(germany.flag_colors)  # ['black', 'red', 'yellow']
print(type(germany.flag_colors))  # <class 'list'>

How about Prodict, the little Python class that I wrote to rule them all:)

Plus, you get auto code completion, recursive object instantiations and auto type conversion!

You can do exactly what you asked for:

p = Prodict()
p.foo = 1
p.bar = "baz"

Example 1: Type hinting

class Country(Prodict):
    name: str
    population: int

turkey = Country()
turkey.name = 'Turkey'
turkey.population = 79814871

auto code complete

Example 2: Auto type conversion

germany = Country(name='Germany', population='82175700', flag_colors=['black', 'red', 'yellow'])

print(germany.population)  # 82175700
print(type(germany.population))  # <class 'int'>

print(germany.flag_colors)  # ['black', 'red', 'yellow']
print(type(germany.flag_colors))  # <class 'list'>

回答 9

一般而言,它不起作用。并非所有有效的dict键都具有可寻址属性(“键”)。因此,您需要小心。

Python对象基本上都是字典。因此,我怀疑会有很多性能或其他损失。

It doesn’t work in generality. Not all valid dict keys make addressable attributes (“the key”). So, you’ll need to be careful.

Python objects are all basically dictionaries. So I doubt there is much performance or other penalty.


回答 10

这并没有解决最初的问题,但是对于像我这样在寻找提供此功能的库时到此结束的人很有用。

冰火它是为这个伟大的lib:https://github.com/mewwts/addict需要在前面的答案中提到的许多问题护理。

来自文档的示例:

body = {
    'query': {
        'filtered': {
            'query': {
                'match': {'description': 'addictive'}
            },
            'filter': {
                'term': {'created_by': 'Mats'}
            }
        }
    }
}

与瘾君子:

from addict import Dict
body = Dict()
body.query.filtered.query.match.description = 'addictive'
body.query.filtered.filter.term.created_by = 'Mats'

This doesn’t address the original question, but should be useful for people that, like me, end up here when looking for a lib that provides this functionality.

Addict it’s a great lib for this: https://github.com/mewwts/addict it takes care of many concerns mentioned in previous answers.

An example from the docs:

body = {
    'query': {
        'filtered': {
            'query': {
                'match': {'description': 'addictive'}
            },
            'filter': {
                'term': {'created_by': 'Mats'}
            }
        }
    }
}

With addict:

from addict import Dict
body = Dict()
body.query.filtered.query.match.description = 'addictive'
body.query.filtered.filter.term.created_by = 'Mats'

回答 11

我发现自己想知道python生态系统中“ dict keys as attr”的当前状态是什么。正如一些评论者所指出的那样,这可能不是您想要从头开始的事情,因为存在一些陷阱和脚枪,其中一些非常隐蔽。另外,我不建议将其Namespace用作基类,因为我一直走这条路,这并不漂亮。

幸运的是,有几个提供此功能的开源软件包,准备点安装!不幸的是,有几个软件包。简介,截至2019年12月。

竞争者(最近提交给master | #commits | #contribs | coverage%):

  • 瘾君子 (2019-04-28 | 217 | 22 | 100%)
  • 蒙克(2019年12月16日| 160 | 17 |?%)
  • easydict(2018-10-18 | 51 | 6 |?%)
  • attrdict(2019-02-01 | 108 | 5 | 100%)
  • prodict (2019年10月1日| 65 | 1 |?%)

不再维护或维护不足:

  • treedict(2014-03-28 | 95 | 2 |?%)
  • 一堆(2012-03-12 | 20 | 2 |?%)
  • NeoBunch

我目前建议吃午饭上瘾。他们拥有最多的提交,贡献者和发布,建议为每个构建一个健康的开源代码库。它们具有最干净的readme.md,100%的覆盖率和良好的测试集。

除了滚动我自己的dict / attr代码并浪费大量时间,因为我不知道所有这些选择之外,我在这场比赛中没有一只狗(到目前为止!)。将来我可能会为瘾君子/饥饿做贡献,因为我宁愿看到一个坚固的包装,也不愿看到一堆零散的包装。如果您喜欢它们,请贡献力量!特别是,看起来像munch可以使用codecov徽章,而上瘾者可以使用python版本的徽章。

瘾君子的优点:

  • 递归初始化(foo.abc =’bar’),类似dict的参数会上瘾。

瘾君子的缺点:

  • 阴影,typing.Dict如果你from addict import Dict
  • 没有密钥检查。由于允许递归初始化,因此如果您拼写错误的键,则只需创建一个新属性,而不是KeyError(感谢AljoSt)

嚼劲:

  • 独特的命名
  • JSON和YAML的内置ser / de函数

缺点:

  • 没有递归初始化/一次只能初始化一个attr

我在其中编辑

许多月前,当我使用文本编辑器在只有我自己或另一个开发人员的项目上编写python时,我喜欢dict-attrs的样式,即只需声明即可插入键foo.bar.spam = eggs。现在,我在团队中工作,并使用IDE进行所有操作,而我通常已经从这类数据结构和动态类型中移开了,而转向了静态分析,功能技术和类型提示。我已经开始尝试这种技术,并使用我自己设计的对象将Pstruct子类化:

class  BasePstruct(dict):
    def __getattr__(self, name):
        if name in self.__slots__:
            return self[name]
        return self.__getattribute__(name)

    def __setattr__(self, key, value):
        if key in self.__slots__:
            self[key] = value
            return
        if key in type(self).__dict__:
            self[key] = value
            return
        raise AttributeError(
            "type object '{}' has no attribute '{}'".format(type(self).__name__, key))


class FooPstruct(BasePstruct):
    __slots__ = ['foo', 'bar']

这为您提供了一个对象,其行为仍然像dict,但还使您可以更严格的方式访问诸如属性之类的键。这样做的好处是我(或代码的不幸使用者)确切知道哪些字段可以存在和不存在,并且IDE可以自动完成字段。子类化香草也dict意味着json序列化很容易。我认为这种想法的下一个发展将是一个自定义的protobuf生成器,它会发出这些接口,并且一个不错的替代方法是,您几乎可以免费通过gRPC获得跨语言的数据结构和IPC。

如果您决定采用attr-dict,则必须记录下期望的字段,以确保您自己(以及队友)的理智。

随时编辑/更新此帖子以保持最新!

I found myself wondering what the current state of “dict keys as attr” in the python ecosystem. As several commenters have pointed out, this is probably not something you want to roll your own from scratch, as there are several pitfalls and footguns, some of them very subtle. Also, I would not recommend using Namespace as a base class, I’ve been down that road, it isn’t pretty.

Fortunately, there are several open source packages providing this functionality, ready to pip install! Unfortunately, there are several packages. Here is a synopsis, as of Dec 2019.

Contenders (most recent commit to master|#commits|#contribs|coverage%):

  • addict (2019-04-28 | 217 | 22 | 100%)
  • munch (2019-12-16 | 160 | 17 | ?%)
  • easydict (2018-10-18 | 51 | 6 | ?%)
  • attrdict (2019-02-01 | 108 | 5 | 100%)
  • prodict (2019-10-01 | 65 | 1 | ?%)

No longer maintained or under-maintained:

  • treedict (2014-03-28 | 95 | 2 | ?%)
  • bunch (2012-03-12 | 20 | 2 | ?%)
  • NeoBunch

I currently recommend munch or addict. They have the most commits, contributors, and releases, suggesting a healthy open-source codebase for each. They have the cleanest-looking readme.md, 100% coverage, and good looking set of tests.

I do not have a dog in this race (for now!), besides having rolled my own dict/attr code and wasted a ton of time because I was not aware of all these options :). I may contribute to addict/munch in the future as I would rather see one solid package than a bunch of fragmented ones. If you like them, contribute! In particular, looks like munch could use a codecov badge and addict could use a python version badge.

addict pros:

  • recursive initialization (foo.a.b.c = ‘bar’), dict-like arguments become addict.Dict

addict cons:

  • shadows typing.Dict if you from addict import Dict
  • No key checking. Due to allowing recursive init, if you misspell a key, you just create a new attribute, rather than KeyError (thanks AljoSt)

munch pros:

  • unique naming
  • built-in ser/de functions for JSON and YAML

munch cons:

  • no recursive init / only can init one attr at a time

Wherein I Editorialize

Many moons ago, when I used text editors to write python, on projects with only myself or one other dev, I liked the style of dict-attrs, the ability to insert keys by just declaring foo.bar.spam = eggs. Now I work on teams, and use an IDE for everything, and I have drifted away from these sorts of data structures and dynamic typing in general, in favor of static analysis, functional techniques and type hints. I’ve started experimenting with this technique, subclassing Pstruct with objects of my own design:

class  BasePstruct(dict):
    def __getattr__(self, name):
        if name in self.__slots__:
            return self[name]
        return self.__getattribute__(name)

    def __setattr__(self, key, value):
        if key in self.__slots__:
            self[key] = value
            return
        if key in type(self).__dict__:
            self[key] = value
            return
        raise AttributeError(
            "type object '{}' has no attribute '{}'".format(type(self).__name__, key))


class FooPstruct(BasePstruct):
    __slots__ = ['foo', 'bar']

This gives you an object which still behaves like a dict, but also lets you access keys like attributes, in a much more rigid fashion. The advantage here is I (or the hapless consumers of your code) know exactly what fields can and can’t exist, and the IDE can autocomplete fields. Also subclassing vanilla dict means json serialization is easy. I think the next evolution in this idea would be a custom protobuf generator which emits these interfaces, and a nice knock-on is you get cross-language data structures and IPC via gRPC for nearly free.

If you do decide to go with attr-dicts, it’s essential to document what fields are expected, for your own (and your teammates’) sanity.

Feel free to edit/update this post to keep it recent!


回答 12

这是使用内置的不可变记录的简短示例collections.namedtuple

def record(name, d):
    return namedtuple(name, d.keys())(**d)

和用法示例:

rec = record('Model', {
    'train_op': train_op,
    'loss': loss,
})

print rec.loss(..)

Here’s a short example of immutable records using built-in collections.namedtuple:

def record(name, d):
    return namedtuple(name, d.keys())(**d)

and a usage example:

rec = record('Model', {
    'train_op': train_op,
    'loss': loss,
})

print rec.loss(..)

回答 13

只是为了增加答案的多样性,sci-kit learning已将其实现为Bunch

class Bunch(dict):                                                              
    """ Scikit Learn's container object                                         

    Dictionary-like object that exposes its keys as attributes.                 
    >>> b = Bunch(a=1, b=2)                                                     
    >>> b['b']                                                                  
    2                                                                           
    >>> b.b                                                                     
    2                                                                           
    >>> b.c = 6                                                                 
    >>> b['c']                                                                  
    6                                                                           
    """                                                                         

    def __init__(self, **kwargs):                                               
        super(Bunch, self).__init__(kwargs)                                     

    def __setattr__(self, key, value):                                          
        self[key] = value                                                       

    def __dir__(self):                                                          
        return self.keys()                                                      

    def __getattr__(self, key):                                                 
        try:                                                                    
            return self[key]                                                    
        except KeyError:                                                        
            raise AttributeError(key)                                           

    def __setstate__(self, state):                                              
        pass                       

您所需要做的就是获取setattrgetattr方法- getattr检查dict键,然后继续检查实际属性。这setstaet是用于酸洗/解开“束”的修复程序-如果您感兴趣,请查看https://github.com/scikit-learn/scikit-learn/issues/6196

Just to add some variety to the answer, sci-kit learn has this implemented as a Bunch:

class Bunch(dict):                                                              
    """ Scikit Learn's container object                                         

    Dictionary-like object that exposes its keys as attributes.                 
    >>> b = Bunch(a=1, b=2)                                                     
    >>> b['b']                                                                  
    2                                                                           
    >>> b.b                                                                     
    2                                                                           
    >>> b.c = 6                                                                 
    >>> b['c']                                                                  
    6                                                                           
    """                                                                         

    def __init__(self, **kwargs):                                               
        super(Bunch, self).__init__(kwargs)                                     

    def __setattr__(self, key, value):                                          
        self[key] = value                                                       

    def __dir__(self):                                                          
        return self.keys()                                                      

    def __getattr__(self, key):                                                 
        try:                                                                    
            return self[key]                                                    
        except KeyError:                                                        
            raise AttributeError(key)                                           

    def __setstate__(self, state):                                              
        pass                       

All you need is to get the setattr and getattr methods – the getattr checks for dict keys and the moves on to checking for actual attributes. The setstaet is a fix for fix for pickling/unpickling “bunches” – if inerested check https://github.com/scikit-learn/scikit-learn/issues/6196


回答 14

无需编写自己的 setattr()和getattr()即可。

类对象的优势可能在类定义和继承中发挥了作用。

No need to write your own as setattr() and getattr() already exist.

The advantage of class objects probably comes into play in class definition and inheritance.


回答 15

我是根据该线程的输入创建的。不过,我需要使用odict,因此必须重写get和set attr。我认为这应适用于大多数特殊用途。

用法如下所示:

# Create an ordered dict normally...
>>> od = OrderedAttrDict()
>>> od["a"] = 1
>>> od["b"] = 2
>>> od
OrderedAttrDict([('a', 1), ('b', 2)])

# Get and set data using attribute access...
>>> od.a
1
>>> od.b = 20
>>> od
OrderedAttrDict([('a', 1), ('b', 20)])

# Setting a NEW attribute only creates it on the instance, not the dict...
>>> od.c = 8
>>> od
OrderedAttrDict([('a', 1), ('b', 20)])
>>> od.c
8

Class:

class OrderedAttrDict(odict.OrderedDict):
    """
    Constructs an odict.OrderedDict with attribute access to data.

    Setting a NEW attribute only creates it on the instance, not the dict.
    Setting an attribute that is a key in the data will set the dict data but 
    will not create a new instance attribute
    """
    def __getattr__(self, attr):
        """
        Try to get the data. If attr is not a key, fall-back and get the attr
        """
        if self.has_key(attr):
            return super(OrderedAttrDict, self).__getitem__(attr)
        else:
            return super(OrderedAttrDict, self).__getattr__(attr)


    def __setattr__(self, attr, value):
        """
        Try to set the data. If attr is not a key, fall-back and set the attr
        """
        if self.has_key(attr):
            super(OrderedAttrDict, self).__setitem__(attr, value)
        else:
            super(OrderedAttrDict, self).__setattr__(attr, value)

这是线程中已经提到的非常酷的模式,但是如果您只想接受一个dict并将其转换为可以在IDE中自动完成的对象,等等:

class ObjectFromDict(object):
    def __init__(self, d):
        self.__dict__ = d

I created this based on the input from this thread. I need to use odict though, so I had to override get and set attr. I think this should work for the majority of special uses.

Usage looks like this:

# Create an ordered dict normally...
>>> od = OrderedAttrDict()
>>> od["a"] = 1
>>> od["b"] = 2
>>> od
OrderedAttrDict([('a', 1), ('b', 2)])

# Get and set data using attribute access...
>>> od.a
1
>>> od.b = 20
>>> od
OrderedAttrDict([('a', 1), ('b', 20)])

# Setting a NEW attribute only creates it on the instance, not the dict...
>>> od.c = 8
>>> od
OrderedAttrDict([('a', 1), ('b', 20)])
>>> od.c
8

The class:

class OrderedAttrDict(odict.OrderedDict):
    """
    Constructs an odict.OrderedDict with attribute access to data.

    Setting a NEW attribute only creates it on the instance, not the dict.
    Setting an attribute that is a key in the data will set the dict data but 
    will not create a new instance attribute
    """
    def __getattr__(self, attr):
        """
        Try to get the data. If attr is not a key, fall-back and get the attr
        """
        if self.has_key(attr):
            return super(OrderedAttrDict, self).__getitem__(attr)
        else:
            return super(OrderedAttrDict, self).__getattr__(attr)


    def __setattr__(self, attr, value):
        """
        Try to set the data. If attr is not a key, fall-back and set the attr
        """
        if self.has_key(attr):
            super(OrderedAttrDict, self).__setitem__(attr, value)
        else:
            super(OrderedAttrDict, self).__setattr__(attr, value)

This is a pretty cool pattern already mentioned in the thread, but if you just want to take a dict and convert it to an object that works with auto-complete in an IDE, etc:

class ObjectFromDict(object):
    def __init__(self, d):
        self.__dict__ = d

回答 16

显然,现在有这个图书馆- https://pypi.python.org/pypi/attrdict -它实现了这个确切的功能加上递归合并和JSON负载。可能值得一看。

Apparently there is now a library for this – https://pypi.python.org/pypi/attrdict – which implements this exact functionality plus recursive merging and json loading. Might be worth a look.


回答 17

这就是我用的

args = {
        'batch_size': 32,
        'workers': 4,
        'train_dir': 'train',
        'val_dir': 'val',
        'lr': 1e-3,
        'momentum': 0.9,
        'weight_decay': 1e-4
    }
args = namedtuple('Args', ' '.join(list(args.keys())))(**args)

print (args.lr)

This is what I use

args = {
        'batch_size': 32,
        'workers': 4,
        'train_dir': 'train',
        'val_dir': 'val',
        'lr': 1e-3,
        'momentum': 0.9,
        'weight_decay': 1e-4
    }
args = namedtuple('Args', ' '.join(list(args.keys())))(**args)

print (args.lr)

回答 18

您可以使用我刚刚制作的此类来做。通过此类,您可以Map像其他字典(包括json序列化)一样使用该对象,也可以使用点符号。希望对您有帮助:

class Map(dict):
    """
    Example:
    m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
    """
    def __init__(self, *args, **kwargs):
        super(Map, self).__init__(*args, **kwargs)
        for arg in args:
            if isinstance(arg, dict):
                for k, v in arg.iteritems():
                    self[k] = v

        if kwargs:
            for k, v in kwargs.iteritems():
                self[k] = v

    def __getattr__(self, attr):
        return self.get(attr)

    def __setattr__(self, key, value):
        self.__setitem__(key, value)

    def __setitem__(self, key, value):
        super(Map, self).__setitem__(key, value)
        self.__dict__.update({key: value})

    def __delattr__(self, item):
        self.__delitem__(item)

    def __delitem__(self, key):
        super(Map, self).__delitem__(key)
        del self.__dict__[key]

用法示例:

m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
# Add new key
m.new_key = 'Hello world!'
print m.new_key
print m['new_key']
# Update values
m.new_key = 'Yay!'
# Or
m['new_key'] = 'Yay!'
# Delete key
del m.new_key
# Or
del m['new_key']

You can do it using this class I just made. With this class you can use the Map object like another dictionary(including json serialization) or with the dot notation. I hope help you:

class Map(dict):
    """
    Example:
    m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
    """
    def __init__(self, *args, **kwargs):
        super(Map, self).__init__(*args, **kwargs)
        for arg in args:
            if isinstance(arg, dict):
                for k, v in arg.iteritems():
                    self[k] = v

        if kwargs:
            for k, v in kwargs.iteritems():
                self[k] = v

    def __getattr__(self, attr):
        return self.get(attr)

    def __setattr__(self, key, value):
        self.__setitem__(key, value)

    def __setitem__(self, key, value):
        super(Map, self).__setitem__(key, value)
        self.__dict__.update({key: value})

    def __delattr__(self, item):
        self.__delitem__(item)

    def __delitem__(self, key):
        super(Map, self).__delitem__(key)
        del self.__dict__[key]

Usage examples:

m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
# Add new key
m.new_key = 'Hello world!'
print m.new_key
print m['new_key']
# Update values
m.new_key = 'Yay!'
# Or
m['new_key'] = 'Yay!'
# Delete key
del m.new_key
# Or
del m['new_key']

回答 19

让我发布另一个实现,该实现基于Kinvais的答案,但整合了http://databio.org/posts/python_AttributeDict.html中提出的AttributeDict的思想。

这个版本的优点是它也适用于嵌套字典:

class AttrDict(dict):
    """
    A class to convert a nested Dictionary into an object with key-values
    that are accessible using attribute notation (AttrDict.attribute) instead of
    key notation (Dict["key"]). This class recursively sets Dicts to objects,
    allowing you to recurse down nested dicts (like: AttrDict.attr.attr)
    """

    # Inspired by:
    # http://stackoverflow.com/a/14620633/1551810
    # http://databio.org/posts/python_AttributeDict.html

    def __init__(self, iterable, **kwargs):
        super(AttrDict, self).__init__(iterable, **kwargs)
        for key, value in iterable.items():
            if isinstance(value, dict):
                self.__dict__[key] = AttrDict(value)
            else:
                self.__dict__[key] = value

Let me post another implementation, which builds upon the answer of Kinvais, but integrates ideas from the AttributeDict proposed in http://databio.org/posts/python_AttributeDict.html.

The advantage of this version is that it also works for nested dictionaries:

class AttrDict(dict):
    """
    A class to convert a nested Dictionary into an object with key-values
    that are accessible using attribute notation (AttrDict.attribute) instead of
    key notation (Dict["key"]). This class recursively sets Dicts to objects,
    allowing you to recurse down nested dicts (like: AttrDict.attr.attr)
    """

    # Inspired by:
    # http://stackoverflow.com/a/14620633/1551810
    # http://databio.org/posts/python_AttributeDict.html

    def __init__(self, iterable, **kwargs):
        super(AttrDict, self).__init__(iterable, **kwargs)
        for key, value in iterable.items():
            if isinstance(value, dict):
                self.__dict__[key] = AttrDict(value)
            else:
                self.__dict__[key] = value

回答 20

class AttrDict(dict):

     def __init__(self):
           self.__dict__ = self

if __name__ == '____main__':

     d = AttrDict()
     d['ray'] = 'hope'
     d.sun = 'shine'  >>> Now we can use this . notation
     print d['ray']
     print d.sun
class AttrDict(dict):

     def __init__(self):
           self.__dict__ = self

if __name__ == '____main__':

     d = AttrDict()
     d['ray'] = 'hope'
     d.sun = 'shine'  >>> Now we can use this . notation
     print d['ray']
     print d.sun

回答 21

解决方法是:

DICT_RESERVED_KEYS = vars(dict).keys()


class SmartDict(dict):
    """
    A Dict which is accessible via attribute dot notation
    """
    def __init__(self, *args, **kwargs):
        """
        :param args: multiple dicts ({}, {}, ..)
        :param kwargs: arbitrary keys='value'

        If ``keyerror=False`` is passed then not found attributes will
        always return None.
        """
        super(SmartDict, self).__init__()
        self['__keyerror'] = kwargs.pop('keyerror', True)
        [self.update(arg) for arg in args if isinstance(arg, dict)]
        self.update(kwargs)

    def __getattr__(self, attr):
        if attr not in DICT_RESERVED_KEYS:
            if self['__keyerror']:
                return self[attr]
            else:
                return self.get(attr)
        return getattr(self, attr)

    def __setattr__(self, key, value):
        if key in DICT_RESERVED_KEYS:
            raise AttributeError("You cannot set a reserved name as attribute")
        self.__setitem__(key, value)

    def __copy__(self):
        return self.__class__(self)

    def copy(self):
        return self.__copy__()

Solution is:

DICT_RESERVED_KEYS = vars(dict).keys()


class SmartDict(dict):
    """
    A Dict which is accessible via attribute dot notation
    """
    def __init__(self, *args, **kwargs):
        """
        :param args: multiple dicts ({}, {}, ..)
        :param kwargs: arbitrary keys='value'

        If ``keyerror=False`` is passed then not found attributes will
        always return None.
        """
        super(SmartDict, self).__init__()
        self['__keyerror'] = kwargs.pop('keyerror', True)
        [self.update(arg) for arg in args if isinstance(arg, dict)]
        self.update(kwargs)

    def __getattr__(self, attr):
        if attr not in DICT_RESERVED_KEYS:
            if self['__keyerror']:
                return self[attr]
            else:
                return self.get(attr)
        return getattr(self, attr)

    def __setattr__(self, key, value):
        if key in DICT_RESERVED_KEYS:
            raise AttributeError("You cannot set a reserved name as attribute")
        self.__setitem__(key, value)

    def __copy__(self):
        return self.__class__(self)

    def copy(self):
        return self.__copy__()

回答 22

以这种方式访问​​字典键的注意事项和陷阱是什么?

正如@Henry所建议的那样,可能无法在dict中使用点分访问的一个原因是它将dict关键字名限制为python有效变量,从而限制了所有可能的名称。

下面的示例说明了在给定命令的情况下,为什么点访问通常不会有用d

有效期

以下属性在Python中无效:

d.1_foo                           # enumerated names
d./bar                            # path names
d.21.7, d.12:30                   # decimals, time
d.""                              # empty strings
d.john doe, d.denny's             # spaces, misc punctuation 
d.3 * x                           # expressions  

样式

PEP8约定将对属性命名施加软约束:

A.保留关键字(或内置函数)名称:

d.in
d.False, d.True
d.max, d.min
d.sum
d.id

如果函数参数的名称与保留关键字冲突,通常最好在其后附加一个下划线…

B.关于方法变量名的大小写规则:

变量名遵循与函数名相同的约定。

d.Firstname
d.Country

使用函数命名规则:小写字母,单词以下划线分隔,以提高可读性。


有时候,像熊猫这样的图书馆会引起这些担忧,该允许按名称对DataFrame列进行点访问。解决命名限制的默认机制也是数组符号-方括号中的字符串。

如果这些约束不适用于您的用例,则点访问数据结构上有多个选项。

What would be the caveats and pitfalls of accessing dict keys in this manner?

As @Henry suggests, one reason dotted-access may not be used in dicts is that it limits dict key names to python-valid variables, thereby restricting all possible names.

The following are examples on why dotted-access would not be helpful in general, given a dict, d:

Validity

The following attributes would be invalid in Python:

d.1_foo                           # enumerated names
d./bar                            # path names
d.21.7, d.12:30                   # decimals, time
d.""                              # empty strings
d.john doe, d.denny's             # spaces, misc punctuation 
d.3 * x                           # expressions  

Style

PEP8 conventions would impose a soft constraint on attribute naming:

A. Reserved keyword (or builtin function) names:

d.in
d.False, d.True
d.max, d.min
d.sum
d.id

If a function argument’s name clashes with a reserved keyword, it is generally better to append a single trailing underscore …

B. The case rule on methods and variable names:

Variable names follow the same convention as function names.

d.Firstname
d.Country

Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.


Sometimes these concerns are raised in libraries like pandas, which permits dotted-access of DataFrame columns by name. The default mechanism to resolve naming restrictions is also array-notation – a string within brackets.

If these constraints do not apply to your use case, there are several options on dotted-access data structures.


回答 23

您可以使用dict_to_obj https://pypi.org/project/dict-to-obj/ 它确实满足您的要求

From dict_to_obj import DictToObj
a = {
'foo': True
}
b = DictToObj(a)
b.foo
True

You can use dict_to_obj https://pypi.org/project/dict-to-obj/ It does exactly what you asked for

From dict_to_obj import DictToObj
a = {
'foo': True
}
b = DictToObj(a)
b.foo
True


回答 24

这不是一个“好”答案,但我认为这很不错(它不能处理当前形式的嵌套字典。)只需将您的字典包装在一个函数中:

def make_funcdict(d=None, **kwargs)
    def funcdict(d=None, **kwargs):
        if d is not None:
            funcdict.__dict__.update(d)
        funcdict.__dict__.update(kwargs)
        return funcdict.__dict__
    funcdict(d, **kwargs)
    return funcdict

现在您的语法略有不同。要像属性一样访问字典项f.key。要以通常的方式访问dict项目(和其他dict方法)f()['key'],我们可以方便地通过使用关键字参数和/或字典调用f来更新dict

d = {'name':'Henry', 'age':31}
d = make_funcdict(d)
>>> for key in d():
...     print key
... 
age
name
>>> print d.name
... Henry
>>> print d.age
... 31
>>> d({'Height':'5-11'}, Job='Carpenter')
... {'age': 31, 'name': 'Henry', 'Job': 'Carpenter', 'Height': '5-11'}

在那里。如果有人建议使用此方法的优点和缺点,我会很高兴。

This isn’t a ‘good’ answer, but I thought this was nifty (it doesn’t handle nested dicts in current form). Simply wrap your dict in a function:

def make_funcdict(d=None, **kwargs)
    def funcdict(d=None, **kwargs):
        if d is not None:
            funcdict.__dict__.update(d)
        funcdict.__dict__.update(kwargs)
        return funcdict.__dict__
    funcdict(d, **kwargs)
    return funcdict

Now you have slightly different syntax. To acces the dict items as attributes do f.key. To access the dict items (and other dict methods) in the usual manner do f()['key'] and we can conveniently update the dict by calling f with keyword arguments and/or a dictionary

Example

d = {'name':'Henry', 'age':31}
d = make_funcdict(d)
>>> for key in d():
...     print key
... 
age
name
>>> print d.name
... Henry
>>> print d.age
... 31
>>> d({'Height':'5-11'}, Job='Carpenter')
... {'age': 31, 'name': 'Henry', 'Job': 'Carpenter', 'Height': '5-11'}

And there it is. I’ll be happy if anyone suggests benefits and drawbacks of this method.


回答 25

正如Doug指出的那样,有一个Bunch软件包可用于实现该obj.key功能。其实有一个较新的版本叫做

NeoBunch

它具有很大的功能,可以通过它的neobunchify函数将您的字典转换为NeoBunch对象。我经常使用Mako模板,并且由于NeoBunch对象传递数据使它们更具可读性,因此,如果您碰巧最终在Python程序中使用了普通字典,但是想要在Mako模板中使用点符号,则可以这样使用:

from mako.template import Template
from neobunch import neobunchify

mako_template = Template(filename='mako.tmpl', strict_undefined=True)
data = {'tmpl_data': [{'key1': 'value1', 'key2': 'value2'}]}
with open('out.txt', 'w') as out_file:
    out_file.write(mako_template.render(**neobunchify(data)))

Mako模板可能如下所示:

% for d in tmpl_data:
Column1     Column2
${d.key1}   ${d.key2}
% endfor

As noted by Doug there’s a Bunch package which you can use to achieve the obj.key functionality. Actually there’s a newer version called

NeoBunch

It has though a great feature converting your dict to a NeoBunch object through its neobunchify function. I use Mako templates a lot and passing data as NeoBunch objects makes them far more readable, so if you happen to end up using a normal dict in your Python program but want the dot notation in a Mako template you can use it that way:

from mako.template import Template
from neobunch import neobunchify

mako_template = Template(filename='mako.tmpl', strict_undefined=True)
data = {'tmpl_data': [{'key1': 'value1', 'key2': 'value2'}]}
with open('out.txt', 'w') as out_file:
    out_file.write(mako_template.render(**neobunchify(data)))

And the Mako template could look like:

% for d in tmpl_data:
Column1     Column2
${d.key1}   ${d.key2}
% endfor

回答 26

最简单的方法是定义一个类,我们将其称为命名空间。它在字典上使用对象dict .update()。然后,该字典将被视为对象。

class Namespace(object):
    '''
    helps referencing object in a dictionary as dict.key instead of dict['key']
    '''
    def __init__(self, adict):
        self.__dict__.update(adict)



Person = Namespace({'name': 'ahmed',
                     'age': 30}) #--> added for edge_cls


print(Person.name)

The easiest way is to define a class let’s call it Namespace. which uses the object dict.update() on the dict. Then, the dict will be treated as an object.

class Namespace(object):
    '''
    helps referencing object in a dictionary as dict.key instead of dict['key']
    '''
    def __init__(self, adict):
        self.__dict__.update(adict)



Person = Namespace({'name': 'ahmed',
                     'age': 30}) #--> added for edge_cls


print(Person.name)

在迭代字典时如何从字典中删除项目?

问题:在迭代字典时如何从字典中删除项目?

在Python上进行迭代时从字典中删除项目是否合法?

例如:

for k, v in mydict.iteritems():
   if k == val:
     del mydict[k]

这个想法是从字典中删除不满足特定条件的元素,而不是创建一个新字典,该字典是被迭代的字典的子集。

这是一个好的解决方案吗?有没有更优雅/更有效的方法?

Is it legitimate to delete items from a dictionary in Python while iterating over it?

For example:

for k, v in mydict.iteritems():
   if k == val:
     del mydict[k]

The idea is to remove elements that don’t meet a certain condition from the dictionary, instead of creating a new dictionary that’s a subset of the one being iterated over.

Is this a good solution? Are there more elegant/efficient ways?


回答 0

编辑:

此答案不适用于Python3,并且会给出RuntimeError

RuntimeError:词典在迭代过程中更改了大小。

发生这种情况是因为mydict.keys()返回的是迭代器而不是列表。正如注释中所指出的那样,只需将其转换mydict.keys()为列表即可list(mydict.keys()),它应该可以工作。


控制台中的一个简单测试显示,在迭代字典时您无法修改字典:

>>> mydict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> for k, v in mydict.iteritems():
...    if k == 'two':
...        del mydict[k]
...
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

如delnan的回答所述,当迭代器尝试移至下一个条目时,删除条目会导致问题。而是使用keys()方法获取键列表并进行处理:

>>> for k in mydict.keys():
...    if k == 'two':
...        del mydict[k]
...
>>> mydict
{'four': 4, 'three': 3, 'one': 1}

如果需要根据项目值删除,请使用items()方法:

>>> for k, v in mydict.items():
...     if v == 3:
...         del mydict[k]
...
>>> mydict
{'four': 4, 'one': 1}

EDIT:

This answer will not work for Python3 and will give a RuntimeError.

RuntimeError: dictionary changed size during iteration.

This happens because mydict.keys() returns an iterator not a list. As pointed out in comments simply convert mydict.keys() to a list by list(mydict.keys()) and it should work.


A simple test in the console shows you cannot modify a dictionary while iterating over it:

>>> mydict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> for k, v in mydict.iteritems():
...    if k == 'two':
...        del mydict[k]
...
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

As stated in delnan’s answer, deleting entries causes problems when the iterator tries to move onto the next entry. Instead, use the keys() method to get a list of the keys and work with that:

>>> for k in mydict.keys():
...    if k == 'two':
...        del mydict[k]
...
>>> mydict
{'four': 4, 'three': 3, 'one': 1}

If you need to delete based on the items value, use the items() method instead:

>>> for k, v in mydict.items():
...     if v == 3:
...         del mydict[k]
...
>>> mydict
{'four': 4, 'one': 1}

回答 1

您也可以分两个步骤进行操作:

remove = [k for k in mydict if k == val]
for k in remove: del mydict[k]

我最喜欢的方法通常是做出一个新的决定:

# Python 2.7 and 3.x
mydict = { k:v for k,v in mydict.items() if k!=val }
# before Python 2.7
mydict = dict((k,v) for k,v in mydict.iteritems() if k!=val)

You could also do it in two steps:

remove = [k for k in mydict if k == val]
for k in remove: del mydict[k]

My favorite approach is usually to just make a new dict:

# Python 2.7 and 3.x
mydict = { k:v for k,v in mydict.items() if k!=val }
# before Python 2.7
mydict = dict((k,v) for k,v in mydict.iteritems() if k!=val)

回答 2

迭代时不能修改集合。那就是疯狂-最为明显的是,如果允许您删除和删除当前项目,则迭代器将必须继续(+1),下一次调用next将使您超出该范围(+2),因此您会最终跳过了一个元素(删除的元素后面的一个)。您有两种选择:

  • 复制所有键(或值,或两者,取决于您的需要),然后遍历这些键。您可以.keys()为此使用et al(在Python 3中,将生成的迭代器传递给list)。但是在空间上可能会非常浪费。
  • mydict照常进行迭代,将要保存的密钥保存在单独的collection中to_delete。当你完成迭代mydict,删除所有项目to_deletemydict。与第一种方法相比,可以节省一些(取决于删除的键数和剩余的键数)空间,但还需要多几行。

You can’t modify a collection while iterating it. That way lies madness – most notably, if you were allowed to delete and deleted the current item, the iterator would have to move on (+1) and the next call to next would take you beyond that (+2), so you’d end up skipping one element (the one right behind the one you deleted). You have two options:

  • Copy all keys (or values, or both, depending on what you need), then iterate over those. You can use .keys() et al for this (in Python 3, pass the resulting iterator to list). Could be highly wasteful space-wise though.
  • Iterate over mydict as usual, saving the keys to delete in a seperate collection to_delete. When you’re done iterating mydict, delete all items in to_delete from mydict. Saves some (depending on how many keys are deleted and how many stay) space over the first approach, but also requires a few more lines.

回答 3

而是遍历一个副本,例如items()

for k, v in list(mydict.items()):

Iterate over a copy instead, such as the one returned by items():

for k, v in list(mydict.items()):

回答 4

使用起来最干净list(mydict)

>>> mydict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> for k in list(mydict):
...     if k == 'three':
...         del mydict[k]
... 
>>> mydict
{'four': 4, 'two': 2, 'one': 1}

这对应于列表的并行结构:

>>> mylist = ['one', 'two', 'three', 'four']
>>> for k in list(mylist):                            # or mylist[:]
...     if k == 'three':
...         mylist.remove(k)
... 
>>> mylist
['one', 'two', 'four']

两者都在python2和python3中工作。

It’s cleanest to use list(mydict):

>>> mydict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> for k in list(mydict):
...     if k == 'three':
...         del mydict[k]
... 
>>> mydict
{'four': 4, 'two': 2, 'one': 1}

This corresponds to a parallel structure for lists:

>>> mylist = ['one', 'two', 'three', 'four']
>>> for k in list(mylist):                            # or mylist[:]
...     if k == 'three':
...         mylist.remove(k)
... 
>>> mylist
['one', 'two', 'four']

Both work in python2 and python3.


回答 5

您可以使用字典理解。

d = {k:d[k] for k in d if d[k] != val}

You can use a dictionary comprehension.

d = {k:d[k] for k in d if d[k] != val}


回答 6

使用python3,在dic.keys()上进行迭代将引发字典大小错误。您可以使用这种替代方式:

使用python3进行测试,它可以正常工作,并且不会引发错误“ 字典在迭代期间更改大小 ”:

my_dic = { 1:10, 2:20, 3:30 }
# Is important here to cast because ".keys()" method returns a dict_keys object.
key_list = list( my_dic.keys() )

# Iterate on the list:
for k in key_list:
    print(key_list)
    print(my_dic)
    del( my_dic[k] )


print( my_dic )
# {}

With python3, iterate on dic.keys() will raise the dictionary size error. You can use this alternative way:

Tested with python3, it works fine and the Error “dictionary changed size during iteration” is not raised:

my_dic = { 1:10, 2:20, 3:30 }
# Is important here to cast because ".keys()" method returns a dict_keys object.
key_list = list( my_dic.keys() )

# Iterate on the list:
for k in key_list:
    print(key_list)
    print(my_dic)
    del( my_dic[k] )


print( my_dic )
# {}

回答 7

您可以先构建要删除的键列表,然后遍历该列表以删除它们。

dict = {'one' : 1, 'two' : 2, 'three' : 3, 'four' : 4}
delete = []
for k,v in dict.items():
    if v%2 == 1:
        delete.append(k)
for i in delete:
    del dict[i]

You could first build a list of keys to delete, and then iterate over that list deleting them.

dict = {'one' : 1, 'two' : 2, 'three' : 3, 'four' : 4}
delete = []
for k,v in dict.items():
    if v%2 == 1:
        delete.append(k)
for i in delete:
    del dict[i]

回答 8

如果您要删除的项目始终位于dict迭代的“开始”,则有一种方法可能合适

while mydict:
    key, value = next(iter(mydict.items()))
    if should_delete(key, value):
       del mydict[key]
    else:
       break

仅保证“开始”对于某些Python版本/实现是一致的。例如,Python 3.7新增功能

dict对象的插入顺序保留性质已声明是Python语言规范的正式组成部分。

这种方式避免了很多其他答案所暗示的dict副本,至少在Python 3中如此。

There is a way that may be suitable if the items you want to delete are always at the “beginning” of the dict iteration

while mydict:
    key, value = next(iter(mydict.items()))
    if should_delete(key, value):
       del mydict[key]
    else:
       break

The “beginning” is only guaranteed to be consistent for certain Python versions/implementations. For example from What’s New In Python 3.7

the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec.

This way avoids a copy of the dict that a lot of the other answers suggest, at least in Python 3.


回答 9

我在Python3中尝试了上述解决方案,但在将对象存储在dict中时,似乎这是唯一对我有用的解决方案。基本上,您会复制dict()并对其进行迭代,同时删除原始词典中的条目。

        tmpDict = realDict.copy()
        for key, value in tmpDict.items():
            if value:
                del(realDict[key])

I tried the above solutions in Python3 but this one seems to be the only one working for me when storing objects in a dict. Basically you make a copy of your dict() and iterate over that while deleting the entries in your original dictionary.

        tmpDict = realDict.copy()
        for key, value in tmpDict.items():
            if value:
                del(realDict[key])

检查给定键是否已存在于字典中并递增

问题:检查给定键是否已存在于字典中并递增

给定字典,我如何找出该字典中的给定键是否已设置为非值?

即,我想这样做:

my_dict = {}

if (my_dict[key] != None):
  my_dict[key] = 1
else:
  my_dict[key] += 1

即,如果要已有一个,我想增加该值,否则,请将该值设置为1。

Given a dictionary, how can I find out if a given key in that dictionary has already been set to a non-None value?

I.e., I want to do this:

my_dict = {}

if (my_dict[key] != None):
  my_dict[key] = 1
else:
  my_dict[key] += 1

I.e., I want to increment the value if there’s already one there, or set it to 1 otherwise.


回答 0

您正在寻找collections.defaultdict(适用于Python 2.5+)。这个

from collections import defaultdict

my_dict = defaultdict(int)
my_dict[key] += 1

会做你想要的。

对于常规Python而言dict,如果给定键没有值,则访问dict时不会获得结果NoneKeyError将会引发a。因此,如果您想使用Regular dict而不是代码,则可以使用

if key in my_dict:
    my_dict[key] += 1
else:
    my_dict[key] = 1

You are looking for collections.defaultdict (available for Python 2.5+). This

from collections import defaultdict

my_dict = defaultdict(int)
my_dict[key] += 1

will do what you want.

For regular Python dicts, if there is no value for a given key, you will not get None when accessing the dict — a KeyError will be raised. So if you want to use a regular dict, instead of your code you would use

if key in my_dict:
    my_dict[key] += 1
else:
    my_dict[key] = 1

回答 1

我更喜欢用一行代码来做到这一点。

my_dict = {}

my_dict [some_key] = my_dict.get(some_key,0)+ 1

字典具有一个函数get,该函数带有两个参数-所需的键和默认值(如果不存在)。我更喜欢这种方法作为defaultdict,因为您只想处理在这一行代码中不存在该键,而不是在所有地方都不存在该键的情况。

I prefer to do this in one line of code.

my_dict = {}

my_dict[some_key] = my_dict.get(some_key, 0) + 1

Dictionaries have a function, get, which takes two parameters – the key you want, and a default value if it doesn’t exist. I prefer this method to defaultdict as you only want to handle the case where the key doesn’t exist in this one line of code, not everywhere.


回答 2

我个人喜欢使用 setdefault()

my_dict = {}

my_dict.setdefault(some_key, 0)
my_dict[some_key] += 1

I personally like using setdefault()

my_dict = {}

my_dict.setdefault(some_key, 0)
my_dict[some_key] += 1

回答 3

您需要这样的key in dict成语。

if key in my_dict and not (my_dict[key] is None):
  # do something
else:
  # do something else

但是,您可能应该考虑使用defaultdict(按dF建议)。

You need the key in dict idiom for that.

if key in my_dict and not (my_dict[key] is None):
  # do something
else:
  # do something else

However, you should probably consider using defaultdict (as dF suggested).


回答 4

要回答“ 我如何找出该字典中的给定索引是否已设置为非值 ”的问题,我希望这样做:

try:
  nonNone = my_dict[key] is not None
except KeyError:
  nonNone = False

这符合已被引用的EAFP概念(更容易先请求宽恕然后再允许)。它也避免了字典中重复的键查找,因为key in my_dict and my_dict[key] is not None如果查找很昂贵,那会很有趣。

对于您提出的实际问题,即增加一个int(如果存在),或者将其设置为默认值,我也建议

my_dict[key] = my_dict.get(key, default) + 1

就像安德鲁·威尔金森(Andrew Wilkinson)的回答一样。

如果要在字典中存储可修改的对象,则有第三种解决方案。一个常见的示例是multimap,您可以在其中存储键的元素列表。在这种情况下,您可以使用:

my_dict.setdefault(key, []).append(item)

如果字典中不存在key的值,则setdefault方法会将其设置为setdefault的第二个参数。它的行为就像标准的my_dict [key]一样,返回键的值(可能是新设置的值)。

To answer the question “how can I find out if a given index in that dict has already been set to a non-None value“, I would prefer this:

try:
  nonNone = my_dict[key] is not None
except KeyError:
  nonNone = False

This conforms to the already invoked concept of EAFP (easier to ask forgiveness then permission). It also avoids the duplicate key lookup in the dictionary as it would in key in my_dict and my_dict[key] is not None what is interesting if lookup is expensive.

For the actual problem that you have posed, i.e. incrementing an int if it exists, or setting it to a default value otherwise, I also recommend the

my_dict[key] = my_dict.get(key, default) + 1

as in the answer of Andrew Wilkinson.

There is a third solution if you are storing modifyable objects in your dictionary. A common example for this is a multimap, where you store a list of elements for your keys. In that case, you can use:

my_dict.setdefault(key, []).append(item)

If a value for key does not exist in the dictionary, the setdefault method will set it to the second parameter of setdefault. It behaves just like a standard my_dict[key], returning the value for the key (which may be the newly set value).


回答 5

同意cgoldberg。我是怎么做的:

try:
    dict[key] += 1
except KeyError:
    dict[key] = 1

因此,要么如上所述,要么使用其他人建议的默认字典。不要使用if语句。那不是Pythonic。

Agreed with cgoldberg. How I do it is:

try:
    dict[key] += 1
except KeyError:
    dict[key] = 1

So either do it as above, or use a default dict as others have suggested. Don’t use if statements. That’s not Pythonic.


回答 6

从许多答案中可以看出,有几种解决方案。has_key()方法尚未提及LBYL的一个实例(三步前进)。

my_dict = {}

def add (key):
    if my_dict.has_key(key):
        my_dict[key] += 1
    else:
        my_dict[key] = 1

if __name__ == '__main__':
    add("foo")
    add("bar")
    add("foo")
    print my_dict

As you can see from the many answers, there are several solutions. One instance of LBYL (look before you leap) has not been mentioned yet, the has_key() method:

my_dict = {}

def add (key):
    if my_dict.has_key(key):
        my_dict[key] += 1
    else:
        my_dict[key] = 1

if __name__ == '__main__':
    add("foo")
    add("bar")
    add("foo")
    print my_dict

回答 7

您尝试执行此操作的方法称为LBYL(跳前先查看),因为您在尝试尝试增加值之前正在检查条件。

另一种方法称为EAFP(更容易先请求宽恕然后再允许)。在这种情况下,您只需尝试操作(增加值)。如果失败,则捕获该异常并将其值设置为1。这是使用Python的方式稍多一些(IMO)。

http://mail.python.org/pipermail/python-list/2003-May/205182.html

The way you are trying to do it is called LBYL (look before you leap), since you are checking conditions before trying to increment your value.

The other approach is called EAFP (easier to ask forgiveness then permission). In that case, you would just try the operation (increment the value). If it fails, you catch the exception and set the value to 1. This is a slightly more Pythonic way to do it (IMO).

http://mail.python.org/pipermail/python-list/2003-May/205182.html


回答 8

有点晚了,但这应该可行。

my_dict = {}
my_dict[key] = my_dict[key] + 1 if key in my_dict else 1

A bit late but this should work.

my_dict = {}
my_dict[key] = my_dict[key] + 1 if key in my_dict else 1

回答 9

这不是直接回答问题,但对我来说,您似乎可能需要collections.Counter的功能。

from collections import Counter

to_count = ["foo", "foo", "bar", "baz", "foo", "bar"]

count = Counter(to_count)

print(count)

print("acts just like the desired dictionary:")
print("bar occurs {} times".format(count["bar"]))

print("any item that does not occur in the list is set to 0:")
print("dog occurs {} times".format(count["dog"]))

print("can iterate over items from most frequent to least:")
for item, times in count.most_common():
    print("{} occurs {} times".format(item, times))

这导致输出

Counter({'foo': 3, 'bar': 2, 'baz': 1})
acts just like the desired dictionary:
bar occurs 2 times
any item that does not occur in the list is set to 0:
dog occurs 0 times
can iterate over items from most frequent to least:
foo occurs 3 times
bar occurs 2 times
baz occurs 1 times

This isn’t directly answering the question, but to me, it looks like you might want the functionality of collections.Counter.

from collections import Counter

to_count = ["foo", "foo", "bar", "baz", "foo", "bar"]

count = Counter(to_count)

print(count)

print("acts just like the desired dictionary:")
print("bar occurs {} times".format(count["bar"]))

print("any item that does not occur in the list is set to 0:")
print("dog occurs {} times".format(count["dog"]))

print("can iterate over items from most frequent to least:")
for item, times in count.most_common():
    print("{} occurs {} times".format(item, times))

This results in the output

Counter({'foo': 3, 'bar': 2, 'baz': 1})
acts just like the desired dictionary:
bar occurs 2 times
any item that does not occur in the list is set to 0:
dog occurs 0 times
can iterate over items from most frequent to least:
foo occurs 3 times
bar occurs 2 times
baz occurs 1 times

回答 10

这是我最近为解决此问题而想出的一种方法。它基于setdefault词典方法:

my_dict = {}
my_dict[key] = my_dict.setdefault(key, 0) + 1

Here’s one-liner that I came up with recently for solving this problem. It’s based on the setdefault dictionary method:

my_dict = {}
my_dict[key] = my_dict.setdefault(key, 0) + 1

回答 11

我一直在寻找它,没有在网上找到它,然后尝试使用Try / Error运气并找到了它

my_dict = {}

if my_dict.__contains__(some_key):
  my_dict[some_key] += 1
else:
  my_dict[some_key] = 1

I was looking for it, didn’t found it on web then tried my luck with Try/Error and found it

my_dict = {}

if my_dict.__contains__(some_key):
  my_dict[some_key] += 1
else:
  my_dict[some_key] = 1

在Python中,什么时候使用字典,列表或集合?

问题:在Python中,什么时候使用字典,列表或集合?

我什么时候应该使用字典,列表或集合?

是否存在更适合每种数据类型的方案?

When should I use a dictionary, list or set?

Are there scenarios that are more suited for each data type?


回答 0

一个list保持秩序,dictset不要:当你关心的秩序,因此,您必须使用list(如果你的容器的选择仅限于这三种,当然;-)。

dict与每个键关联一个值,而listset仅包含值:很明显,非常不同的用例。

set要求项目是可哈希的,list不是:如果您有不可哈希的项目,则不能使用,set而必须使用list

set禁止重复,list不禁止:也是至关重要的区别。(可以在以下位置找到“多重集”,该多重集将重复项映射到不止一次存在的项目的不同计数中;如果出于某些奇怪的原因而无法导入,则collections.Counter可以将其构建为,或者在2.7之前的版本中Python作为,使用项目作为键,并将相关值作为计数)。dictcollectionscollections.defaultdict(int)

set(或dict键中)中检查值的隶属关系非常快(花费一个恒定的短时间),而在列表中,它花费的时间与列表的长度成正比(在一般情况下和最坏情况下)。因此,如果您有可散列的项目,则不关心订单或重复项,而希望快速进行成员资格检查set比更好list

A list keeps order, dict and set don’t: when you care about order, therefore, you must use list (if your choice of containers is limited to these three, of course;-).

dict associates with each key a value, while list and set just contain values: very different use cases, obviously.

set requires items to be hashable, list doesn’t: if you have non-hashable items, therefore, you cannot use set and must instead use list.

set forbids duplicates, list does not: also a crucial distinction. (A “multiset”, which maps duplicates into a different count for items present more than once, can be found in collections.Counter — you could build one as a dict, if for some weird reason you couldn’t import collections, or, in pre-2.7 Python as a collections.defaultdict(int), using the items as keys and the associated value as the count).

Checking for membership of a value in a set (or dict, for keys) is blazingly fast (taking about a constant, short time), while in a list it takes time proportional to the list’s length in the average and worst cases. So, if you have hashable items, don’t care either way about order or duplicates, and want speedy membership checking, set is better than list.


回答 1

  • 您是否只需要订购的物品序列?取得清单。
  • 你只需要知道你是否已经一个特定的值,但不排序(你不需要存储复本)?使用一套。
  • 您是否需要将值与键相关联,以便稍后可以有效地(通过键)查找它们?使用字典。
  • Do you just need an ordered sequence of items? Go for a list.
  • Do you just need to know whether or not you’ve already got a particular value, but without ordering (and you don’t need to store duplicates)? Use a set.
  • Do you need to associate values with keys, so you can look them up efficiently (by key) later on? Use a dictionary.

回答 2

如果您想要无序的唯一元素集合,请使用set。(例如,当您要在文档中使用所有单词的集合时)。

当您想要收集元素的不可变的有序列表时,请使用tuple。(例如,当您希望将(名称,phone_number)对用作集合中的元素时,您将需要一个元组而不是一个列表,因为集合要求元素是不可变的。

当您想收集元素的可变的有序列表时,请使用list。(例如,当您要将新的电话号码追加到列表中时:[number1,number2,…])。

当您想要从键到值的映射时,请使用dict。(例如,当您需要将姓名映射到电话号码的电话簿时:){'John Smith' : '555-1212'}。请注意,字典中的键是无序的。(如果您遍历字典(电话簿),则按键(名称)可能以任何顺序显示)。

When you want an unordered collection of unique elements, use a set. (For example, when you want the set of all the words used in a document).

When you want to collect an immutable ordered list of elements, use a tuple. (For example, when you want a (name, phone_number) pair that you wish to use as an element in a set, you would need a tuple rather than a list since sets require elements be immutable).

When you want to collect a mutable ordered list of elements, use a list. (For example, when you want to append new phone numbers to a list: [number1, number2, …]).

When you want a mapping from keys to values, use a dict. (For example, when you want a telephone book which maps names to phone numbers: {'John Smith' : '555-1212'}). Note the keys in a dict are unordered. (If you iterate through a dict (telephone book), the keys (names) may show up in any order).


回答 3

  • 当您有一组映射到值的唯一键时,请使用字典。

  • 如果您有项目的有序集合,请使用列表。

  • 使用一组存储一组无序的项目。

  • Use a dictionary when you have a set of unique keys that map to values.

  • Use a list if you have an ordered collection of items.

  • Use a set to store an unordered set of items.


回答 4

简而言之,使用:

list -如果您需要订购的物品序列。

dict -如果您需要将值与键相关联

set -如果您需要保留唯一元素。

详细说明

清单

列表是可变序列,通常用于存储同类项目的集合。

列表实现了所有常见的序列操作:

  • x in lx not in l
  • l[i]l[i:j]l[i:j:k]
  • len(l)min(l)max(l)
  • l.count(x)
  • l.index(x[, i[, j]])-的第一出现的索引xl(在或之后i和之前j的indeces)

列表还实现了所有可变序列操作:

  • l[i] = x-项目il被替换x
  • l[i:j] = tlito的切片j被iterable的内容替换t
  • del l[i:j] – 如同 l[i:j] = []
  • l[i:j:k] = t-的元素l[i:j:k]已替换为t
  • del l[i:j:k]s[i:j:k]从列表中删除的元素
  • l.append(x)-追加x到序列的末尾
  • l.clear()-从中删除所有项目l(与del相同l[:]
  • l.copy()-创建的浅表副本l(与相同l[:]
  • l.extend(t)l += t-扩展l以下内容t
  • l *= n-更新l其内容重复n
  • l.insert(i, x)-插入xl由下式给出的指数在i
  • l.pop([i])-在处检索项目,i并将其从中删除l
  • l.remove(x)-从等于x的l位置删除第一项l[i]
  • l.reverse()-反转l到位的项目

利用方法append和可以将列表用作堆栈pop

字典

字典将可散列的值映射到任意对象。字典是可变对象。字典的主要操作是使用一些键存储值并提取给定键的值。

在字典中,不能将不可哈希的值(即包含列表,字典或其他可变类型的值)用作键。

集合是不同的可哈希对象的无序集合。集合通常用于进行成员资格测试,从序列中删除重复项以及计算数学运算(例如交集,并集,差和对称差)。

In short, use:

list – if you require an ordered sequence of items.

dict – if you require to relate values with keys

set – if you require to keep unique elements.

Detailed Explanation

List

A list is a mutable sequence, typically used to store collections of homogeneous items.

A list implements all of the common sequence operations:

  • x in l and x not in l
  • l[i], l[i:j], l[i:j:k]
  • len(l), min(l), max(l)
  • l.count(x)
  • l.index(x[, i[, j]]) – index of the 1st occurrence of x in l (at or after i and before j indeces)

A list also implements all of the mutable sequence operations:

  • l[i] = x – item i of l is replaced by x
  • l[i:j] = t – slice of l from i to j is replaced by the contents of the iterable t
  • del l[i:j] – same as l[i:j] = []
  • l[i:j:k] = t – the elements of l[i:j:k] are replaced by those of t
  • del l[i:j:k] – removes the elements of s[i:j:k] from the list
  • l.append(x) – appends x to the end of the sequence
  • l.clear() – removes all items from l (same as del l[:])
  • l.copy() – creates a shallow copy of l (same as l[:])
  • l.extend(t) or l += t – extends l with the contents of t
  • l *= n – updates l with its contents repeated n times
  • l.insert(i, x) – inserts x into l at the index given by i
  • l.pop([i]) – retrieves the item at i and also removes it from l
  • l.remove(x) – remove the first item from l where l[i] is equal to x
  • l.reverse() – reverses the items of l in place

A list could be used as stack by taking advantage of the methods append and pop.

Dictionary

A dictionary maps hashable values to arbitrary objects. A dictionary is a mutable object. The main operations on a dictionary are storing a value with some key and extracting the value given the key.

In a dictionary, you cannot use as keys values that are not hashable, that is, values containing lists, dictionaries or other mutable types.

Set

A set is an unordered collection of distinct hashable objects. A set is commonly used to include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.


回答 5

尽管这并不涵盖sets,但这是对dicts和lists 的很好解释:

列表看起来就是-值列表。它们中的每一个都从零开始编号-第一个从零开始编号,第二个为1,第三个为2,依此类推。您可以从列表中删除值,并在末尾添加新值。例如:您的许多猫的名字。

字典类似于其名称所暗示的内容-字典。在字典中,您有单词的“索引”,并且每个单词都有一个定义。在python中,单词称为“键”,而定义称为“值”。字典中的值未编号-类似于其名称所建议的名称-字典。在字典中,您有单词的“索引”,并且每个单词都有一个定义。字典中的值没有编号-它们也没有任何特定的顺序-键执行相同的操作。您可以添加,删除和修改字典中的值。例如:电话簿。

http://www.sthurlow.com/python/lesson06/

Although this doesn’t cover sets, it is a good explanation of dicts and lists:

Lists are what they seem – a list of values. Each one of them is numbered, starting from zero – the first one is numbered zero, the second 1, the third 2, etc. You can remove values from the list, and add new values to the end. Example: Your many cats’ names.

Dictionaries are similar to what their name suggests – a dictionary. In a dictionary, you have an ‘index’ of words, and for each of them a definition. In python, the word is called a ‘key’, and the definition a ‘value’. The values in a dictionary aren’t numbered – tare similar to what their name suggests – a dictionary. In a dictionary, you have an ‘index’ of words, and for each of them a definition. The values in a dictionary aren’t numbered – they aren’t in any specific order, either – the key does the same thing. You can add, remove, and modify the values in dictionaries. Example: telephone book.

http://www.sthurlow.com/python/lesson06/


回答 6

对于C ++,我始终牢记以下流程图:在哪种情况下,我使用特定的STL容器?,所以我很好奇Python3是否也有类似的东西,但是我没有运气。

对于Python,需要记住的是:没有像C ++一样的Python标准。因此,不同的Python解释器(例如CPython,PyPy)可能会有巨大的差异。以下流程图适用于CPython。

另外,我发现包含以下数据结构到图中,没有什么好办法:bytesbyte arraystuplesnamed_tuplesChainMapCounter,和arrays

  • OrderedDict并且deque可以通过collections模块获得。
  • heapq可从heapq模块中获得
  • LifoQueueQueuePriorityQueue可以通过queue专门用于并发(线程)访问的模块获得。(也有一个multiprocessing.Queue可用的,但我不知道与它之间的区别,queue.Queue但是假设需要从进程进行并发访问时应该使用它。)
  • dictsetfrozen_set,和list被内置当然

对于任何人,如果您可以改善此答案并在各个方面提供更好的图表,我将不胜感激。随时欢迎。 流程图

PS:该图已通过yed制作。graphml文件在这里

For C++ I was always having this flow chart in mind: In which scenario do I use a particular STL container?, so I was curious if something similar is available for Python3 as well, but I had no luck.

What you need to keep in mind for Python is: There is no single Python standard as for C++. Hence there might be huge differences for different Python interpreters (e.g. CPython, PyPy). The following flow chart is for CPython.

Additionally I found no good way to incorporate the following data structures into the diagram: bytes, byte arrays, tuples, named_tuples, ChainMap, Counter, and arrays.

  • OrderedDict and deque are available via collections module.
  • heapq is available from the heapq module
  • LifoQueue, Queue, and PriorityQueue are available via the queue module which is designed for concurrent (threads) access. (There is also a multiprocessing.Queue available but I don’t know the differences to queue.Queue but would assume that it should be used when concurrent access from processes is needed.)
  • dict, set, frozen_set, and list are builtin of course

For anyone I would be grateful if you could improve this answer and provide a better diagram in every aspect. Feel free and welcome. flowchart

PS: the diagram has been made with yed. The graphml file is here


回答 7

结合列表字典集合,还有另一个有趣的python对象OrderedDicts

顺序词典与常规词典一样,但是它们记住项目插入的顺序。在有序字典上进行迭代时,将按照项的键首次添加的顺序返回项。

当您需要保留键的顺序(例如处理文档)时,OrderedDicts可能会很有用:通常需要文档中所有术语的向量表示。因此,使用OrderedDicts,您可以有效地验证术语是否已被阅读过,添加术语,提取术语,以及在所有操作之后可以提取它们的有序矢量表示。

In combination with lists, dicts and sets, there are also another interesting python objects, OrderedDicts.

Ordered dictionaries are just like regular dictionaries but they remember the order that items were inserted. When iterating over an ordered dictionary, the items are returned in the order their keys were first added.

OrderedDicts could be useful when you need to preserve the order of the keys, for example working with documents: It’s common to need the vector representation of all terms in a document. So using OrderedDicts you can efficiently verify if a term has been read before, add terms, extract terms, and after all the manipulations you can extract the ordered vector representation of them.


回答 8

列表就是它们的外观-值列表。它们中的每一个都从零开始编号-第一个从零开始编号,第二个为1,第三个为2,依此类推。您可以从列表中删除值,并在末尾添加新值。例如:您的许多猫的名字。

元组就像列表一样,但是您不能更改它们的值。首先给出的值是程序其余部分所保持的值。同样,每个值都从零开始编号,以方便参考。示例:一年中的月份名称。

字典类似于其名称所暗示的内容-字典。在字典中,您有单词的“索引”,并且每个单词都有一个定义。在python中,单词称为“键”,而定义称为“值”。字典中的值未编号-类似于其名称所建议的名称-字典。在字典中,您有单词的“索引”,并且每个单词都有一个定义。在python中,单词称为“键”,而定义称为“值”。字典中的值没有编号-它们也没有任何特定的顺序-键执行相同的操作。您可以添加,删除和修改字典中的值。例如:电话簿。

Lists are what they seem – a list of values. Each one of them is numbered, starting from zero – the first one is numbered zero, the second 1, the third 2, etc. You can remove values from the list, and add new values to the end. Example: Your many cats’ names.

Tuples are just like lists, but you can’t change their values. The values that you give it first up, are the values that you are stuck with for the rest of the program. Again, each value is numbered starting from zero, for easy reference. Example: the names of the months of the year.

Dictionaries are similar to what their name suggests – a dictionary. In a dictionary, you have an ‘index’ of words, and for each of them a definition. In python, the word is called a ‘key’, and the definition a ‘value’. The values in a dictionary aren’t numbered – tare similar to what their name suggests – a dictionary. In a dictionary, you have an ‘index’ of words, and for each of them a definition. In python, the word is called a ‘key’, and the definition a ‘value’. The values in a dictionary aren’t numbered – they aren’t in any specific order, either – the key does the same thing. You can add, remove, and modify the values in dictionaries. Example: telephone book.


回答 9

在使用它们时,我会详尽列出它们的方法,以供您参考:

class ContainerMethods:
    def __init__(self):
        self.list_methods_11 = {
                    'Add':{'append','extend','insert'},
                    'Subtract':{'pop','remove'},
                    'Sort':{'reverse', 'sort'},
                    'Search':{'count', 'index'},
                    'Entire':{'clear','copy'},
                            }
        self.tuple_methods_2 = {'Search':'count','index'}

        self.dict_methods_11 = {
                    'Views':{'keys', 'values', 'items'},
                    'Add':{'update'},
                    'Subtract':{'pop', 'popitem',},
                    'Extract':{'get','setdefault',},
                    'Entire':{ 'clear', 'copy','fromkeys'},
                            }
        self.set_methods_17 ={
                    'Add':{['add', 'update'],['difference_update','symmetric_difference_update','intersection_update']},
                    'Subtract':{'pop', 'remove','discard'},
                    'Relation':{'isdisjoint', 'issubset', 'issuperset'},
                    'operation':{'union' 'intersection','difference', 'symmetric_difference'}
                    'Entire':{'clear', 'copy'}}

When use them, I make an exhaustive cheatsheet of their methods for your reference:

class ContainerMethods:
    def __init__(self):
        self.list_methods_11 = {
                    'Add':{'append','extend','insert'},
                    'Subtract':{'pop','remove'},
                    'Sort':{'reverse', 'sort'},
                    'Search':{'count', 'index'},
                    'Entire':{'clear','copy'},
                            }
        self.tuple_methods_2 = {'Search':'count','index'}

        self.dict_methods_11 = {
                    'Views':{'keys', 'values', 'items'},
                    'Add':{'update'},
                    'Subtract':{'pop', 'popitem',},
                    'Extract':{'get','setdefault',},
                    'Entire':{ 'clear', 'copy','fromkeys'},
                            }
        self.set_methods_17 ={
                    'Add':{['add', 'update'],['difference_update','symmetric_difference_update','intersection_update']},
                    'Subtract':{'pop', 'remove','discard'},
                    'Relation':{'isdisjoint', 'issubset', 'issuperset'},
                    'operation':{'union' 'intersection','difference', 'symmetric_difference'}
                    'Entire':{'clear', 'copy'}}

回答 10

字典:Python字典的用法类似于哈希表,其键为索引,对象为值。

列表:列表用于将对象保存在数组中,该对象由该对象在数组中的位置索引。

集合:集合是具有函数的集合,这些函数可以判断集合中是否存在对象。

Dictionary: A python dictionary is used like a hash table with key as index and object as value.

List: A list is used for holding objects in an array indexed by position of that object in the array.

Set: A set is a collection with functions that can tell if an object is present or not present in the set.


Python的内置词典如何实现?

问题:Python的内置词典如何实现?

有谁知道python内置字典类型是如何实现的?我的理解是,这是某种哈希表,但我无法找到任何确定的答案。

Does anyone know how the built in dictionary type for python is implemented? My understanding is that it is some sort of hash table, but I haven’t been able to find any sort of definitive answer.


回答 0

这是我能够汇总的有关Python字典的所有内容(可能比任何人都想知道的要多;但是答案很全面)。

  • Python字典实现为哈希表
  • 哈希表必须允许哈希冲突,即,即使两个不同的键具有相同的哈希值,表的实现也必须具有明确插入和检索键和值对的策略。
  • Python dict使用开放式寻址来解决哈希冲突(如下所述)(请参阅dictobject.c:296-297)。
  • Python哈希表只是一个连续的内存块(有点像一个数组,因此您可以O(1)按索引进行查找)。
  • 表中的每个插槽只能存储一个条目。这个很重要。
  • 中的每个条目实际上是三个值的组合:<hash,key,value>。这被实现为C结构(请参见dictobject.h:51-56)。
  • 下图是Python哈希表的逻辑表示。在下图中,0, 1, ..., i, ...左侧是哈希表中插槽的索引(它们仅用于说明目的,与表显然没有一起存储!)。

    # Logical model of Python Hash table
    -+-----------------+
    0| <hash|key|value>|
    -+-----------------+
    1|      ...        |
    -+-----------------+
    .|      ...        |
    -+-----------------+
    i|      ...        |
    -+-----------------+
    .|      ...        |
    -+-----------------+
    n|      ...        |
    -+-----------------+
    
  • 初始化新字典时,它从8 个插槽开始。(见dictobject.h:49

  • 在向表中添加条目时,我们从某个槽开始i,该槽基于键的哈希值。CPython最初使用i = hash(key) & mask(where mask = PyDictMINSIZE - 1,但这并不重要)。请注意,i选中的初始插槽取决于密钥的哈希值
  • 如果该插槽为空,则将条目添加到该插槽(通过输入,我是说<hash|key|value>)。但是,如果那个插槽被占用!?最可能是因为另一个条目具有相同的哈希(哈希冲突!)
  • 如果该插槽被占用,则CPython(甚至PyPy)将插槽中条目的哈希值与键(即==比较而不是is比较)与要插入的当前条目的哈希值和键(dictobject.c)进行比较。 :337,344-345)。如果两者都匹配,则认为该条目已存在,放弃并继续下一个要插入的条目。如果哈希或密钥不匹配,则开始探测
  • 探测仅表示它按插槽搜索插槽以找到一个空插槽。从技术上讲,我们可以一个接一个地i+1, i+2, ...使用,然后使用第一个可用的(线性探测)。但是由于注释中详细解释的原因(请参阅dictobject.c:33-126),CPython使用了随机探测。在随机探测中,以伪随机顺序选择下一个时隙。该条目将添加到第一个空插槽。对于此讨论,用于选择下一个时隙的实际算法并不是很重要(有关探测的算法,请参见dictobject.c:33-126)。重要的是对插槽进行探测,直到找到第一个空插槽为止。
  • 查找也会发生相同的情况,只是从初始插槽i(其中i取决于键的哈希值)开始。如果哈希和密钥都与插槽中的条目不匹配,它将开始探测,直到找到具有匹配项的插槽。如果所有插槽均已耗尽,则报告失败。
  • 顺便说一句,dict如果三分之二满了,它将被调整大小。这样可以避免减慢查找速度。(见dictobject.h:64-65

注意:我对Python Dict的实现进行了研究,以回答我自己的问题:字典中的多个条目如何具有相同的哈希值。我在此处发布了对此回复的略作修改的版本,因为所有的研究也都与此问题相关。

Here is everything about Python dicts that I was able to put together (probably more than anyone would like to know; but the answer is comprehensive).

  • Python dictionaries are implemented as hash tables.

  • Hash tables must allow for hash collisions i.e. even if two distinct keys have the same hash value, the table’s implementation must have a strategy to insert and retrieve the key and value pairs unambiguously.

  • Python dict uses open addressing to resolve hash collisions (explained below) (see dictobject.c:296-297).

  • Python hash table is just a contiguous block of memory (sort of like an array, so you can do an O(1) lookup by index).

  • Each slot in the table can store one and only one entry. This is important.

  • Each entry in the table is actually a combination of the three values: < hash, key, value >. This is implemented as a C struct (see dictobject.h:51-56).

  • The figure below is a logical representation of a Python hash table. In the figure below, 0, 1, ..., i, ... on the left are indices of the slots in the hash table (they are just for illustrative purposes and are not stored along with the table obviously!).

      # Logical model of Python Hash table
      -+-----------------+
      0| <hash|key|value>|
      -+-----------------+
      1|      ...        |
      -+-----------------+
      .|      ...        |
      -+-----------------+
      i|      ...        |
      -+-----------------+
      .|      ...        |
      -+-----------------+
      n|      ...        |
      -+-----------------+
    
  • When a new dict is initialized it starts with 8 slots. (see dictobject.h:49)

  • When adding entries to the table, we start with some slot, i, that is based on the hash of the key. CPython initially uses i = hash(key) & mask (where mask = PyDictMINSIZE - 1, but that’s not really important). Just note that the initial slot, i, that is checked depends on the hash of the key.

  • If that slot is empty, the entry is added to the slot (by entry, I mean, <hash|key|value>). But what if that slot is occupied!? Most likely because another entry has the same hash (hash collision!)

  • If the slot is occupied, CPython (and even PyPy) compares the hash AND the key (by compare I mean == comparison not the is comparison) of the entry in the slot against the hash and key of the current entry to be inserted (dictobject.c:337,344-345) respectively. If both match, then it thinks the entry already exists, gives up and moves on to the next entry to be inserted. If either hash or the key don’t match, it starts probing.

  • Probing just means it searches the slots by slot to find an empty slot. Technically we could just go one by one, i+1, i+2, ... and use the first available one (that’s linear probing). But for reasons explained beautifully in the comments (see dictobject.c:33-126), CPython uses random probing. In random probing, the next slot is picked in a pseudo random order. The entry is added to the first empty slot. For this discussion, the actual algorithm used to pick the next slot is not really important (see dictobject.c:33-126 for the algorithm for probing). What is important is that the slots are probed until first empty slot is found.

  • The same thing happens for lookups, just starts with the initial slot i (where i depends on the hash of the key). If the hash and the key both don’t match the entry in the slot, it starts probing, until it finds a slot with a match. If all slots are exhausted, it reports a fail.

  • BTW, the dict will be resized if it is two-thirds full. This avoids slowing down lookups. (see dictobject.h:64-65)

NOTE: I did the research on Python Dict implementation in response to my own question about how multiple entries in a dict can have same hash values. I posted a slightly edited version of the response here because all the research is very relevant for this question as well.


回答 1

Python的内置词典如何实现?

这是短期类:

  • 它们是哈希表。(有关Python实现的详细信息,请参见下文。)
  • 从Python 3.6开始,新的布局和算法使它们
    • 通过插入密钥排序,以及
    • 占用更少的空间,
    • 几乎不牺牲性能。
  • 当字典共享密钥时(在特殊情况下),另一种优化方法可以节省空间。

从Python 3.6开始,有序方面是非官方的(给其他实现一个跟上的机会),但在Python 3.7中却是正式的

Python的字典是哈希表

长期以来,它完全像这样工作。Python将预分配8个空行,并使用哈希值确定键值对的粘贴位置。例如,如果密钥的哈希值以001结尾,则它将其保留在1(即2nd)索引中(如以下示例所示)。

   <hash>       <key>    <value>
     null        null    null
...010001    ffeb678c    633241c4 # addresses of the keys and values
     null        null    null
      ...         ...    ...

在64位体系结构上,每一行占用24个字节,在32位体系结构上占用12个字节。(请注意,列标题只是出于我们的目的而使用的标签-它们实际上并不存在于内存中。)

如果散列的结尾与预先存在的键的散列相同,则为冲突,然后它将键值对保留在不同的位置。

存储5个键值之后,添加另一个键值对时,哈希冲突的可能性太大,因此字典的大小增加了一倍。在64位处理中,在调整大小之前,我们有72个字节为空,而在此之后,由于有10个空行,我们浪费了240个字节。

这需要很多空间,但是查找时间是相当恒定的。密钥比较算法是计算哈希值,转到预期位置,比较密钥的ID-如果它们是同一对象,则它们相等。如果没有,那么比较的哈希值,如果他们一样,他们是不相等的。否则,我们最终比较键是否相等,如果相等,则返回值。最终的相等比较可能会很慢,但是较早的检查通常会缩短最终的比较,从而使查找非常快。

冲突会减慢速度,并且从理论上讲,攻击者可以使用哈希冲突来执行拒绝服务攻击,因此我们对哈希函数的初始化进行了随机化处理,以便为每个新的Python进程计算不同的哈希。

上述浪费的空间使我们修改了字典的实现,具有令人兴奋的新功能,即现在可以通过插入来对字典进行排序。

新的紧凑型哈希表

相反,我们首先为插入索引预分配一个数组。

由于我们的第一个键值对位于第二个插槽中,因此我们这样进行索引:

[null, 0, null, null, null, null, null, null]

并且我们的表只是按插入顺序填充:

   <hash>       <key>    <value>
...010001    ffeb678c    633241c4 
      ...         ...    ...

因此,当我们查找键时,我们使用哈希检查我们期望的位置(在这种情况下,我们直接转到数组的索引1),然后转到哈希表中的该索引(例如索引0) ),检查键是否相等(使用前面所述的相同算法),如果相等,则返回值。

我们保留了恒定的查找时间,在某些情况下速度损失较小,而在另一些情况下速度有所增加,其好处是,与现有的实现相比,它可以节省大量空间,并且可以保留插入顺序。唯一浪费的空间是索引数组中的空字节。

Raymond Hettinger 于2012年12月在python-dev上引入了此功能。它最终在Python 3.6中进入了CPython 。通过插入排序被认为是3.6的实现细节,以使Python的其他实现有机会赶上。

共用金钥

节省空间的另一种优化方法是共享密钥的实现。因此,我们没有重复使用共享密钥和密钥散列的冗余字典,而不是拥有占据所有空间的冗余字典。您可以这样想:

     hash         key    dict_0    dict_1    dict_2...
...010001    ffeb678c    633241c4  fffad420  ...
      ...         ...    ...       ...       ...

对于64位计算机,每个额外的字典每个键最多可以节省16个字节。

自定义对象和替代项的共享密钥

这些共享密钥字典旨在用于自定义对象__dict__。为了获得这种行为,我相信您需要__dict__在实例化下一个对象之前完成填充(请参阅PEP 412)。这意味着您应该在__init__或中分配所有属性__new__,否则可能无法节省空间。

但是,如果您知道执行时的所有属性,则__init__还可以提供__slots__对象,并保证__dict__根本不会创建该对象(如果在父级中不可用),甚至允许__dict__但保证您可以预见的属性是仍然存储在插槽中。有关更多信息__slots__请在此处查看我的答案

也可以看看:

How are Python’s Built In Dictionaries Implemented?

Here’s the short course:

  • They are hash tables. (See below for the specifics of Python’s implementation.)
  • A new layout and algorithm, as of Python 3.6, makes them
    • ordered by key insertion, and
    • take up less space,
    • at virtually no cost in performance.
  • Another optimization saves space when dicts share keys (in special cases).

The ordered aspect is unofficial as of Python 3.6 (to give other implementations a chance to keep up), but official in Python 3.7.

Python’s Dictionaries are Hash Tables

For a long time, it worked exactly like this. Python would preallocate 8 empty rows and use the hash to determine where to stick the key-value pair. For example, if the hash for the key ended in 001, it would stick it in the 1 (i.e. 2nd) index (like the example below.)

   <hash>       <key>    <value>
     null        null    null
...010001    ffeb678c    633241c4 # addresses of the keys and values
     null        null    null
      ...         ...    ...

Each row takes up 24 bytes on a 64 bit architecture, 12 on a 32 bit. (Note that the column headers are just labels for our purposes here – they don’t actually exist in memory.)

If the hash ended the same as a preexisting key’s hash, this is a collision, and then it would stick the key-value pair in a different location.

After 5 key-values are stored, when adding another key-value pair, the probability of hash collisions is too large, so the dictionary is doubled in size. In a 64 bit process, before the resize, we have 72 bytes empty, and after, we are wasting 240 bytes due to the 10 empty rows.

This takes a lot of space, but the lookup time is fairly constant. The key comparison algorithm is to compute the hash, go to the expected location, compare the key’s id – if they’re the same object, they’re equal. If not then compare the hash values, if they are not the same, they’re not equal. Else, then we finally compare keys for equality, and if they are equal, return the value. The final comparison for equality can be quite slow, but the earlier checks usually shortcut the final comparison, making the lookups very quick.

Collisions slow things down, and an attacker could theoretically use hash collisions to perform a denial of service attack, so we randomized the initialization of the hash function such that it computes different hashes for each new Python process.

The wasted space described above has led us to modify the implementation of dictionaries, with an exciting new feature that dictionaries are now ordered by insertion.

The New Compact Hash Tables

We start, instead, by preallocating an array for the index of the insertion.

Since our first key-value pair goes in the second slot, we index like this:

[null, 0, null, null, null, null, null, null]

And our table just gets populated by insertion order:

   <hash>       <key>    <value>
...010001    ffeb678c    633241c4 
      ...         ...    ...

So when we do a lookup for a key, we use the hash to check the position we expect (in this case, we go straight to index 1 of the array), then go to that index in the hash-table (e.g. index 0), check that the keys are equal (using the same algorithm described earlier), and if so, return the value.

We retain constant lookup time, with minor speed losses in some cases and gains in others, with the upsides that we save quite a lot of space over the pre-existing implementation and we retain insertion order. The only space wasted are the null bytes in the index array.

Raymond Hettinger introduced this on python-dev in December of 2012. It finally got into CPython in Python 3.6. Ordering by insertion was considered an implementation detail for 3.6 to allow other implementations of Python a chance to catch up.

Shared Keys

Another optimization to save space is an implementation that shares keys. Thus, instead of having redundant dictionaries that take up all of that space, we have dictionaries that reuse the shared keys and keys’ hashes. You can think of it like this:

     hash         key    dict_0    dict_1    dict_2...
...010001    ffeb678c    633241c4  fffad420  ...
      ...         ...    ...       ...       ...

For a 64 bit machine, this could save up to 16 bytes per key per extra dictionary.

Shared Keys for Custom Objects & Alternatives

These shared-key dicts are intended to be used for custom objects’ __dict__. To get this behavior, I believe you need to finish populating your __dict__ before you instantiate your next object (see PEP 412). This means you should assign all your attributes in the __init__ or __new__, else you might not get your space savings.

However, if you know all of your attributes at the time your __init__ is executed, you could also provide __slots__ for your object, and guarantee that __dict__ is not created at all (if not available in parents), or even allow __dict__ but guarantee that your foreseen attributes are stored in slots anyways. For more on __slots__, see my answer here.

See also:


回答 2

Python字典使用Open寻址在Beautiful代码中参考

注意! 如Wikipedia所述,开放式寻址(又名封闭式哈希)不应与其相反的开放式哈希混淆

开放式寻址意味着dict使用数组插槽,并且当对象的主要位置位于dict中时,使用“扰动”方案在同一数组中的不同索引处查找对象的位置,其中对象的哈希值发挥了作用。

Python Dictionaries use Open addressing (reference inside Beautiful code)

NB! Open addressing, a.k.a closed hashing should, as noted in Wikipedia, not be confused with its opposite open hashing!

Open addressing means that the dict uses array slots, and when an object’s primary position is taken in the dict, the object’s spot is sought at a different index in the same array, using a “perturbation” scheme, where the object’s hash value plays part.


如何漂亮地打印嵌套词典?

问题:如何漂亮地打印嵌套词典?

如何在Python中打印深度约为4的字典?我尝试使用进行漂亮的打印pprint(),但是没有用:

import pprint 
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(mydict)

我只是想"\t"为每个嵌套添加一个缩进(),以便获得如下所示的内容:

key1
    value1
    value2
    key2
       value1
       value2

等等

我怎样才能做到这一点?

How can I pretty print a dictionary with depth of ~4 in Python? I tried pretty printing with pprint(), but it did not work:

import pprint 
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(mydict)

I simply want an indentation ("\t") for each nesting, so that I get something like this:

key1
    value1
    value2
    key2
       value1
       value2

etc.

How can I do this?


回答 0

我不确定您希望格式看起来如何,但是可以从这样的函数开始:

def pretty(d, indent=0):
   for key, value in d.items():
      print('\t' * indent + str(key))
      if isinstance(value, dict):
         pretty(value, indent+1)
      else:
         print('\t' * (indent+1) + str(value))

I’m not sure how exactly you want the formatting to look like, but you could start with a function like this:

def pretty(d, indent=0):
   for key, value in d.items():
      print('\t' * indent + str(key))
      if isinstance(value, dict):
         pretty(value, indent+1)
      else:
         print('\t' * (indent+1) + str(value))

回答 1

我首先想到的是JSON序列化程序可能非常擅长嵌套字典,因此我会作弊并使用它:

>>> import json
>>> print json.dumps({'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}},
...                  sort_keys=True, indent=4)
{
    "a": 2,
    "b": {
        "x": 3,
        "y": {
            "t1": 4,
            "t2": 5
        }
    }
}

My first thought was that the JSON serializer is probably pretty good at nested dictionaries, so I’d cheat and use that:

>>> import json
>>> print json.dumps({'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}},
...                  sort_keys=True, indent=4)
{
    "a": 2,
    "b": {
        "x": 3,
        "y": {
            "t1": 4,
            "t2": 5
        }
    }
}

回答 2

您可以通过PyYAML尝试YAML。其输出可以微调。我建议从以下内容开始:

print yaml.dump(data, allow_unicode=True, default_flow_style=False)

结果非常可读;如果需要,也可以将其解析回Python。

编辑:

例:

>>> import yaml
>>> data = {'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}}
>>> print yaml.dump(data, default_flow_style=False)
a: 2
b:
  x: 3
  y:
    t1: 4
    t2: 5

You could try YAML via PyYAML. Its output can be fine-tuned. I’d suggest starting with the following:

print yaml.dump(data, allow_unicode=True, default_flow_style=False)

The result is very readable; it can be also parsed back to Python if needed.

Edit:

Example:

>>> import yaml
>>> data = {'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}}
>>> print yaml.dump(data, default_flow_style=False)
a: 2
b:
  x: 3
  y:
    t1: 4
    t2: 5

回答 3

到目前为止,我还没有看到任何漂亮的打印机至少以非常简单的格式模仿python解释器的输出,所以这是我的:

class Formatter(object):
    def __init__(self):
        self.types = {}
        self.htchar = '\t'
        self.lfchar = '\n'
        self.indent = 0
        self.set_formater(object, self.__class__.format_object)
        self.set_formater(dict, self.__class__.format_dict)
        self.set_formater(list, self.__class__.format_list)
        self.set_formater(tuple, self.__class__.format_tuple)

    def set_formater(self, obj, callback):
        self.types[obj] = callback

    def __call__(self, value, **args):
        for key in args:
            setattr(self, key, args[key])
        formater = self.types[type(value) if type(value) in self.types else object]
        return formater(self, value, self.indent)

    def format_object(self, value, indent):
        return repr(value)

    def format_dict(self, value, indent):
        items = [
            self.lfchar + self.htchar * (indent + 1) + repr(key) + ': ' +
            (self.types[type(value[key]) if type(value[key]) in self.types else object])(self, value[key], indent + 1)
            for key in value
        ]
        return '{%s}' % (','.join(items) + self.lfchar + self.htchar * indent)

    def format_list(self, value, indent):
        items = [
            self.lfchar + self.htchar * (indent + 1) + (self.types[type(item) if type(item) in self.types else object])(self, item, indent + 1)
            for item in value
        ]
        return '[%s]' % (','.join(items) + self.lfchar + self.htchar * indent)

    def format_tuple(self, value, indent):
        items = [
            self.lfchar + self.htchar * (indent + 1) + (self.types[type(item) if type(item) in self.types else object])(self, item, indent + 1)
            for item in value
        ]
        return '(%s)' % (','.join(items) + self.lfchar + self.htchar * indent)

要初始化它:

pretty = Formatter()

它可以支持为定义的类型添加格式器,您只需要为此创建一个函数,然后使用set_formater将其绑定到所需的类型即可:

from collections import OrderedDict

def format_ordereddict(self, value, indent):
    items = [
        self.lfchar + self.htchar * (indent + 1) +
        "(" + repr(key) + ', ' + (self.types[
            type(value[key]) if type(value[key]) in self.types else object
        ])(self, value[key], indent + 1) + ")"
        for key in value
    ]
    return 'OrderedDict([%s])' % (','.join(items) +
           self.lfchar + self.htchar * indent)
pretty.set_formater(OrderedDict, format_ordereddict)

出于历史原因,我保留了以前的漂亮打印机,它是一个函数而不是一个类,但是它们都可以以相同的方式使用,该类版本只允许更多功能:

def pretty(value, htchar='\t', lfchar='\n', indent=0):
    nlch = lfchar + htchar * (indent + 1)
    if type(value) is dict:
        items = [
            nlch + repr(key) + ': ' + pretty(value[key], htchar, lfchar, indent + 1)
            for key in value
        ]
        return '{%s}' % (','.join(items) + lfchar + htchar * indent)
    elif type(value) is list:
        items = [
            nlch + pretty(item, htchar, lfchar, indent + 1)
            for item in value
        ]
        return '[%s]' % (','.join(items) + lfchar + htchar * indent)
    elif type(value) is tuple:
        items = [
            nlch + pretty(item, htchar, lfchar, indent + 1)
            for item in value
        ]
        return '(%s)' % (','.join(items) + lfchar + htchar * indent)
    else:
        return repr(value)

要使用它:

>>> a = {'list':['a','b',1,2],'dict':{'a':1,2:'b'},'tuple':('a','b',1,2),'function':pretty,'unicode':u'\xa7',("tuple","key"):"valid"}
>>> a
{'function': <function pretty at 0x7fdf555809b0>, 'tuple': ('a', 'b', 1, 2), 'list': ['a', 'b', 1, 2], 'dict': {'a': 1, 2: 'b'}, 'unicode': u'\xa7', ('tuple', 'key'): 'valid'}
>>> print(pretty(a))
{
    'function': <function pretty at 0x7fdf555809b0>,
    'tuple': (
        'a',
        'b',
        1,
        2
    ),
    'list': [
        'a',
        'b',
        1,
        2
    ],
    'dict': {
        'a': 1,
        2: 'b'
    },
    'unicode': u'\xa7',
    ('tuple', 'key'): 'valid'
}

与其他版本相比:

  • 该解决方案直接寻找对象类型,因此您几乎可以打印几乎所有内容,而不仅是列表或字典。
  • 没有任何依赖性。
  • 一切都放在一个字符串中,因此您可以使用它进行任何操作。
  • 该类和函数已经过测试,可与Python 2.7和3.4一起使用。
  • 您可以在其中包含所有类型的对象,这是它们的表示形式,而不是放入结果中的它们的内容(因此,字符串带有引号,Unicode字符串可以完全表示…)。
  • 使用类版本,可以为所需的每种对象类型添加格式,也可以为已定义的对象类型更改格式。
  • 键可以是任何有效类型。
  • 缩进和换行符可以根据我们的需要进行更改。
  • 字典,列表和元组很漂亮。

As of what have been done, I don’t see any pretty printer that at least mimics the output of the python interpreter with very simple formatting so here’s mine :

class Formatter(object):
    def __init__(self):
        self.types = {}
        self.htchar = '\t'
        self.lfchar = '\n'
        self.indent = 0
        self.set_formater(object, self.__class__.format_object)
        self.set_formater(dict, self.__class__.format_dict)
        self.set_formater(list, self.__class__.format_list)
        self.set_formater(tuple, self.__class__.format_tuple)

    def set_formater(self, obj, callback):
        self.types[obj] = callback

    def __call__(self, value, **args):
        for key in args:
            setattr(self, key, args[key])
        formater = self.types[type(value) if type(value) in self.types else object]
        return formater(self, value, self.indent)

    def format_object(self, value, indent):
        return repr(value)

    def format_dict(self, value, indent):
        items = [
            self.lfchar + self.htchar * (indent + 1) + repr(key) + ': ' +
            (self.types[type(value[key]) if type(value[key]) in self.types else object])(self, value[key], indent + 1)
            for key in value
        ]
        return '{%s}' % (','.join(items) + self.lfchar + self.htchar * indent)

    def format_list(self, value, indent):
        items = [
            self.lfchar + self.htchar * (indent + 1) + (self.types[type(item) if type(item) in self.types else object])(self, item, indent + 1)
            for item in value
        ]
        return '[%s]' % (','.join(items) + self.lfchar + self.htchar * indent)

    def format_tuple(self, value, indent):
        items = [
            self.lfchar + self.htchar * (indent + 1) + (self.types[type(item) if type(item) in self.types else object])(self, item, indent + 1)
            for item in value
        ]
        return '(%s)' % (','.join(items) + self.lfchar + self.htchar * indent)

To initialize it :

pretty = Formatter()

It can support the addition of formatters for defined types, you simply need to make a function for that like this one and bind it to the type you want with set_formater :

from collections import OrderedDict

def format_ordereddict(self, value, indent):
    items = [
        self.lfchar + self.htchar * (indent + 1) +
        "(" + repr(key) + ', ' + (self.types[
            type(value[key]) if type(value[key]) in self.types else object
        ])(self, value[key], indent + 1) + ")"
        for key in value
    ]
    return 'OrderedDict([%s])' % (','.join(items) +
           self.lfchar + self.htchar * indent)
pretty.set_formater(OrderedDict, format_ordereddict)

For historical reasons, I keep the previous pretty printer which was a function instead of a class, but they both can be used the same way, the class version simply permit much more :

def pretty(value, htchar='\t', lfchar='\n', indent=0):
    nlch = lfchar + htchar * (indent + 1)
    if type(value) is dict:
        items = [
            nlch + repr(key) + ': ' + pretty(value[key], htchar, lfchar, indent + 1)
            for key in value
        ]
        return '{%s}' % (','.join(items) + lfchar + htchar * indent)
    elif type(value) is list:
        items = [
            nlch + pretty(item, htchar, lfchar, indent + 1)
            for item in value
        ]
        return '[%s]' % (','.join(items) + lfchar + htchar * indent)
    elif type(value) is tuple:
        items = [
            nlch + pretty(item, htchar, lfchar, indent + 1)
            for item in value
        ]
        return '(%s)' % (','.join(items) + lfchar + htchar * indent)
    else:
        return repr(value)

To use it :

>>> a = {'list':['a','b',1,2],'dict':{'a':1,2:'b'},'tuple':('a','b',1,2),'function':pretty,'unicode':u'\xa7',("tuple","key"):"valid"}
>>> a
{'function': <function pretty at 0x7fdf555809b0>, 'tuple': ('a', 'b', 1, 2), 'list': ['a', 'b', 1, 2], 'dict': {'a': 1, 2: 'b'}, 'unicode': u'\xa7', ('tuple', 'key'): 'valid'}
>>> print(pretty(a))
{
    'function': <function pretty at 0x7fdf555809b0>,
    'tuple': (
        'a',
        'b',
        1,
        2
    ),
    'list': [
        'a',
        'b',
        1,
        2
    ],
    'dict': {
        'a': 1,
        2: 'b'
    },
    'unicode': u'\xa7',
    ('tuple', 'key'): 'valid'
}

Compared to other versions :

  • This solution looks directly for object type, so you can pretty print almost everything, not only list or dict.
  • Doesn’t have any dependancy.
  • Everything is put inside a string, so you can do whatever you want with it.
  • The class and the function has been tested and works with Python 2.7 and 3.4.
  • You can have all type of objects inside, this is their representations and not theirs contents that being put in the result (so string have quotes, Unicode string are fully represented …).
  • With the class version, you can add formatting for every object type you want or change them for already defined ones.
  • key can be of any valid type.
  • Indent and Newline character can be changed for everything we’d like.
  • Dict, List and Tuples are pretty printed.

回答 4

通过这种方式,您可以以漂亮的方式打印它,例如您的词典名称是yasin

import json

print (json.dumps(yasin, indent=2))

By this way you can print it in pretty way for example your dictionary name is yasin

import json

print (json.dumps(yasin, indent=2))

回答 5

另一种选择yapf

from pprint import pformat
from yapf.yapflib.yapf_api import FormatCode

dict_example = {'1': '1', '2': '2', '3': [1, 2, 3, 4, 5], '4': {'1': '1', '2': '2', '3': [1, 2, 3, 4, 5]}}
dict_string = pformat(dict_example)
formatted_code, _ = FormatCode(dict_string)

print(formatted_code)

输出:

{
    '1': '1',
    '2': '2',
    '3': [1, 2, 3, 4, 5],
    '4': {
        '1': '1',
        '2': '2',
        '3': [1, 2, 3, 4, 5]
    }
}

Another option with yapf:

from pprint import pformat
from yapf.yapflib.yapf_api import FormatCode

dict_example = {'1': '1', '2': '2', '3': [1, 2, 3, 4, 5], '4': {'1': '1', '2': '2', '3': [1, 2, 3, 4, 5]}}
dict_string = pformat(dict_example)
formatted_code, _ = FormatCode(dict_string)

print(formatted_code)

Output:

{
    '1': '1',
    '2': '2',
    '3': [1, 2, 3, 4, 5],
    '4': {
        '1': '1',
        '2': '2',
        '3': [1, 2, 3, 4, 5]
    }
}

回答 6

正如其他人所发表的,您可以使用递归/ dfs打印嵌套的字典数据,如果它是字典,则可以递归调用;否则打印数据。

def print_json(data):
    if type(data) == dict:
            for k, v in data.items():
                    print k
                    print_json(v)
    else:
            print data

As others have posted, you can use recursion/dfs to print the nested dictionary data and call recursively if it is a dictionary; otherwise print the data.

def print_json(data):
    if type(data) == dict:
            for k, v in data.items():
                    print k
                    print_json(v)
    else:
            print data

回答 7

pout可以漂亮地打印出您扔给它的任何东西,例如(data从另一个答案中借用):

data = {'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}}
pout.vs(data)

将导致输出打印到屏幕上,例如:

{
    'a': 2,
    'b':
    {
        'y':
        {
            't2': 5,
            't1': 4
        },
        'x': 3
    }
}

或者,您可以返回对象的格式化字符串输出:

v = pout.s(data)

它的主要用例是用于调试,因此它不会阻塞对象实例或任何事物,它可以处理unicode输出,可以在python 2.7和3中工作。

披露:我是pout的作者和维护者。

pout can pretty print anything you throw at it, for example (borrowing data from another answer):

data = {'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}}
pout.vs(data)

would result in output printed to the screen like:

{
    'a': 2,
    'b':
    {
        'y':
        {
            't2': 5,
            't1': 4
        },
        'x': 3
    }
}

or you can return the formatted string output of your object:

v = pout.s(data)

Its primary use case is for debugging so it doesn’t choke on object instances or anything and it handles unicode output as you would expect, works in python 2.7 and 3.

disclosure: I’m the author and maintainer of pout.


回答 8

最有效的方法之一就是使用已经构建的pprint模块。

定义打印深度所需的参数如您所料 depth

import pprint
pp = pprint.PrettyPrinter(depth=4)
pp.pprint(mydict)

而已 !

One of the most pythonic ways for that is to use the already build pprint module.

The argument that you need for define the print depth is as you may expect depth

import pprint
pp = pprint.PrettyPrinter(depth=4)
pp.pprint(mydict)

That’s it !


回答 9

我听了sth的回答,并对其做了一些修改,以适应嵌套字典和列表的需求:

def pretty(d, indent=0):
    if isinstance(d, dict):
        for key, value in d.iteritems():
            print '\t' * indent + str(key)
            if isinstance(value, dict) or isinstance(value, list):
                pretty(value, indent+1)
            else:
                print '\t' * (indent+1) + str(value)
    elif isinstance(d, list):
        for item in d:
            if isinstance(item, dict) or isinstance(item, list):
                pretty(item, indent+1)
            else:
                print '\t' * (indent+1) + str(item)
    else:
        pass

然后给我的输出像:

>>> 
xs:schema
    @xmlns:xs
        http://www.w3.org/2001/XMLSchema
    xs:redefine
        @schemaLocation
            base.xsd
        xs:complexType
            @name
                Extension
            xs:complexContent
                xs:restriction
                    @base
                        Extension
                    xs:sequence
                        xs:element
                            @name
                                Policy
                            @minOccurs
                                1
                            xs:complexType
                                xs:sequence
                                    xs:element
                                            ...

I took sth’s answer and modified it slightly to fit my needs of a nested dictionaries and lists:

def pretty(d, indent=0):
    if isinstance(d, dict):
        for key, value in d.iteritems():
            print '\t' * indent + str(key)
            if isinstance(value, dict) or isinstance(value, list):
                pretty(value, indent+1)
            else:
                print '\t' * (indent+1) + str(value)
    elif isinstance(d, list):
        for item in d:
            if isinstance(item, dict) or isinstance(item, list):
                pretty(item, indent+1)
            else:
                print '\t' * (indent+1) + str(item)
    else:
        pass

Which then gives me output like:

>>> 
xs:schema
    @xmlns:xs
        http://www.w3.org/2001/XMLSchema
    xs:redefine
        @schemaLocation
            base.xsd
        xs:complexType
            @name
                Extension
            xs:complexContent
                xs:restriction
                    @base
                        Extension
                    xs:sequence
                        xs:element
                            @name
                                Policy
                            @minOccurs
                                1
                            xs:complexType
                                xs:sequence
                                    xs:element
                                            ...

回答 10

……我下沉那很漂亮;)

def pretty(d, indent=0):
    for key, value in d.iteritems():
        if isinstance(value, dict):
            print '\t' * indent + (("%30s: {\n") % str(key).upper())
            pretty(value, indent+1)
            print '\t' * indent + ' ' * 32 + ('} # end of %s #\n' % str(key).upper())
        elif isinstance(value, list):
            for val in value:
                print '\t' * indent + (("%30s: [\n") % str(key).upper())
                pretty(val, indent+1)
                print '\t' * indent + ' ' * 32 + ('] # end of %s #\n' % str(key).upper())
        else:
            print '\t' * indent + (("%30s: %s") % (str(key).upper(),str(value)))

Sth, i sink that’s pretty ;)

def pretty(d, indent=0):
    for key, value in d.iteritems():
        if isinstance(value, dict):
            print '\t' * indent + (("%30s: {\n") % str(key).upper())
            pretty(value, indent+1)
            print '\t' * indent + ' ' * 32 + ('} # end of %s #\n' % str(key).upper())
        elif isinstance(value, list):
            for val in value:
                print '\t' * indent + (("%30s: [\n") % str(key).upper())
                pretty(val, indent+1)
                print '\t' * indent + ' ' * 32 + ('] # end of %s #\n' % str(key).upper())
        else:
            print '\t' * indent + (("%30s: %s") % (str(key).upper(),str(value)))

回答 11

This class prints out a complex nested dictionary with sub dictionaries and sub lists.  
##
## Recursive class to parse and print complex nested dictionary
##

class NestedDictionary(object):
    def __init__(self,value):
        self.value=value

    def print(self,depth):
        spacer="--------------------"
        if type(self.value)==type(dict()):
            for kk, vv in self.value.items():
                if (type(vv)==type(dict())):
                    print(spacer[:depth],kk)
                    vvv=(NestedDictionary(vv))
                    depth=depth+3
                    vvv.print(depth)
                    depth=depth-3
                else:
                    if (type(vv)==type(list())):
                        for i in vv:
                            vvv=(NestedDictionary(i))
                            depth=depth+3
                            vvv.print(depth)
                            depth=depth-3
                    else:
                        print(spacer[:depth],kk,vv) 

##
## Instatiate and execute - this prints complex nested dictionaries
## with sub dictionaries and sub lists
## 'something' is a complex nested dictionary

MyNest=NestedDictionary(weather_com_result)
MyNest.print(0)
This class prints out a complex nested dictionary with sub dictionaries and sub lists.  
##
## Recursive class to parse and print complex nested dictionary
##

class NestedDictionary(object):
    def __init__(self,value):
        self.value=value

    def print(self,depth):
        spacer="--------------------"
        if type(self.value)==type(dict()):
            for kk, vv in self.value.items():
                if (type(vv)==type(dict())):
                    print(spacer[:depth],kk)
                    vvv=(NestedDictionary(vv))
                    depth=depth+3
                    vvv.print(depth)
                    depth=depth-3
                else:
                    if (type(vv)==type(list())):
                        for i in vv:
                            vvv=(NestedDictionary(i))
                            depth=depth+3
                            vvv.print(depth)
                            depth=depth-3
                    else:
                        print(spacer[:depth],kk,vv) 

##
## Instatiate and execute - this prints complex nested dictionaries
## with sub dictionaries and sub lists
## 'something' is a complex nested dictionary

MyNest=NestedDictionary(weather_com_result)
MyNest.print(0)

回答 12

我写了这个简单的代码来打印Python中json对象的一般结构。

def getstructure(data, tab = 0):
    if type(data) is dict:
        print ' '*tab + '{' 
        for key in data:
            print ' '*tab + '  ' + key + ':'
            getstructure(data[key], tab+4)
        print ' '*tab + '}'         
    elif type(data) is list and len(data) > 0:
        print ' '*tab + '['
        getstructure(data[0], tab+4)
        print ' '*tab + '  ...'
        print ' '*tab + ']'

以下数据的结果

a = {'list':['a','b',1,2],'dict':{'a':1,2:'b'},'tuple':('a','b',1,2),'function':'p','unicode':u'\xa7',("tuple","key"):"valid"}
getstructure(a)

非常紧凑,看起来像这样:

{
  function:
  tuple:
  list:
    [
      ...
    ]
  dict:
    {
      a:
      2:
    }
  unicode:
  ('tuple', 'key'):
}

I wrote this simple code to print the general structure of a json object in Python.

def getstructure(data, tab = 0):
    if type(data) is dict:
        print ' '*tab + '{' 
        for key in data:
            print ' '*tab + '  ' + key + ':'
            getstructure(data[key], tab+4)
        print ' '*tab + '}'         
    elif type(data) is list and len(data) > 0:
        print ' '*tab + '['
        getstructure(data[0], tab+4)
        print ' '*tab + '  ...'
        print ' '*tab + ']'

the result for the following data

a = {'list':['a','b',1,2],'dict':{'a':1,2:'b'},'tuple':('a','b',1,2),'function':'p','unicode':u'\xa7',("tuple","key"):"valid"}
getstructure(a)

is very compact and looks like this:

{
  function:
  tuple:
  list:
    [
      ...
    ]
  dict:
    {
      a:
      2:
    }
  unicode:
  ('tuple', 'key'):
}

回答 13

我本人是一位相对较新的python新手,但过去两周来我一直在使用嵌套字典,而这正是我想出的。

您应该尝试使用堆栈。使根字典中的键成为列表的列表:

stack = [ root.keys() ]     # Result: [ [root keys] ]

从最后到第一个相反的顺序,查找字典中的每个键,以查看其值是否也是字典。如果不是,请打印密钥,然后将其删除。但是,如果键的值字典,请打印键,然后将该值的键附加到堆栈的末尾,并以相同的方式开始处理该列表,对每个新的键列表进行递归重复。

如果每个列表中第二个键的值是一个字典,则在几轮之后您将得到类似以下的内容:

[['key 1','key 2'],['key 2.1','key 2.2'],['key 2.2.1','key 2.2.2'],[`etc.`]]

这种方法的好处是缩进量只是\t堆栈长度的倍数:

indent = "\t" * len(stack)

缺点是,为了检查每个键,您需要将其散列到相关的子词典,尽管这可以通过列表理解和简单的for循环轻松地进行处理:

path = [li[-1] for li in stack]
# The last key of every list of keys in the stack

sub = root
for p in path:
    sub = sub[p]


if type(sub) == dict:
    stack.append(sub.keys()) # And so on

请注意,这种方法将需要您清理尾随的空列表,删除任何列表中的最后一个键,然后再删除一个空列表(当然这可能会创建另一个空列表,依此类推)。

还有其他方法可以实现此方法,但希望它为您提供了如何执行此操作的基本思路。

编辑:如果您不想遍历所有内容,该pprint模块将以一种不错的格式打印嵌套字典。

I’m a relative python newbie myself but I’ve been working with nested dictionaries for the past couple weeks and this is what I had came up with.

You should try using a stack. Make the keys from the root dictionary into a list of a list:

stack = [ root.keys() ]     # Result: [ [root keys] ]

Going in reverse order from last to first, lookup each key in the dictionary to see if its value is (also) a dictionary. If not, print the key then delete it. However if the value for the key is a dictionary, print the key then append the keys for that value to the end of the stack, and start processing that list in the same way, repeating recursively for each new list of keys.

If the value for the second key in each list were a dictionary you would have something like this after several rounds:

[['key 1','key 2'],['key 2.1','key 2.2'],['key 2.2.1','key 2.2.2'],[`etc.`]]

The upside to this approach is that the indent is just \t times the length of the stack:

indent = "\t" * len(stack)

The downside is that in order to check each key you need to hash through to the relevant sub-dictionary, though this can be handled easily with a list comprehension and a simple for loop:

path = [li[-1] for li in stack]
# The last key of every list of keys in the stack

sub = root
for p in path:
    sub = sub[p]


if type(sub) == dict:
    stack.append(sub.keys()) # And so on

Be aware that this approach will require you to cleanup trailing empty lists, and to delete the last key in any list followed by an empty list (which of course may create another empty list, and so on).

There are other ways to implement this approach but hopefully this gives you a basic idea of how to do it.

EDIT: If you don’t want to go through all that, the pprint module prints nested dictionaries in a nice format.


回答 14

这是我根据某人的评论编写的函数。它的工作原理与带缩进的json.dumps相同,但是我使用的是制表符而不是缩进的空间。在Python 3.2+中,您可以直接将缩进指定为’\ t’,但在2.7中则不能。

def pretty_dict(d):
    def pretty(d, indent):
        for i, (key, value) in enumerate(d.iteritems()):
            if isinstance(value, dict):
                print '{0}"{1}": {{'.format( '\t' * indent, str(key))
                pretty(value, indent+1)
                if i == len(d)-1:
                    print '{0}}}'.format( '\t' * indent)
                else:
                    print '{0}}},'.format( '\t' * indent)
            else:
                if i == len(d)-1:
                    print '{0}"{1}": "{2}"'.format( '\t' * indent, str(key), value)
                else:
                    print '{0}"{1}": "{2}",'.format( '\t' * indent, str(key), value)
    print '{'
    pretty(d,indent=1)
    print '}'

例如:

>>> dict_var = {'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}}
>>> pretty_dict(dict_var)
{
    "a": "2",
    "b": {
        "y": {
            "t2": "5",
            "t1": "4"
        },
        "x": "3"
    }
}

Here’s a function I wrote based on what sth’s comment. It’s works the same as json.dumps with indent, but I’m using tabs instead of space for indents. In Python 3.2+ you can specify indent to be a ‘\t’ directly, but not in 2.7.

def pretty_dict(d):
    def pretty(d, indent):
        for i, (key, value) in enumerate(d.iteritems()):
            if isinstance(value, dict):
                print '{0}"{1}": {{'.format( '\t' * indent, str(key))
                pretty(value, indent+1)
                if i == len(d)-1:
                    print '{0}}}'.format( '\t' * indent)
                else:
                    print '{0}}},'.format( '\t' * indent)
            else:
                if i == len(d)-1:
                    print '{0}"{1}": "{2}"'.format( '\t' * indent, str(key), value)
                else:
                    print '{0}"{1}": "{2}",'.format( '\t' * indent, str(key), value)
    print '{'
    pretty(d,indent=1)
    print '}'

Ex:

>>> dict_var = {'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}}
>>> pretty_dict(dict_var)
{
    "a": "2",
    "b": {
        "y": {
            "t2": "5",
            "t1": "4"
        },
        "x": "3"
    }
}

回答 15

这将打印任何形式的嵌套字典,同时跟踪整个“父”字典。

dicList = list()

def prettierPrint(dic, dicList):
count = 0
for key, value in dic.iteritems():
    count+=1
    if str(value) == 'OrderedDict()':
        value = None
    if not isinstance(value, dict):
        print str(key) + ": " + str(value)
        print str(key) + ' was found in the following path:',
        print dicList
        print '\n'
    elif isinstance(value, dict):
        dicList.append(key)
        prettierPrint(value, dicList)
    if dicList:
         if count == len(dic):
             dicList.pop()
             count = 0

prettierPrint(dicExample, dicList)

这是根据不同格式进行打印的良好起点,例如OP中指定的格式。您真正需要做的就是围绕“ 打印”块进行操作。请注意,它会查看该值是否为“ OrderedDict()”。根据您是否使用的是Container数据类型Collections中的某些内容,应进行此类故障保护,以使elif块由于其名称而不会将其视为其他字典。截至目前,示例字典

example_dict = {'key1': 'value1',
            'key2': 'value2',
            'key3': {'key3a': 'value3a'},
            'key4': {'key4a': {'key4aa': 'value4aa',
                               'key4ab': 'value4ab',
                               'key4ac': 'value4ac'},
                     'key4b': 'value4b'}

将打印

key3a: value3a
key3a was found in the following path: ['key3']

key2: value2
key2 was found in the following path: []

key1: value1
key1 was found in the following path: []

key4ab: value4ab
key4ab was found in the following path: ['key4', 'key4a']

key4ac: value4ac
key4ac was found in the following path: ['key4', 'key4a']

key4aa: value4aa
key4aa was found in the following path: ['key4', 'key4a']

key4b: value4b
key4b was found in the following path: ['key4']

〜更改代码以适合问题的格式〜

lastDict = list()
dicList = list()
def prettierPrint(dic, dicList):
    global lastDict
    count = 0
    for key, value in dic.iteritems():
        count+=1
        if str(value) == 'OrderedDict()':
            value = None
        if not isinstance(value, dict):
            if lastDict == dicList:
                sameParents = True
            else:
                sameParents = False

            if dicList and sameParents is not True:
                spacing = ' ' * len(str(dicList))
                print dicList
                print spacing,
                print str(value)

            if dicList and sameParents is True:
                print spacing,
                print str(value)
            lastDict = list(dicList)

        elif isinstance(value, dict):
            dicList.append(key)
            prettierPrint(value, dicList)

        if dicList:
             if count == len(dic):
                 dicList.pop()
                 count = 0

使用相同的示例代码,它将打印以下内容:

['key3']
         value3a
['key4', 'key4a']
                  value4ab
                  value4ac
                  value4aa
['key4']
         value4b

这不正是什么要求在OP。不同之处在于,仍然打印出parent ^ n,而不是不打印并替换为空白。要获得OP的格式,您需要执行以下操作:将dicListlastDict进行迭代比较。您可以通过一个新的字典和dicList的内容复制到它做到这一点,检查,如果在复制的字典是一样的在lastDict,和-如果它是-书写空白到使用字符串倍增器功能定位。

Here’s something that will print any sort of nested dictionary, while keeping track of the “parent” dictionaries along the way.

dicList = list()

def prettierPrint(dic, dicList):
count = 0
for key, value in dic.iteritems():
    count+=1
    if str(value) == 'OrderedDict()':
        value = None
    if not isinstance(value, dict):
        print str(key) + ": " + str(value)
        print str(key) + ' was found in the following path:',
        print dicList
        print '\n'
    elif isinstance(value, dict):
        dicList.append(key)
        prettierPrint(value, dicList)
    if dicList:
         if count == len(dic):
             dicList.pop()
             count = 0

prettierPrint(dicExample, dicList)

This is a good starting point for printing according to different formats, like the one specified in OP. All you really need to do is operations around the Print blocks. Note that it looks to see if the value is ‘OrderedDict()’. Depending on whether you’re using something from Container datatypes Collections, you should make these sort of fail-safes so the elif block doesn’t see it as an additional dictionary due to its name. As of now, an example dictionary like

example_dict = {'key1': 'value1',
            'key2': 'value2',
            'key3': {'key3a': 'value3a'},
            'key4': {'key4a': {'key4aa': 'value4aa',
                               'key4ab': 'value4ab',
                               'key4ac': 'value4ac'},
                     'key4b': 'value4b'}

will print

key3a: value3a
key3a was found in the following path: ['key3']

key2: value2
key2 was found in the following path: []

key1: value1
key1 was found in the following path: []

key4ab: value4ab
key4ab was found in the following path: ['key4', 'key4a']

key4ac: value4ac
key4ac was found in the following path: ['key4', 'key4a']

key4aa: value4aa
key4aa was found in the following path: ['key4', 'key4a']

key4b: value4b
key4b was found in the following path: ['key4']

~altering code to fit the question’s format~

lastDict = list()
dicList = list()
def prettierPrint(dic, dicList):
    global lastDict
    count = 0
    for key, value in dic.iteritems():
        count+=1
        if str(value) == 'OrderedDict()':
            value = None
        if not isinstance(value, dict):
            if lastDict == dicList:
                sameParents = True
            else:
                sameParents = False

            if dicList and sameParents is not True:
                spacing = ' ' * len(str(dicList))
                print dicList
                print spacing,
                print str(value)

            if dicList and sameParents is True:
                print spacing,
                print str(value)
            lastDict = list(dicList)

        elif isinstance(value, dict):
            dicList.append(key)
            prettierPrint(value, dicList)

        if dicList:
             if count == len(dic):
                 dicList.pop()
                 count = 0

Using the same example code, it will print the following:

['key3']
         value3a
['key4', 'key4a']
                  value4ab
                  value4ac
                  value4aa
['key4']
         value4b

This isn’t exactly what is requested in OP. The difference is that a parent^n is still printed, instead of being absent and replaced with white-space. To get to OP’s format, you’ll need to do something like the following: iteratively compare dicList with the lastDict. You can do this by making a new dictionary and copying dicList’s content to it, checking if i in the copied dictionary is the same as i in lastDict, and — if it is — writing whitespace to that i position using the string multiplier function.


回答 16

从此链接

def prnDict(aDict, br='\n', html=0,
            keyAlign='l',   sortKey=0,
            keyPrefix='',   keySuffix='',
            valuePrefix='', valueSuffix='',
            leftMargin=0,   indent=1 ):
    '''
return a string representive of aDict in the following format:
    {
     key1: value1,
     key2: value2,
     ...
     }

Spaces will be added to the keys to make them have same width.

sortKey: set to 1 if want keys sorted;
keyAlign: either 'l' or 'r', for left, right align, respectively.
keyPrefix, keySuffix, valuePrefix, valueSuffix: The prefix and
   suffix to wrap the keys or values. Good for formatting them
   for html document(for example, keyPrefix='<b>', keySuffix='</b>'). 
   Note: The keys will be padded with spaces to have them
         equally-wide. The pre- and suffix will be added OUTSIDE
         the entire width.
html: if set to 1, all spaces will be replaced with '&nbsp;', and
      the entire output will be wrapped with '<code>' and '</code>'.
br: determine the carriage return. If html, it is suggested to set
    br to '<br>'. If you want the html source code eazy to read,
    set br to '<br>\n'

version: 04b52
author : Runsun Pan
require: odict() # an ordered dict, if you want the keys sorted.
         Dave Benjamin 
         http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/161403
    '''

    if aDict:

        #------------------------------ sort key
        if sortKey:
            dic = aDict.copy()
            keys = dic.keys()
            keys.sort()
            aDict = odict()
            for k in keys:
                aDict[k] = dic[k]

        #------------------- wrap keys with ' ' (quotes) if str
        tmp = ['{']
        ks = [type(x)==str and "'%s'"%x or x for x in aDict.keys()]

        #------------------- wrap values with ' ' (quotes) if str
        vs = [type(x)==str and "'%s'"%x or x for x in aDict.values()] 

        maxKeyLen = max([len(str(x)) for x in ks])

        for i in range(len(ks)):

            #-------------------------- Adjust key width
            k = {1            : str(ks[i]).ljust(maxKeyLen),
                 keyAlign=='r': str(ks[i]).rjust(maxKeyLen) }[1]

            v = vs[i]        
            tmp.append(' '* indent+ '%s%s%s:%s%s%s,' %(
                        keyPrefix, k, keySuffix,
                        valuePrefix,v,valueSuffix))

        tmp[-1] = tmp[-1][:-1] # remove the ',' in the last item
        tmp.append('}')

        if leftMargin:
          tmp = [ ' '*leftMargin + x for x in tmp ]

        if html:
            return '<code>%s</code>' %br.join(tmp).replace(' ','&nbsp;')
        else:
            return br.join(tmp)     
    else:
        return '{}'

'''
Example:

>>> a={'C': 2, 'B': 1, 'E': 4, (3, 5): 0}

>>> print prnDict(a)
{
 'C'   :2,
 'B'   :1,
 'E'   :4,
 (3, 5):0
}

>>> print prnDict(a, sortKey=1)
{
 'B'   :1,
 'C'   :2,
 'E'   :4,
 (3, 5):0
}

>>> print prnDict(a, keyPrefix="<b>", keySuffix="</b>")
{
 <b>'C'   </b>:2,
 <b>'B'   </b>:1,
 <b>'E'   </b>:4,
 <b>(3, 5)</b>:0
}

>>> print prnDict(a, html=1)
<code>{
&nbsp;'C'&nbsp;&nbsp;&nbsp;:2,
&nbsp;'B'&nbsp;&nbsp;&nbsp;:1,
&nbsp;'E'&nbsp;&nbsp;&nbsp;:4,
&nbsp;(3,&nbsp;5):0
}</code>

>>> b={'car': [6, 6, 12], 'about': [15, 9, 6], 'bookKeeper': [9, 9, 15]}

>>> print prnDict(b, sortKey=1)
{
 'about'     :[15, 9, 6],
 'bookKeeper':[9, 9, 15],
 'car'       :[6, 6, 12]
}

>>> print prnDict(b, keyAlign="r")
{
        'car':[6, 6, 12],
      'about':[15, 9, 6],
 'bookKeeper':[9, 9, 15]
}
'''

From this link:

def prnDict(aDict, br='\n', html=0,
            keyAlign='l',   sortKey=0,
            keyPrefix='',   keySuffix='',
            valuePrefix='', valueSuffix='',
            leftMargin=0,   indent=1 ):
    '''
return a string representive of aDict in the following format:
    {
     key1: value1,
     key2: value2,
     ...
     }

Spaces will be added to the keys to make them have same width.

sortKey: set to 1 if want keys sorted;
keyAlign: either 'l' or 'r', for left, right align, respectively.
keyPrefix, keySuffix, valuePrefix, valueSuffix: The prefix and
   suffix to wrap the keys or values. Good for formatting them
   for html document(for example, keyPrefix='<b>', keySuffix='</b>'). 
   Note: The keys will be padded with spaces to have them
         equally-wide. The pre- and suffix will be added OUTSIDE
         the entire width.
html: if set to 1, all spaces will be replaced with '&nbsp;', and
      the entire output will be wrapped with '<code>' and '</code>'.
br: determine the carriage return. If html, it is suggested to set
    br to '<br>'. If you want the html source code eazy to read,
    set br to '<br>\n'

version: 04b52
author : Runsun Pan
require: odict() # an ordered dict, if you want the keys sorted.
         Dave Benjamin 
         http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/161403
    '''

    if aDict:

        #------------------------------ sort key
        if sortKey:
            dic = aDict.copy()
            keys = dic.keys()
            keys.sort()
            aDict = odict()
            for k in keys:
                aDict[k] = dic[k]

        #------------------- wrap keys with ' ' (quotes) if str
        tmp = ['{']
        ks = [type(x)==str and "'%s'"%x or x for x in aDict.keys()]

        #------------------- wrap values with ' ' (quotes) if str
        vs = [type(x)==str and "'%s'"%x or x for x in aDict.values()] 

        maxKeyLen = max([len(str(x)) for x in ks])

        for i in range(len(ks)):

            #-------------------------- Adjust key width
            k = {1            : str(ks[i]).ljust(maxKeyLen),
                 keyAlign=='r': str(ks[i]).rjust(maxKeyLen) }[1]

            v = vs[i]        
            tmp.append(' '* indent+ '%s%s%s:%s%s%s,' %(
                        keyPrefix, k, keySuffix,
                        valuePrefix,v,valueSuffix))

        tmp[-1] = tmp[-1][:-1] # remove the ',' in the last item
        tmp.append('}')

        if leftMargin:
          tmp = [ ' '*leftMargin + x for x in tmp ]

        if html:
            return '<code>%s</code>' %br.join(tmp).replace(' ','&nbsp;')
        else:
            return br.join(tmp)     
    else:
        return '{}'

'''
Example:

>>> a={'C': 2, 'B': 1, 'E': 4, (3, 5): 0}

>>> print prnDict(a)
{
 'C'   :2,
 'B'   :1,
 'E'   :4,
 (3, 5):0
}

>>> print prnDict(a, sortKey=1)
{
 'B'   :1,
 'C'   :2,
 'E'   :4,
 (3, 5):0
}

>>> print prnDict(a, keyPrefix="<b>", keySuffix="</b>")
{
 <b>'C'   </b>:2,
 <b>'B'   </b>:1,
 <b>'E'   </b>:4,
 <b>(3, 5)</b>:0
}

>>> print prnDict(a, html=1)
<code>{
&nbsp;'C'&nbsp;&nbsp;&nbsp;:2,
&nbsp;'B'&nbsp;&nbsp;&nbsp;:1,
&nbsp;'E'&nbsp;&nbsp;&nbsp;:4,
&nbsp;(3,&nbsp;5):0
}</code>

>>> b={'car': [6, 6, 12], 'about': [15, 9, 6], 'bookKeeper': [9, 9, 15]}

>>> print prnDict(b, sortKey=1)
{
 'about'     :[15, 9, 6],
 'bookKeeper':[9, 9, 15],
 'car'       :[6, 6, 12]
}

>>> print prnDict(b, keyAlign="r")
{
        'car':[6, 6, 12],
      'about':[15, 9, 6],
 'bookKeeper':[9, 9, 15]
}
'''

回答 17

我只是在回答了sth的答案并做了一个很小但非常有用的修改后才回到这个问题。此函数将打印JSON树中的所有以及该树中叶节点大小

def print_JSON_tree(d, indent=0):
    for key, value in d.iteritems():
        print '    ' * indent + unicode(key),
        if isinstance(value, dict):
            print; print_JSON_tree(value, indent+1)
        else:
            print ":", str(type(d[key])).split("'")[1], "-", str(len(unicode(d[key])))

当您有大型JSON对象并想弄清楚肉在哪里时,这非常好。范例

>>> print_JSON_tree(JSON_object)
key1
    value1 : int - 5
    value2 : str - 16
    key2
       value1 : str - 34
       value2 : list - 5623456

这将告诉您,您关心的大多数数据可能都在内部,JSON_object['key1']['key2']['value2']因为格式化为字符串的值的长度非常大。

I’m just returning to this question after taking sth‘s answer and making a small but very useful modification. This function prints all keys in the JSON tree as well as the size of leaf nodes in that tree.

def print_JSON_tree(d, indent=0):
    for key, value in d.iteritems():
        print '    ' * indent + unicode(key),
        if isinstance(value, dict):
            print; print_JSON_tree(value, indent+1)
        else:
            print ":", str(type(d[key])).split("'")[1], "-", str(len(unicode(d[key])))

It’s really nice when you have large JSON objects and want to figure out where the meat is. Example:

>>> print_JSON_tree(JSON_object)
key1
    value1 : int - 5
    value2 : str - 16
    key2
       value1 : str - 34
       value2 : list - 5623456

This would tell you that most of the data you care about is probably inside JSON_object['key1']['key2']['value2'] because the length of that value formatted as a string is very large.


回答 18

使用此功能:

def pretty_dict(d, n=1):
    for k in d:
        print(" "*n + k)
        try:
            pretty_dict(d[k], n=n+4)
        except TypeError:
            continue

这样称呼它:

pretty_dict(mydict)

Use this function:

def pretty_dict(d, n=1):
    for k in d:
        print(" "*n + k)
        try:
            pretty_dict(d[k], n=n+4)
        except TypeError:
            continue

Call it like this:

pretty_dict(mydict)

回答 19

这是我在处理需要在.txt文件中编写字典的类时想到的:

@staticmethod
def _pretty_write_dict(dictionary):

    def _nested(obj, level=1):
        indentation_values = "\t" * level
        indentation_braces = "\t" * (level - 1)
        if isinstance(obj, dict):
            return "{\n%(body)s%(indent_braces)s}" % {
                "body": "".join("%(indent_values)s\'%(key)s\': %(value)s,\n" % {
                    "key": str(key),
                    "value": _nested(value, level + 1),
                    "indent_values": indentation_values
                } for key, value in obj.items()),
                "indent_braces": indentation_braces
            }
        if isinstance(obj, list):
            return "[\n%(body)s\n%(indent_braces)s]" % {
                "body": "".join("%(indent_values)s%(value)s,\n" % {
                    "value": _nested(value, level + 1),
                    "indent_values": indentation_values
                } for value in obj),
                "indent_braces": indentation_braces
            }
        else:
            return "\'%(value)s\'" % {"value": str(obj)}

    dict_text = _nested(dictionary)
    return dict_text

现在,如果我们有这样的字典:

some_dict = {'default': {'ENGINE': [1, 2, 3, {'some_key': {'some_other_key': 'some_value'}}], 'NAME': 'some_db_name', 'PORT': '', 'HOST': 'localhost', 'USER': 'some_user_name', 'PASSWORD': 'some_password', 'OPTIONS': {'init_command': 'SET foreign_key_checks = 0;'}}}

我们做:

print(_pretty_write_dict(some_dict))

我们得到:

{
    'default': {
        'ENGINE': [
            '1',
            '2',
            '3',
            {
                'some_key': {
                    'some_other_key': 'some_value',
                },
            },
        ],
        'NAME': 'some_db_name',
        'OPTIONS': {
            'init_command': 'SET foreign_key_checks = 0;',
        },
        'HOST': 'localhost',
        'USER': 'some_user_name',
        'PASSWORD': 'some_password',
        'PORT': '',
    },
}

This is what I came up with while working on a class that needed to write a dictionary in a .txt file:

@staticmethod
def _pretty_write_dict(dictionary):

    def _nested(obj, level=1):
        indentation_values = "\t" * level
        indentation_braces = "\t" * (level - 1)
        if isinstance(obj, dict):
            return "{\n%(body)s%(indent_braces)s}" % {
                "body": "".join("%(indent_values)s\'%(key)s\': %(value)s,\n" % {
                    "key": str(key),
                    "value": _nested(value, level + 1),
                    "indent_values": indentation_values
                } for key, value in obj.items()),
                "indent_braces": indentation_braces
            }
        if isinstance(obj, list):
            return "[\n%(body)s\n%(indent_braces)s]" % {
                "body": "".join("%(indent_values)s%(value)s,\n" % {
                    "value": _nested(value, level + 1),
                    "indent_values": indentation_values
                } for value in obj),
                "indent_braces": indentation_braces
            }
        else:
            return "\'%(value)s\'" % {"value": str(obj)}

    dict_text = _nested(dictionary)
    return dict_text

Now, if we have a dictionary like this:

some_dict = {'default': {'ENGINE': [1, 2, 3, {'some_key': {'some_other_key': 'some_value'}}], 'NAME': 'some_db_name', 'PORT': '', 'HOST': 'localhost', 'USER': 'some_user_name', 'PASSWORD': 'some_password', 'OPTIONS': {'init_command': 'SET foreign_key_checks = 0;'}}}

And we do:

print(_pretty_write_dict(some_dict))

We get:

{
    'default': {
        'ENGINE': [
            '1',
            '2',
            '3',
            {
                'some_key': {
                    'some_other_key': 'some_value',
                },
            },
        ],
        'NAME': 'some_db_name',
        'OPTIONS': {
            'init_command': 'SET foreign_key_checks = 0;',
        },
        'HOST': 'localhost',
        'USER': 'some_user_name',
        'PASSWORD': 'some_password',
        'PORT': '',
    },
}

Python字典理解

问题:Python字典理解

是否可以在Python(用于键)中创建字典理解?

在没有列表理解的情况下,您可以使用以下内容:

l = []
for n in range(1, 11):
    l.append(n)

我们可以将其简化为列表理解:l = [n for n in range(1, 11)]

但是,说我想将字典的键设置为相同的值。我可以:

d = {}
for n in range(1, 11):
     d[n] = True # same value for each

我已经试过了:

d = {}
d[i for i in range(1, 11)] = True

不过,我得到一个SyntaxErrorfor

另外(我不需要这部分,只是想知道),您能否将字典的键设置为一堆不同的值,例如:

d = {}
for n in range(1, 11):
    d[n] = n

字典理解有可能吗?

d = {}
d[i for i in range(1, 11)] = [x for x in range(1, 11)]

这也引发了SyntaxErrorfor

Is it possible to create a dictionary comprehension in Python (for the keys)?

Without list comprehensions, you can use something like this:

l = []
for n in range(1, 11):
    l.append(n)

We can shorten this to a list comprehension: l = [n for n in range(1, 11)].

However, say I want to set a dictionary’s keys to the same value. I can do:

d = {}
for n in range(1, 11):
     d[n] = True # same value for each

I’ve tried this:

d = {}
d[i for i in range(1, 11)] = True

However, I get a SyntaxError on the for.

In addition (I don’t need this part, but just wondering), can you set a dictionary’s keys to a bunch of different values, like this:

d = {}
for n in range(1, 11):
    d[n] = n

Is this possible with a dictionary comprehension?

d = {}
d[i for i in range(1, 11)] = [x for x in range(1, 11)]

This also raises a SyntaxError on the for.


回答 0

Python 2.7+中字典理解功能,但是它们不能完全按照您尝试的方式工作。就像列表理解一样,他们创建了一个字典。您不能使用它们将键添加到现有字典中。此外,您还必须指定键和值,尽管您当然可以根据需要指定一个哑数值。

>>> d = {n: n**2 for n in range(5)}
>>> print d
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

如果要将它们全部设置为True:

>>> d = {n: True for n in range(5)}
>>> print d
{0: True, 1: True, 2: True, 3: True, 4: True}

您似乎想要的是一种在现有字典上一次设置多个键的方法。没有直接的捷径。您可以像已经显示的那样循环,也可以使用字典理解来创建具有新值的新字典,然后oldDict.update(newDict)将新值合并到旧字典中。

There are dictionary comprehensions in Python 2.7+, but they don’t work quite the way you’re trying. Like a list comprehension, they create a new dictionary; you can’t use them to add keys to an existing dictionary. Also, you have to specify the keys and values, although of course you can specify a dummy value if you like.

>>> d = {n: n**2 for n in range(5)}
>>> print d
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

If you want to set them all to True:

>>> d = {n: True for n in range(5)}
>>> print d
{0: True, 1: True, 2: True, 3: True, 4: True}

What you seem to be asking for is a way to set multiple keys at once on an existing dictionary. There’s no direct shortcut for that. You can either loop like you already showed, or you could use a dictionary comprehension to create a new dict with the new values, and then do oldDict.update(newDict) to merge the new values into the old dict.


回答 1

您可以使用dict.fromkeys类方法…

>>> dict.fromkeys(range(5), True)
{0: True, 1: True, 2: True, 3: True, 4: True}

这是创建所有键都映射到相同值的字典的最快方法。

但是千万不能用可变对象使用

d = dict.fromkeys(range(5), [])
# {0: [], 1: [], 2: [], 3: [], 4: []}
d[1].append(2)
# {0: [2], 1: [2], 2: [2], 3: [2], 4: [2]} !!!

如果您实际上不需要初始化所有键,那么a defaultdict也可能很有用:

from collections import defaultdict
d = defaultdict(True)

要回答第二部分,您需要的是dict理解:

{k: k for k in range(10)}

您可能不应该这样做,但也可以创建一个子类,如果您重写dict它,该子类的工作原理类似于a :defaultdict__missing__

>>> class KeyDict(dict):
...    def __missing__(self, key):
...       #self[key] = key  # Maybe add this also?
...       return key
... 
>>> d = KeyDict()
>>> d[1]
1
>>> d[2]
2
>>> d[3]
3
>>> print(d)
{}

You can use the dict.fromkeys class method …

>>> dict.fromkeys(range(5), True)
{0: True, 1: True, 2: True, 3: True, 4: True}

This is the fastest way to create a dictionary where all the keys map to the same value.

But do not use this with mutable objects:

d = dict.fromkeys(range(5), [])
# {0: [], 1: [], 2: [], 3: [], 4: []}
d[1].append(2)
# {0: [2], 1: [2], 2: [2], 3: [2], 4: [2]} !!!

If you don’t actually need to initialize all the keys, a defaultdict might be useful as well:

from collections import defaultdict
d = defaultdict(True)

To answer the second part, a dict-comprehension is just what you need:

{k: k for k in range(10)}

You probably shouldn’t do this but you could also create a subclass of dict which works somewhat like a defaultdict if you override __missing__:

>>> class KeyDict(dict):
...    def __missing__(self, key):
...       #self[key] = key  # Maybe add this also?
...       return key
... 
>>> d = KeyDict()
>>> d[1]
1
>>> d[2]
2
>>> d[3]
3
>>> print(d)
{}

回答 2

>>> {i:i for i in range(1, 11)}
{1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10}
>>> {i:i for i in range(1, 11)}
{1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10}

回答 3

我真的很喜欢@mgilson注释,因为如果您有两个可迭代项,一个与键相对应,另一个与值相对应,则还可以执行以下操作。

keys = ['a', 'b', 'c']
values = [1, 2, 3]
d = dict(zip(keys, values))

给予

d = {‘a’:1,’b’:2,’c’:3}

I really like the @mgilson comment, since if you have a two iterables, one that corresponds to the keys and the other the values, you can also do the following.

keys = ['a', 'b', 'c']
values = [1, 2, 3]
d = dict(zip(keys, values))

giving

d = {‘a’: 1, ‘b’: 2, ‘c’: 3}


回答 4

在元组列表上使用dict(),此解决方案将允许您在每个列表中具有任意值,只要它们的长度相同

i_s = range(1, 11)
x_s = range(1, 11)
# x_s = range(11, 1, -1) # Also works
d = dict([(i_s[index], x_s[index], ) for index in range(len(i_s))])

Use dict() on a list of tuples, this solution will allow you to have arbitrary values in each list, so long as they are the same length

i_s = range(1, 11)
x_s = range(1, 11)
# x_s = range(11, 1, -1) # Also works
d = dict([(i_s[index], x_s[index], ) for index in range(len(i_s))])

回答 5

考虑以下示例,该示例使用字典理解来计算列表中单词的出现

my_list = ['hello', 'hi', 'hello', 'today', 'morning', 'again', 'hello']
my_dict = {k:my_list.count(k) for k in my_list}
print(my_dict)

结果是

{'again': 1, 'hi': 1, 'hello': 3, 'today': 1, 'morning': 1}

Consider this example of counting the occurrence of words in a list using dictionary comprehension

my_list = ['hello', 'hi', 'hello', 'today', 'morning', 'again', 'hello']
my_dict = {k:my_list.count(k) for k in my_list}
print(my_dict)

And the result is

{'again': 1, 'hi': 1, 'hello': 3, 'today': 1, 'morning': 1}

回答 6

列表理解的主要目的是在不更改或破坏原始列表的基础上创建基于另一个列表的新列表。

而不是写作

    l = []
    for n in range(1, 11):
        l.append(n)

要么

    l = [n for n in range(1, 11)]

你应该只写

    l = range(1, 11)

在两个最重要的代码块中,您将创建一个新列表,对其进行迭代并仅返回每个元素。这只是创建列表副本的一种昂贵方法。

要获得一个新的字典,并且根据另一个字典将所有键设置为相同的值,请执行以下操作:

    old_dict = {'a': 1, 'c': 3, 'b': 2}
    new_dict = { key:'your value here' for key in old_dict.keys()}

您收到SyntaxError的原因是,在编写时

    d = {}
    d[i for i in range(1, 11)] = True

您基本上是在说:“将我的密钥’i for range(1,11)’中的i设置为True”和“ i for i in range(1,11)中的”)不是有效的密钥,这只是语法错误。如果将dicts支持的列表作为键,您将执行以下操作

    d[[i for i in range(1, 11)]] = True

并不是

    d[i for i in range(1, 11)] = True

但是列表不可散列,因此您不能将它们用作字典键。

The main purpose of a list comprehension is to create a new list based on another one without changing or destroying the original list.

Instead of writing

    l = []
    for n in range(1, 11):
        l.append(n)

or

    l = [n for n in range(1, 11)]

you should write only

    l = range(1, 11)

In the two top code blocks you’re creating a new list, iterating through it and just returning each element. It’s just an expensive way of creating a list copy.

To get a new dictionary with all keys set to the same value based on another dict, do this:

    old_dict = {'a': 1, 'c': 3, 'b': 2}
    new_dict = { key:'your value here' for key in old_dict.keys()}

You’re receiving a SyntaxError because when you write

    d = {}
    d[i for i in range(1, 11)] = True

you’re basically saying: “Set my key ‘i for i in range(1, 11)’ to True” and “i for i in range(1, 11)” is not a valid key, it’s just a syntax error. If dicts supported lists as keys, you would do something like

    d[[i for i in range(1, 11)]] = True

and not

    d[i for i in range(1, 11)] = True

but lists are not hashable, so you can’t use them as dict keys.


回答 7

您不能像这样散列表。试试这个,它使用元组

d[tuple([i for i in range(1,11)])] = True

you can’t hash a list like that. try this instead, it uses tuples

d[tuple([i for i in range(1,11)])] = True