Tag archive: lookup

pandas loc vs. iloc vs. ix vs. at vs. iat?

Question: pandas loc vs. iloc vs. ix vs. at vs. iat?

Recently began branching out from my safe place (R) into Python and am a bit confused by the cell localization/selection in Pandas. I’ve read the documentation but I’m struggling to understand the practical implications of the various localization/selection options.

  • Is there a reason why I should ever use .loc or .iloc over the most general option .ix?
  • I understand that .loc, .iloc, .at, and .iat may provide some guaranteed correctness that .ix can’t offer, but I’ve also read that .ix tends to be the fastest solution across the board.
  • Please explain the real-world, best-practices reasoning behind utilizing anything other than .ix?

Answer 0

loc: works on index labels only
iloc: works on integer positions
ix: works on labels, falling back to integer positions when a label is not found in the index
at: gets scalar values. It’s a very fast loc
iat: gets scalar values. It’s a very fast iloc

http://pyciencia.blogspot.com/2015/05/obtener-y-filtrar-datos-de-un-dataframe.html

Note: As of pandas 0.20.0, the .ix indexer is deprecated in favour of the more strict .iloc and .loc indexers.
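
As a minimal sketch of the four remaining accessors side by side, assuming a small made-up frame (the row labels `x`, `y`, `z` and the columns `a`, `b` are hypothetical and not from the answer), each of the following reads the same cell:

```python
import pandas as pd

# Hypothetical frame, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=["x", "y", "z"])

by_label = df.loc["y", "b"]        # label-based lookup
by_position = df.iloc[1, 1]        # position-based lookup
scalar_label = df.at["y", "b"]     # scalar-only, label-based
scalar_position = df.iat[1, 1]     # scalar-only, position-based

print(by_label, by_position, scalar_label, scalar_position)
```

The difference between the pairs only shows up once you pass lists or slices: loc and iloc accept them, while at and iat take exactly one row and one column.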


Answer 1

Updated for pandas 0.20, given that ix is deprecated. This demonstrates not only how to use loc, iloc, at, iat, and set_value, but also how to accomplish mixed positional/label-based indexing.


loc - label based
Allows you to pass 1-D arrays as indexers. Arrays can be either slices (subsets) of the index or column, or they can be boolean arrays which are equal in length to the index or columns.

Special Note: when a scalar indexer is passed, loc can assign a new index or column value that didn’t exist before.

# label based, but we can use position values
# to get the labels from the index object
df.loc[df.index[2], 'ColName'] = 3

df.loc[df.index[1:3], 'ColName'] = 3
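
To illustrate the special note above, here is a small sketch of loc creating a row and a column that did not exist before. The frame, the label `D`, and the column `Other` are made up for the example:

```python
import pandas as pd

# Hypothetical frame, for illustration only.
df = pd.DataFrame({"ColName": [1, 2, 3]}, index=["A", "B", "C"])

df.loc["D", "ColName"] = 4   # row label "D" did not exist; loc adds it
df.loc["A", "Other"] = 9     # column "Other" did not exist; other rows get NaN

print(df)
```

Because the new cells that were not assigned are filled with NaN, the dtype of an affected column may change as a side effect.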

iloc - position based
Similar to loc, except with positions rather than index values. However, you cannot assign new columns or indices.

# position based, but we can get the position
# from the columns object via the `get_loc` method
df.iloc[2, df.columns.get_loc('ColName')] = 3

df.iloc[2, 4] = 3

df.iloc[:3, 2:4] = 3

at - label based
Works very similarly to loc for scalar indexers. Cannot operate on array indexers. Can assign new indices and columns.

Advantage over loc is that this is faster.
Disadvantage is that you can’t use arrays for indexers.

# label based, but we can use position values
# to get the labels from the index object
df.at[df.index[2], 'ColName'] = 3

df.at['C', 'ColName'] = 3

iat - position based
Works similarly to iloc. Cannot operate on array indexers. Cannot assign new indices and columns.

Advantage over iloc is that this is faster.
Disadvantage is that you can’t use arrays for indexers.

# position based, but we can get the position
# from the columns object via the `get_loc` method
df.iat[2, df.columns.get_loc('ColName')] = 3

set_value - label based
Works very similarly to loc for scalar indexers. Cannot operate on array indexers. Can assign new indices and columns.

Advantage: Super fast, because there is very little overhead!
Disadvantage: There is very little overhead because pandas is not doing a bunch of safety checks. Use at your own risk. Also, this is not intended for public use. Note that set_value was deprecated in pandas 0.21 and removed in pandas 1.0, so on current versions use at or iat instead.

# label based, but we can use position values
# to get the labels from the index object
df.set_value(df.index[2], 'ColName', 3)

set_value with takable=True - position based
Works similarly to iloc. Cannot operate on array indexers. Cannot assign new indices and columns.

Advantage: Super fast, because there is very little overhead!
Disadvantage: There is very little overhead because pandas is not doing a bunch of safety checks. Use at your own risk. Also, this is not intended for public use.

# position based, but we can get the position
# from the columns object via the `get_loc` method
df.set_value(2, df.columns.get_loc('ColName'), 3, takable=True)

Answer 2

There are two primary ways that pandas makes selections from a DataFrame.

  • By Label
  • By Integer Location

The documentation uses the term position for referring to integer location. I do not like this terminology as I feel it is confusing. Integer location is more descriptive and is exactly what .iloc stands for. The key word here is INTEGER – you must use integers when selecting by integer location.

Before showing the summary let’s all make sure that …

.ix is deprecated and ambiguous and should never be used

There are three primary indexers for pandas. We have the indexing operator itself (the brackets []), .loc, and .iloc. Let’s summarize them:

  • [] – Primarily selects subsets of columns, but can select rows as well. Cannot simultaneously select rows and columns.
  • .loc – selects subsets of rows and columns by label only
  • .iloc – selects subsets of rows and columns by integer location only

I almost never use .at or .iat, as they add no additional functionality and offer only a small performance increase. I would discourage their use unless you have a very time-sensitive application. Regardless, here is their summary:

  • .at selects a single scalar value in the DataFrame by label only
  • .iat selects a single scalar value in the DataFrame by integer location only

In addition to selection by label and integer location, there is also boolean selection, also known as boolean indexing.


Examples explaining .loc, .iloc, boolean selection and .at and .iat are shown below

We will first focus on the differences between .loc and .iloc. Before we talk about the differences, it is important to understand that DataFrames have labels that help identify each column and each row. Let’s take a look at a sample DataFrame:

import pandas as pd

df = pd.DataFrame({'age':[30, 2, 12, 4, 32, 33, 69],
                   'color':['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
                   'food':['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
                   'height':[165, 70, 120, 80, 180, 172, 150],
                   'score':[4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
                   'state':['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']
                   },
                  index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

The row and column names are the labels. The labels age, color, food, height, score and state are used for the columns. The other labels, Jane, Nick, Aaron, Penelope, Dean, Christina and Cornelia, are used as labels for the rows. Collectively, these row labels are known as the index.


The primary ways to select particular rows in a DataFrame are with the .loc and .iloc indexers. Each of these indexers can also be used to simultaneously select columns, but it is easier to just focus on rows for now. Also, each of the indexers uses a set of brackets that immediately follows its name to make its selections.

.loc selects data only by labels

We will first talk about the .loc indexer, which only selects data by the index or column labels. In our sample DataFrame, we have provided meaningful names as values for the index. Many DataFrames will not have any meaningful names and will instead default to just the integers from 0 to n-1, where n is the length (number of rows) of the DataFrame.

There are many different inputs you can use for .loc; three of them are:

  • A string
  • A list of strings
  • Slice notation using strings as the start and stop values

Selecting a single row with .loc with a string

To select a single row of data, place the index label inside of the brackets following .loc.

df.loc['Penelope']

This returns the row of data as a Series

age           4
color     white
food      Apple
height       80
score       3.3
state        AL
Name: Penelope, dtype: object

Selecting multiple rows with .loc with a list of strings

df.loc[['Cornelia', 'Jane', 'Dean']]

This returns a DataFrame with the rows in the order specified in the list.

Selecting multiple rows with .loc with slice notation

Slice notation is defined by start, stop and step values. When slicing by label, pandas includes the stop value in the return. The following slices from Aaron to Dean, inclusive. Its step size is not explicitly defined, so it defaults to 1.

df.loc['Aaron':'Dean']

Complex slices can be taken in the same manner as Python lists.
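
A short sketch of what such label slices look like, assuming a cut-down version of the sample frame with only the age column (the variable names are made up for the example):

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [30, 2, 12, 4, 32, 33, 69]},
    index=["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

inclusive = df.loc["Aaron":"Dean"]           # the stop label "Dean" is included
every_other = df.loc["Nick":"Christina":2]   # every second row between the two labels
open_ended = df.loc[:"Aaron"]                # from the first row through "Aaron"

print(list(inclusive.index))
print(list(every_other.index))
print(list(open_ended.index))
```

Label slicing like this relies on the labels being unique (or the index being sorted), which holds for the sample frame.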

.iloc selects data only by integer location

Let’s now turn to .iloc. Every row and column of data in a DataFrame has an integer location that defines it. This is in addition to the label that is visually displayed in the output. The integer location is simply the number of rows/columns from the top/left beginning at 0.

There are many different inputs you can use for .iloc; three of them are:

  • An integer
  • A list of integers
  • Slice notation using integers as the start and stop values

Selecting a single row with .iloc with an integer

df.iloc[4]

This returns the 5th row (integer location 4) as a Series

age           32
color       gray
food      Cheese
height       180
score        1.8
state         AK
Name: Dean, dtype: object

Selecting multiple rows with .iloc with a list of integers

df.iloc[[2, -2]]

This returns a DataFrame of the third and second-to-last rows.

Selecting multiple rows with .iloc with slice notation

df.iloc[:5:3]


Simultaneous selection of rows and columns with .loc and .iloc

One excellent feature of both .loc and .iloc is their ability to select both rows and columns simultaneously. In the examples above, all the columns were returned from each selection. We can choose columns with the same types of inputs as we do for rows. We simply need to separate the row and column selection with a comma.

For example, we can select rows Jane and Dean with just the columns height, score and state like this:

df.loc[['Jane', 'Dean'], 'height':]

This uses a list of labels for the rows and slice notation for the columns

We can naturally do similar operations with .iloc using only integers.

df.iloc[[1,4], 2]
Nick      Lamb
Dean    Cheese
Name: food, dtype: object

Simultaneous selection with labels and integer location

.ix was used to make selections simultaneously with labels and integer location, which was useful but at times confusing and ambiguous; thankfully, it has been deprecated. In the event that you need to make a selection with a mix of labels and integer locations, you will have to convert your selections to be either all labels or all integer locations.

For instance, if we want to select rows Nick and Cornelia along with columns 2 and 4, we could use .loc by converting the integers to labels with the following:

col_names = df.columns[[2, 4]]
df.loc[['Nick', 'Cornelia'], col_names] 

Or alternatively, convert the index labels to integers with the get_loc index method.

labels = ['Nick', 'Cornelia']
index_ints = [df.index.get_loc(label) for label in labels]
df.iloc[index_ints, [2, 4]]
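
As a variation on the list comprehension above, `Index.get_indexer` converts a whole list of labels to integer positions in one call. This is a sketch on a cut-down, made-up frame, not the answer's own code:

```python
import pandas as pd

# Hypothetical three-row frame, for illustration only.
df = pd.DataFrame(
    {"age": [30, 2, 69],
     "color": ["blue", "green", "red"],
     "food": ["Steak", "Lamb", "Beans"]},
    index=["Jane", "Nick", "Cornelia"],
)

labels = ["Nick", "Cornelia"]
row_positions = df.index.get_indexer(labels)   # array of integer positions

via_iloc = df.iloc[row_positions, [0, 2]]
via_loc = df.loc[labels, df.columns[[0, 2]]]

print(via_iloc.equals(via_loc))
```

One thing to watch: `get_indexer` returns -1 for a label that is not in the index instead of raising, so a missing label would silently select the last row with .iloc.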

Boolean Selection

The .loc indexer can also do boolean selection. For instance, if we are interested in finding all the rows where age is above 30 and return just the food and score columns we can do the following:

df.loc[df['age'] > 30, ['food', 'score']] 

You can replicate this with .iloc but you cannot pass it a boolean series. You must convert the boolean Series into a numpy array like this:

df.iloc[(df['age'] > 30).values, [2, 4]] 
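
Conditions can also be combined. The sketch below assumes a cut-down version of the sample frame and uses `&` with parentheses around each comparison; `to_numpy()` is used as an alternative to `.values` for the .iloc variant:

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [30, 2, 32, 33, 69],
     "food": ["Steak", "Lamb", "Cheese", "Melon", "Beans"],
     "state": ["NY", "TX", "AK", "TX", "TX"]},
    index=["Jane", "Nick", "Dean", "Christina", "Cornelia"],
)

# Each comparison needs its own parentheses because & binds tighter than > and ==.
mask = (df["age"] > 30) & (df["state"] == "TX")

via_loc = df.loc[mask, ["food"]]
via_iloc = df.iloc[mask.to_numpy(), [df.columns.get_loc("food")]]

print(via_loc)
```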

Selecting all rows

It is possible to use .loc/.iloc for just column selection. You can select all the rows by using a colon like this:

df.loc[:, 'color':'score':2]


The indexing operator, [], can select rows and columns too, but not simultaneously.

Most people are familiar with the primary purpose of the DataFrame indexing operator, which is to select columns. A string selects a single column as a Series and a list of strings selects multiple columns as a DataFrame.

df['food']

Jane          Steak
Nick           Lamb
Aaron         Mango
Penelope      Apple
Dean         Cheese
Christina     Melon
Cornelia      Beans
Name: food, dtype: object

Using a list selects multiple columns

df[['food', 'score']]

What people are less familiar with is that, when slice notation is used, selection happens by row labels or by integer location. This is very confusing and something that I almost never use, but it does work.

df['Penelope':'Christina'] # slice rows by label

df[2:6:2] # slice rows by integer location

The explicitness of .loc/.iloc for selecting rows is highly preferred. The indexing operator alone is unable to select rows and columns simultaneously.

df[3:5, 'color']
TypeError: unhashable type: 'slice'
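
For comparison, here is a sketch of two ways the failing selection above could be written with the explicit indexers, using a cut-down copy of the sample frame (the variable names are made up for the example):

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [30, 2, 12, 4, 32],
     "color": ["blue", "green", "red", "white", "gray"]},
    index=["Jane", "Nick", "Aaron", "Penelope", "Dean"],
)

# Everything by integer location: translate the column label to its position.
by_position = df.iloc[3:5, df.columns.get_loc("color")]

# Everything by label: translate the row positions to their labels.
by_label = df.loc[df.index[3:5], "color"]

print(by_position)
```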

Selection by .at and .iat

Selection with .at is nearly identical to .loc but it only selects a single ‘cell’ in your DataFrame. We usually refer to this cell as a scalar value. To use .at, pass it both a row and column label separated by a comma.

df.at['Christina', 'color']
'black'

Selection with .iat is nearly identical to .iloc but it only selects a single scalar value. You must pass it an integer for both the row and column locations

df.iat[2, 5]
'FL'

Answer 3

df = pd.DataFrame({'A':['a', 'b', 'c'], 'B':[54, 67, 89]}, index=[100, 200, 300])

df

     A   B
100  a  54
200  b  67
300  c  89

In [19]:    
df.loc[100]

Out[19]:
A     a
B    54
Name: 100, dtype: object

In [20]:    
df.iloc[0]

Out[20]:
A     a
B    54
Name: 100, dtype: object

In [24]:    
df2 = df.set_index([df.index,'A'])
df2

Out[24]:
        B
    A   
100 a   54
200 b   67
300 c   89

In [25]:    
df2.ix[100, 'a']

Out[25]:    
B    54
Name: (100, a), dtype: int64

Answer 4

Let’s start with this small df:

import pandas as pd
import time as tm
import numpy as np
n=10
a=np.arange(0,n**2)
df=pd.DataFrame(a.reshape(n,n))

So we have:

df
Out[25]: 
        0   1   2   3   4   5   6   7   8   9
    0   0   1   2   3   4   5   6   7   8   9
    1  10  11  12  13  14  15  16  17  18  19
    2  20  21  22  23  24  25  26  27  28  29
    3  30  31  32  33  34  35  36  37  38  39
    4  40  41  42  43  44  45  46  47  48  49
    5  50  51  52  53  54  55  56  57  58  59
    6  60  61  62  63  64  65  66  67  68  69
    7  70  71  72  73  74  75  76  77  78  79
    8  80  81  82  83  84  85  86  87  88  89
    9  90  91  92  93  94  95  96  97  98  99

With this we have:

df.iloc[3,3]
Out[33]: 33

df.iat[3,3]
Out[34]: 33

df.iloc[:3,:3]
Out[35]: 
    0   1   2
0   0   1   2
1  10  11  12
2  20  21  22



df.iat[:3,:3]
Traceback (most recent call last):
   ... omissis ...
ValueError: At based indexing on an integer index can only have integer indexers

Thus we cannot use .iat to select a subset; for that we must use .iloc.
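
The same check can be written as a small self-contained script. This is only a sketch: it assumes, as in the traceback above, that the failure surfaces as a ValueError, and the variable names are made up:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(100).reshape(10, 10))

# .iat only accepts a single integer per axis, so slices are rejected.
try:
    df.iat[:3, :3]
    iat_accepts_slices = True
except ValueError:
    iat_accepts_slices = False

# .iloc handles the same slices and returns a 3x3 sub-frame.
subset = df.iloc[:3, :3]

print(iat_accepts_slices, subset.shape)
```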

But let’s try selecting from a larger df with both and check the speed …

# -*- coding: utf-8 -*-
"""
Created on Wed Feb  7 09:58:39 2018

@author: Fabio Pomi
"""

import pandas as pd
import time as tm
import numpy as np

n = 1000
a = np.arange(0, n**2)
df = pd.DataFrame(a.reshape(n, n))

# Time n*n scalar reads with .iloc
t1 = tm.time()
for j in df.index:
    for i in df.columns:
        a = df.iloc[j, i]

# Time the same reads with .iat
t2 = tm.time()
for j in df.index:
    for i in df.columns:
        a = df.iat[j, i]
t3 = tm.time()

iloc_time = t2 - t1
iat_time = t3 - t2
prc = iloc_time / iat_time * 100  # .iloc time as a percentage of .iat time
print('\niloc:%f iat:%f prc:%f' % (iloc_time, iat_time, prc))

iloc:10.485600 iat:7.395423 prc:141.784987

So with .iloc we can manage subsets and with .iat only a single scalar, but .iat is faster than .iloc

:-)


How do I add new keys to a dictionary?

Question: How do I add new keys to a dictionary?

Is it possible to add a key to a Python dictionary after it has been created?

It doesn’t seem to have an .add() method.


Answer 0

d = {'key': 'value'}
print(d)
# {'key': 'value'}
d['mynewkey'] = 'mynewvalue'
print(d)
# {'key': 'value', 'mynewkey': 'mynewvalue'}

Answer 1

To add multiple keys simultaneously, use dict.update():

>>> x = {1:2}
>>> print(x)
{1: 2}

>>> d = {3:4, 5:6, 7:8}
>>> x.update(d)
>>> print(x)
{1: 2, 3: 4, 5: 6, 7: 8}

For adding a single key, the accepted answer has less computational overhead.
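
Besides another dict, `dict.update()` also accepts an iterable of key/value pairs or keyword arguments. A small sketch (the variable `named` is made up for the example):

```python
x = {1: 2}

x.update([(3, 4), (5, 6)])   # iterable of (key, value) pairs
x.update({1: 20})            # an existing key is overwritten in place

named = {}
named.update(a=1, b=2)       # keyword form; keys must be valid identifiers

print(x)
print(named)
```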


Answer 2

I feel like consolidating info about Python dictionaries:

Creating an empty dictionary

data = {}
# OR
data = dict()

Creating a dictionary with initial values

data = {'a': 1, 'b': 2, 'c': 3}
# OR
data = dict(a=1, b=2, c=3)
# OR
data = {k: v for k, v in (('a', 1), ('b',2), ('c',3))}

Inserting/Updating a single value

data['a'] = 1  # Updates if 'a' exists, else adds 'a'
# OR
data.update({'a': 1})
# OR
data.update(dict(a=1))
# OR
data.update(a=1)

Inserting/Updating multiple values

data.update({'c':3,'d':4})  # Updates 'c' and adds 'd'

Creating a merged dictionary without modifying originals

data3 = {}
data3.update(data)  # Modifies data3, not data
data3.update(data2)  # Modifies data3, not data2

Deleting items in dictionary

del data[key]  # Removes specific element in a dictionary
data.pop(key)  # Removes the key & returns the value
data.clear()  # Clears entire dictionary
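
The two removal forms differ in how they treat a missing key. A minimal sketch, with a made-up key `zzz` that is never in the dictionary:

```python
data = {"a": 1, "b": 2}

removed = data.pop("a")          # removes "a" and returns its value
missing = data.pop("zzz", None)  # a default avoids KeyError for absent keys

try:
    del data["zzz"]              # del has no default and raises for absent keys
    del_raised = False
except KeyError:
    del_raised = True

print(removed, missing, del_raised, data)
```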

Check if a key is already in dictionary

key in data

Iterate through pairs in a dictionary

for key in data:  # Iterates just through the keys, ignoring the values
    print(key)
for key, value in data.items():  # Iterates through the pairs
    print(key, value)
for key in data.keys():  # Iterates just through the keys, ignoring the values
    print(key)
for value in data.values():  # Iterates just through the values, ignoring the keys
    print(value)

Create a dictionary from two lists

data = dict(zip(list_with_keys, list_with_values))

New to Python 3.5

Creating a merged dictionary without modifying originals:

This uses a new feature called dictionary unpacking.

data = {**data1, **data2, **data3}
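
When the same key appears in more than one source, the right-most dictionary wins. A short sketch with made-up contents:

```python
data1 = {"a": 1, "b": 2}
data2 = {"b": 20, "c": 30}

merged = {**data1, **data2}   # "b" comes from data2, the later operand

print(merged)
print(data1, data2)           # the originals are left untouched
```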

New to Python 3.9

Update or add values for an existing dictionary

The update operator |= now works for dictionaries:

data |= {'c':3,'d':4}

Creating a merged dictionary without modifying originals

The merge operator | now works for dictionaries:

data = data1 | {'c':3,'d':4}

Feel free to add more!


Answer 3

“Is it possible to add a key to a Python dictionary after it has been created? It doesn’t seem to have an .add() method.”

Yes it is possible, and it does have a method that implements this, but you don’t want to use it directly.

To demonstrate how and how not to use it, let’s create an empty dict with the dict literal, {}:

my_dict = {}

Best Practice 1: Subscript notation

To update this dict with a single new key and value, you can use the subscript notation (see “Mappings” in the Python data model documentation), which provides for item assignment:

my_dict['new key'] = 'new value'

my_dict is now:

{'new key': 'new value'}

Best Practice 2: The update method – 2 ways

We can also efficiently update the dict with multiple values using the update method. We may be unnecessarily creating an extra dict here, so ideally the dict we pass in has already been created and came from, or was used for, another purpose:

my_dict.update({'key 2': 'value 2', 'key 3': 'value 3'})

my_dict is now:

{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value'}

Another efficient way of doing this with the update method is with keyword arguments. Since keyword names have to be legitimate Python identifiers, they can’t contain spaces or special symbols or start with a number. Still, many consider this a more readable way to create keys for a dict, and here we certainly avoid creating an extra unnecessary dict:

my_dict.update(foo='bar', foo2='baz')

and my_dict is now:

{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value', 
 'foo': 'bar', 'foo2': 'baz'}

So now we have covered three Pythonic ways of updating a dict.


Magic method, __setitem__, and why it should be avoided

There’s another way of updating a dict that you shouldn’t use, which uses the __setitem__ method. Here’s an example of how one might use the __setitem__ method to add a key-value pair to a dict, and a demonstration of the poor performance of using it:

>>> d = {}
>>> d.__setitem__('foo', 'bar')
>>> d
{'foo': 'bar'}


>>> def f():
...     d = {}
...     for i in range(100):  # range replaces the Python 2 xrange
...         d['foo'] = i
... 
>>> def g():
...     d = {}
...     for i in range(100):
...         d.__setitem__('foo', i)
... 
>>> import timeit
>>> number = 100
>>> min(timeit.repeat(f, number=number))
0.0020880699157714844
>>> min(timeit.repeat(g, number=number))
0.005071878433227539

So we see that using the subscript notation is actually much faster than using __setitem__. Doing the Pythonic thing, that is, using the language in the way it was intended to be used, usually is both more readable and computationally efficient.
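
To rerun the comparison on a current interpreter, a Python 3 sketch along the same lines could look like this. The function names are made up, and the absolute timings will differ from the figures quoted above:

```python
import timeit


def with_subscript():
    d = {}
    for i in range(100):
        d["foo"] = i             # subscript notation
    return d


def with_setitem():
    d = {}
    for i in range(100):
        d.__setitem__("foo", i)  # explicit method lookup and call
    return d


subscript_time = min(timeit.repeat(with_subscript, number=100, repeat=3))
setitem_time = min(timeit.repeat(with_setitem, number=100, repeat=3))
print(f"subscript: {subscript_time:.6f}s  __setitem__: {setitem_time:.6f}s")
```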


Answer 4

dictionary[key] = value

Answer 5

If you want to add a dictionary within a dictionary you can do it this way.

Example: Add a new entry to your dictionary & sub dictionary

dictionary = {}
dictionary["new key"] = "some new entry" # add new dictionary entry
dictionary["dictionary_within_a_dictionary"] = {} # this is required by python
dictionary["dictionary_within_a_dictionary"]["sub_dict"] = {"other" : "dictionary"}
print(dictionary)

Output:

{'new key': 'some new entry', 'dictionary_within_a_dictionary': {'sub_dict': {'other': 'dictionary'}}}

NOTE: Python requires that you first create the sub-dictionary with

dictionary["dictionary_within_a_dictionary"] = {}

before adding entries.
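
To see why the empty sub-dictionary has to exist first, here is a small sketch (the key names are illustrative, not from the answer). Assigning into a missing sub-dictionary raises KeyError, and dict.setdefault is one way to create the sub-dictionary on demand.

```python
dictionary = {}

try:
    # The outer key does not exist yet, so the lookup fails before the assignment.
    dictionary["outer"]["inner"] = 1
    failed = False
except KeyError:
    failed = True

# setdefault returns the existing sub-dictionary, or inserts and returns a new one.
dictionary.setdefault("outer", {})["inner"] = 1
print(failed, dictionary)
```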


Answer 6

The orthodox syntax is d[key] = value, but if your keyboard is missing the square bracket keys you could do:

d.__setitem__(key, value)

In fact, defining __getitem__ and __setitem__ methods is how you can make your own class support the square bracket syntax. See https://python.developpez.com/cours/DiveIntoPython/php/endiveintopython/object_oriented_framework/special_class_methods.php
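
As a rough illustration of that last point, the sketch below defines a hypothetical class (LoggedStore is an invented name) whose instances support the square bracket syntax because it implements __getitem__ and __setitem__.

```python
class LoggedStore:
    """Hypothetical container that supports obj[key] and obj[key] = value."""

    def __init__(self):
        self._data = {}
        self.writes = 0

    def __setitem__(self, key, value):
        # Called for: store[key] = value
        self.writes += 1
        self._data[key] = value

    def __getitem__(self, key):
        # Called for: store[key]
        return self._data[key]

store = LoggedStore()
store['foo'] = 'bar'
print(store['foo'], store.writes)
```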


Answer 7

You can create one:

class myDict(dict):

    def __init__(self):
        super().__init__()  # initialise the underlying dict; rebinding self has no effect

    def add(self, key, value):
        self[key] = value

## example

myd = myDict()
myd.add('apples',6)
myd.add('bananas',3)
print(myd)

Gives:

>>> 
{'apples': 6, 'bananas': 3}

Answer 8

This popular question addresses functional methods of merging dictionaries a and b.

Here are some of the more straightforward methods (tested in Python 3)…

c = dict( a, **b ) ## see also https://stackoverflow.com/q/2255878
c = dict( list(a.items()) + list(b.items()) )
c = dict( i for d in [a,b] for i in d.items() )

Note: The first method above only works if the keys in b are strings.

To add or modify a single element, the b dictionary would contain only that one element…

c = dict( a, **{'d':'dog'} ) ## returns a dictionary based on 'a'

This is equivalent to…

def functional_dict_add( dictionary, key, value ):
   temp = dictionary.copy()
   temp[key] = value
   return temp

c = functional_dict_add( a, 'd', 'dog' )
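
The string-key restriction noted above can be checked with a short sketch. This assumes CPython 3, where unpacking a dict with a non-string key into keyword arguments raises TypeError; the variable names are illustrative.

```python
a = {'x': 1}
b = {2: 'two'}  # non-string key

try:
    c = dict(a, **b)  # ** unpacking requires string keys
    raised = False
except TypeError:
    raised = True

# The items()-based form does not have this restriction.
c = dict(list(a.items()) + list(b.items()))
print(raised, c)
```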

Answer 9

Let’s pretend you want to live in the immutable world and do NOT want to modify the original but want to create a new dict that is the result of adding a new key to the original.

In Python 3.5+ you can do:

params = {'a': 1, 'b': 2}
new_params = {**params, **{'c': 3}}

The Python 2 equivalent is:

params = {'a': 1, 'b': 2}
new_params = dict(params, **{'c': 3})

After either of these:

params is still equal to {'a': 1, 'b': 2}

and

new_params is equal to {'a': 1, 'b': 2, 'c': 3}
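
As a small variation on the Python 3.5+ form, the extra key can also be listed directly in the dict display instead of being wrapped in a second dict. The sketch below is an illustration with made-up variable names; it also shows that a later entry overrides a key from the original without touching the original.

```python
params = {'a': 1, 'b': 2}

# Unpack the original, then list the extra key directly.
new_params = {**params, 'c': 3}

# A later entry wins if the key already exists in the original.
overridden = {**params, 'a': 100}
print(params, new_params, overridden)
```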

There will be times when you don’t want to modify the original (you only want the result of adding to the original). I find this a refreshing alternative to the following:

params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params['c'] = 3

or

params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params.update({'c': 3})

Reference: https://stackoverflow.com/a/2255892/514866


Answer 10

So many answers and still everybody forgot about the strangely named, oddly behaved, and yet still handy dict.setdefault()

This

value = my_dict.setdefault(key, default)

basically just does this:

try:
    value = my_dict[key]
except KeyError: # key not found
    value = my_dict[key] = default

e.g.

>>> mydict = {'a':1, 'b':2, 'c':3}
>>> mydict.setdefault('d', 4)
4 # returns new value at mydict['d']
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # a new key/value pair was indeed added
# but see what happens when trying it on an existing key...
>>> mydict.setdefault('a', 111)
1 # old value was returned
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # existing key was ignored
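
One common use of setdefault, sketched here with made-up data, is grouping values under keys that may not exist yet: the first call for a key inserts an empty list, and later calls return that same list.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
by_letter = {}

for word in words:
    # Insert an empty list the first time a letter is seen, then append to it.
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)
```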

Answer 11

If you’re not joining two dictionaries, but adding new key-value pairs to a dictionary, then using the subscript notation seems like the best way.

import timeit

timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary.update({"aaa": 123123, "asd": 233})')
>> 0.49582505226135254

timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary["aaa"] = 123123; dictionary["asd"] = 233;')
>> 0.20782899856567383

However, if you’d like to add, for example, thousands of new key-value pairs, you should consider using the update() method.


Answer 12

I think it is also worth pointing out Python’s collections module, which provides several useful dictionary subclasses and wrappers that simplify adding and modifying data in a dictionary, specifically defaultdict:

dict subclass that calls a factory function to supply missing values

This is particularly useful if you are working with dictionaries that always consist of the same data types or structures, for example a dictionary of lists.

>>> from collections import defaultdict
>>> example = defaultdict(int)
>>> example['key'] += 1
>>> example
defaultdict(<class 'int'>, {'key': 1})

If the key does not yet exist, defaultdict calls the factory it was given (in our case int, which returns 0) and stores the result as the initial value for that key (this is often used inside loops). This operation therefore does two things: it adds a new key to a dictionary (as per the question), and it assigns the value if the key doesn’t yet exist. With a standard dictionary, this would have raised an error, as the += operation tries to access a value that doesn’t yet exist:

>>> example = dict()
>>> example['key'] += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'key'

Without the use of defaultdict, the amount of code to add a new element would be much greater and perhaps looks something like:

# This type of code would often be inside a loop
if 'key' not in example:
    example['key'] = 0  # add key and initial value to dict; could also be a list
example['key'] += 1  # this is implementing a counter

defaultdict can also be used with complex data types such as list and set:

>>> example = defaultdict(list)
>>> example['key'].append(1)
>>> example
defaultdict(<class 'list'>, {'key': [1]})

Adding an element automatically initialises the list.
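
The factory decides the starting value for a missing key. A brief sketch with illustrative names: int gives 0, and a lambda can be used as the factory when a different starting value, such as 10, is wanted.

```python
from collections import defaultdict

# int() returns 0, so missing keys start at 0.
counts = defaultdict(int)
counts['key'] += 1

# A lambda can supply any other starting value, here 10.
from_ten = defaultdict(lambda: 10)
from_ten['key'] += 1

print(counts['key'], from_ten['key'])
```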


Answer 13

Here’s another way that I didn’t see here:

>>> foo = dict(a=1,b=2)
>>> foo
{'a': 1, 'b': 2}
>>> goo = dict(c=3,**foo)
>>> goo
{'c': 3, 'a': 1, 'b': 2}

You can use the dictionary constructor and keyword-argument unpacking (**) to reconstruct a dictionary. Moreover, interestingly, this method can be used to control the positional order during dictionary construction. In CPython 3.6 insertion order is preserved as an implementation detail, and from Python 3.7 onwards it is guaranteed by the language!

>>> foo = dict(a=1,b=2,c=3,d=4)
>>> new_dict = {k: v for k, v in list(foo.items())[:2]}
>>> new_dict
{'a': 1, 'b': 2}
>>> new_dict.update(newvalue=99)
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99}
>>> new_dict.update({k: v for k, v in list(foo.items())[2:]})
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99, 'c': 3, 'd': 4}
>>> 

The above is using dictionary comprehension.


Answer 14

First, check whether the key already exists:

a = {1: 2, 3: 4}
a.get(1)  # returns 2
a.get(5)  # returns None, because 5 is not a key

Then you can add the new key and value.
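
One caveat with get() is that it cannot distinguish a missing key from a key whose value is None. A minimal sketch of the check-then-add step using the in operator instead (the example data is made up):

```python
a = {1: 2, 3: None}

# get() cannot tell a missing key from a key whose value is None.
ambiguous = a.get(3) is None and a.get(5) is None

# "in" checks for the key itself.
if 5 not in a:
    a[5] = 6

print(ambiguous, a)
```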


Answer 15

A class that adds dictionary keys and values.

class myDict(dict):

    def __init__(self):
        super().__init__()  # initialise the underlying dict; rebinding self has no effect

    def add(self, key, value):
        # self[key] = value  # would add the key, overwriting any existing value
        if key in self:
            print('key', key, 'already used')  # report if the key is already used
        self.setdefault(key, value)  # if the key exists, do nothing


## example

myd = myDict()
name = "fred"

myd.add('apples',6)
print('\n', myd)
myd.add('bananas',3)
print('\n', myd)
myd.add('jack', 7)
print('\n', myd)
myd.add(name, myd)
print('\n', myd)
myd.add('apples', 23)
print('\n', myd)
myd.add(name, 2)
print(myd)