Tag archive: insert

Insert at first position of a list in Python [closed]

Question: Insert at first position of a list in Python [closed]

How can I insert an element at the first index of a list? If I use list.insert(0, elem), does elem overwrite the content at the first index? Or do I have to create a new list with the new element first and then copy the old list into it?


Answer 0

Use insert:

In [1]: ls = [1,2,3]

In [2]: ls.insert(0, "new")

In [3]: ls
Out[3]: ['new', 1, 2, 3]

Answer 1

From the documentation:

list.insert(i, x)
Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a),x) is equivalent to a.append(x)

http://docs.python.org/2/tutorial/datastructures.html#more-on-lists
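As a small illustration of the documented behaviour, the sketch below shows that insert() shifts the existing elements instead of overwriting them, works in place, and returns None. The variable names are only illustrative.

```python
a = [1, 2, 3]
result = a.insert(0, "new")   # inserts before index 0; nothing is overwritten

b = [1, 2, 3]
b.insert(len(b), 4)           # same effect as b.append(4)

c = [1, 2, 3]
c.insert(100, 4)              # an index past the end simply appends

print(result)  # None: the list is modified in place
print(a)       # ['new', 1, 2, 3]
print(b)       # [1, 2, 3, 4]
print(c)       # [1, 2, 3, 4]
```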


Insert a row to pandas dataframe

Question: Insert a row to pandas dataframe

I have a dataframe:

s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])

df = pd.DataFrame([list(s1), list(s2)],  columns =  ["A", "B", "C"])

   A  B  C
0  5  6  7
1  7  8  9

[2 rows x 3 columns]

and I need to add a first row [2, 3, 4] to get:

   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

I’ve tried the append() and concat() functions but can’t find the right way to do it.

How can I add/insert a series into a dataframe?


Answer 0

Just assign the row to a particular index using loc:

 df.loc[-1] = [2, 3, 4]  # adding a row
 df.index = df.index + 1  # shifting index
 df = df.sort_index()  # sorting by index

And you get, as desired:

    A  B  C
 0  2  3  4
 1  5  6  7
 2  7  8  9

See “Indexing: Setting with enlargement” in the pandas documentation.


Answer 1

Not sure how you were calling concat() but it should work as long as both objects are of the same type. Maybe the issue is that you need to cast your second vector to a dataframe? Using the df that you defined the following works for me:

df2 = pd.DataFrame([[2,3,4]], columns=['A','B','C'])
pd.concat([df2, df])
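One detail worth checking with this approach is the resulting index: a plain concat keeps each frame's own labels. The sketch below, which rebuilds the question's df under the assumption that integer data is enough to illustrate the point, compares the default with ignore_index=True.

```python
import pandas as pd

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])
df2 = pd.DataFrame([[2, 3, 4]], columns=["A", "B", "C"])

stacked = pd.concat([df2, df])                        # keeps labels 0, 0, 1
renumbered = pd.concat([df2, df], ignore_index=True)  # relabels rows 0, 1, 2

print(list(stacked.index))
print(list(renumbered.index))
```

If the duplicated label 0 is unwanted, ignore_index=True (or a later reset_index(drop=True)) gives the layout shown in the question.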

Answer 2

One way to achieve this is

>>> pd.DataFrame(np.array([[2, 3, 4]]), columns=['A', 'B', 'C']).append(df, ignore_index=True)
Out[330]: 
   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

Generally, it’s easiest to append dataframes, not series. In your case, since you want the new row to be “on top” (with the starting id), and there is no function pd.prepend(), I first create the new dataframe and then append your old one.

ignore_index discards the existing index of both dataframes and renumbers the result from 0, so the old rows continue at index 1 instead of restarting at index 0.

Typical disclaimer: Ceterum censeo … appending rows is a quite inefficient operation. If you care about performance and can arrange to first create a dataframe with the correct (longer) index and then just insert the additional row into it, you should definitely do that. See:

>>> index = np.array([0, 1, 2])
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[0:1] = [list(s1), list(s2)]
>>> df2
Out[336]: 
     A    B    C
0    5    6    7
1    7    8    9
2  NaN  NaN  NaN
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[1:] = [list(s1), list(s2)]

So far, we have what you had as df:

>>> df2
Out[339]: 
     A    B    C
0  NaN  NaN  NaN
1    5    6    7
2    7    8    9

But now you can easily insert the row as follows. Since the space was preallocated, this is more efficient.

>>> df2.loc[0] = np.array([2, 3, 4])
>>> df2
Out[341]: 
   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

Answer 3

I put together a short function that allows for a little more flexibility when inserting a row:

def insert_row(idx, df, df_insert):
    dfA = df.iloc[:idx, ]
    dfB = df.iloc[idx:, ]

    df = dfA.append(df_insert).append(dfB).reset_index(drop = True)

    return df

which could be further shortened to:

def insert_row(idx, df, df_insert):
    return df.iloc[:idx, ].append(df_insert).append(df.iloc[idx:, ]).reset_index(drop = True)

Then you could use something like:

df = insert_row(2, df, df_new)

where 2 is the index position in df where you want to insert df_new.
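DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the function above fails on current versions. A minimal sketch of the same idea with pd.concat follows; it reuses the name insert_row from this answer, but the body is a substitution, not the author's code, and the sample frames are assumptions.

```python
import pandas as pd

def insert_row(idx, df, df_insert):
    """Return a new DataFrame with df_insert placed before position idx."""
    parts = [df.iloc[:idx], df_insert, df.iloc[idx:]]
    # ignore_index=True renumbers the rows 0..n-1, like reset_index(drop=True)
    return pd.concat(parts, ignore_index=True)

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])
df_new = pd.DataFrame([[2, 3, 4]], columns=df.columns)

middle = insert_row(1, df, df_new)   # new row lands between the old rows
top = insert_row(0, df, df_new)      # new row becomes the first row
print(middle)
print(top)
```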


Answer 4

We can use numpy.insert. This has the advantage of flexibility. You only need to specify the index you want to insert to.

s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])

df = pd.DataFrame([list(s1), list(s2)],  columns =  ["A", "B", "C"])

pd.DataFrame(np.insert(df.values, 0, values=[2, 3, 4], axis=0))

    0   1   2
0   2   3   4
1   5   6   7
2   7   8   9

In np.insert(df.values, 0, values=[2, 3, 4], axis=0), the 0 tells the function the position (index) at which to insert the new values.
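The output above shows that the column names A, B, C are lost, because np.insert returns a bare array. A possible variation, assuming all columns share one numeric dtype as in the question, passes the original column labels back in:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])

# np.insert returns a new array; axis=0 inserts a whole row at position 0
values = np.insert(df.values, 0, values=[2, 3, 4], axis=0)

# Rebuild the frame with the original column labels
result = pd.DataFrame(values, columns=df.columns)
print(result)
```

With mixed column types, df.values would be an object array, so the rebuilt frame may need its dtypes restored afterwards.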


Answer 5

This might seem overly simple, but it’s incredible that a simple insert-new-row function isn’t built in. I’ve read a lot about appending a new df to the original, but I’m wondering if this would be faster.

df.loc[0] = [row1data, blah...]
i = len(df) + 1
df.loc[i] = [row2data, blah...]

Answer 6

The following would be the best way to insert a row into a pandas dataframe without sorting and resetting the index:

import pandas as pd

df = pd.DataFrame(columns=['a','b','c'])

def insert(df, row):
    insert_loc = df.index.max()

    if pd.isna(insert_loc):
        df.loc[0] = row
    else:
        df.loc[insert_loc + 1] = row

insert(df,[2,3,4])
insert(df,[8,9,0])
print(df)

Answer 7

concat() seems to be a bit faster than last-row insertion and reindexing. In case someone wonders about the speed of the two top approaches:

In [x]: %%timeit
     ...: df = pd.DataFrame(columns=['a','b'])
     ...: for i in range(10000):
     ...:     df.loc[-1] = [1,2]
     ...:     df.index = df.index + 1
     ...:     df = df.sort_index()

17.1 s ± 705 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [y]: %%timeit
     ...: df = pd.DataFrame(columns=['a', 'b'])
     ...: for i in range(10000):
     ...:     df = pd.concat([pd.DataFrame([[1,2]], columns=df.columns), df])

6.53 s ± 127 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Answer 8

It is pretty simple to add a row to a pandas DataFrame:

  1. Create a regular Python dictionary with the same column names as your DataFrame;

  2. Call the .append() method, which is a method on DataFrame instances (DataFrame.append(), not pandas.append()), and pass in your dictionary;

  3. Add ignore_index=True right after your dictionary name.
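On pandas 2.0 and later, where DataFrame.append() is no longer available, the same three steps can be approximated with pd.concat. This is a sketch under that assumption; the sample data and the name row are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[5, 6, 7], [7, 8, 9]], columns=["A", "B", "C"])

# Step 1: a plain dictionary keyed by the DataFrame's column names
row = {"A": 2, "B": 3, "C": 4}

# Steps 2 and 3: wrap the dict in a one-row DataFrame and concatenate,
# renumbering the index with ignore_index=True
df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
print(df)
```

The new row ends up at the bottom; swapping the order of the two frames in the list would put it on top.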


Answer 9

You can simply append the row to the end of the DataFrame, and then adjust the index.

For instance:

df = df.append(pd.DataFrame([[2,3,4]],columns=df.columns),ignore_index=True)
df.index = (df.index + 1) % len(df)
df = df.sort_index()

Or use concat as:

df = pd.concat([pd.DataFrame([[2,3,4]],columns=df.columns),df],ignore_index=True)

Answer 10

The simplest way to add a row to a pandas data frame is:

DataFrame.loc[ location of insertion ] = list( )

Example:

DF.loc[ 9 ] = [ 'Pepe', 33, 'Japan' ]

NB: the length of your list should match the number of columns in the data frame.


Python pandas insert list into a cell

Question: Python pandas insert list into a cell

I have a list ‘abc’ and a dataframe ‘df’:

abc = ['foo', 'bar']
df =
    A  B
0  12  NaN
1  23  NaN

I want to insert the list into cell 1B, so I want this result:

    A  B
0  12  NaN
1  23  ['foo', 'bar']

How can I do that?

1) If I use this:

df.ix[1,'B'] = abc

I get the following error message:

ValueError: Must have equal len keys and value when setting with an iterable

because it tries to insert the list (that has two elements) into a row / column but not into a cell.

2) If I use this:

df.ix[1,'B'] = [abc]

then it inserts a list that has only one element that is the ‘abc’ list ( [['foo', 'bar']] ).

3) If I use this:

df.ix[1,'B'] = ', '.join(abc)

then it inserts a string: ( foo, bar ) but not a list.

4) If I use this:

df.ix[1,'B'] = [', '.join(abc)]

then it inserts a list but it has only one element ( ['foo, bar'] ) but not two as I want ( ['foo', 'bar'] ).

Thanks for help!


EDIT

My new dataframe and the old list:

abc = ['foo', 'bar']
df2 =
    A    B         C
0  12  NaN      'bla'
1  23  NaN  'bla bla'

Another dataframe:

df3 =
    A    B         C                    D
0  12  NaN      'bla'  ['item1', 'item2']
1  23  NaN  'bla bla'        [11, 12, 13]

I want to insert the ‘abc’ list into df2.loc[1,'B'] and/or df3.loc[1,'B'].

If the dataframe has only columns with integer values and/or NaN values and/or list values, then inserting a list into a cell works perfectly. The same holds if the dataframe has only columns with string values and/or NaN values and/or list values. But if the dataframe has columns with both integer and string values alongside other columns, then an error message appears when I use df2.loc[1,'B'] = abc or df3.loc[1,'B'] = abc.

Another dataframe:

df4 =
          A     B
0      'bla'  NaN
1  'bla bla'  NaN

These inserts work perfectly: df.loc[1,'B'] = abc or df4.loc[1,'B'] = abc.


Answer 0

Since set_value has been deprecated since version 0.21.0, you should now use at. It can insert a list into a cell without raising a ValueError as loc does. I think this is because at always refers to a single value, while loc can refer to values as well as rows and columns.

df = pd.DataFrame(data={'A': [1, 2, 3], 'B': ['x', 'y', 'z']})

df.at[1, 'B'] = ['m', 'n']

df =
    A   B
0   1   x
1   2   [m, n]
2   3   z

You also need to make sure the column you are inserting into has dtype=object. For example

>>> df = pd.DataFrame(data={'A': [1, 2, 3], 'B': [1,2,3]})
>>> df.dtypes
A    int64
B    int64
dtype: object

>>> df.at[1, 'B'] = [1, 2, 3]
ValueError: setting an array element with a sequence

>>> df['B'] = df['B'].astype('object')
>>> df.at[1, 'B'] = [1, 2, 3]
>>> df
   A          B
0  1          1
1  2  [1, 2, 3]
2  3          3

Answer 1

df3.set_value(1, 'B', abc) works for any dataframe. Take care with the data type of column ‘B’: for example, a list cannot be inserted into a float column; in that case df['B'] = df['B'].astype(object) can help.


Answer 2

Pandas >= 0.21

set_value has been deprecated. You can now use DataFrame.at to set by label, and DataFrame.iat to set by integer position.

Setting Cell Values with at/iat

# Setup
df = pd.DataFrame({'A': [12, 23], 'B': [['a', 'b'], ['c', 'd']]})
df

    A       B
0  12  [a, b]
1  23  [c, d]

df.dtypes

A     int64
B    object
dtype: object

If you want to set the value in the second row of column “B” to some new list, use DataFrame.at:

df.at[1, 'B'] = ['m', 'n']
df

    A       B
0  12  [a, b]
1  23  [m, n]

You can also set by integer position using DataFrame.iat:

df.iat[1, df.columns.get_loc('B')] = ['m', 'n']
df

    A       B
0  12  [a, b]
1  23  [m, n]

What if I get ValueError: setting an array element with a sequence?

I’ll try to reproduce this with:

df

    A   B
0  12 NaN
1  23 NaN

df.dtypes

A      int64
B    float64
dtype: object

df.at[1, 'B'] = ['m', 'n']
# ValueError: setting an array element with a sequence.

This is because your column is of float64 dtype, whereas lists are objects, so there’s a mismatch. What you have to do in this situation is convert the column to object first.

df['B'] = df['B'].astype(object)
df.dtypes

A     int64
B    object
dtype: object

Then, it works:

df.at[1, 'B'] = ['m', 'n']
df

    A       B
0  12     NaN
1  23  [m, n]

Possible, But Hacky

Even more wacky, I’ve found you can hack through DataFrame.loc to achieve something similar if you pass nested lists.

df.loc[1, 'B'] = [['m'], ['n'], ['o'], ['p']]
df

    A             B
0  12        [a, b]
1  23  [m, n, o, p]

You can read more about why this works here.


Answer 3

As mentioned in the post “pandas: how to store a list in a dataframe?”, the dtypes in the dataframe may influence the results, as may whether the dataframe is being called or assigned to.


Answer 4

Quick workaround

Simply enclose the list within a new list, as done for col2 in the data frame below. This works because pandas takes the outer list (of lists) and converts it into a column as if it contained normal scalar items; in our case those items are lists rather than scalars.

mydict={'col1':[1,2,3],'col2':[[1, 4], [2, 5], [3, 6]]}
data=pd.DataFrame(mydict)
data


   col1     col2
0   1       [1, 4]
1   2       [2, 5]
2   3       [3, 6]

Answer 5

I was also getting

ValueError: Must have equal len keys and value when setting with an iterable

Using .at rather than .loc did not make any difference in my case, but enforcing the datatype of the dataframe column did the trick:

df['B'] = df['B'].astype(object)

Then I could set lists, numpy array and all sorts of things as single cell values in my dataframes.
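The recurring advice in these answers (cast the column to object, then assign with .at) can be folded into a small helper. The sketch below is one way to do that; set_cell is a hypothetical name, not a pandas function, and the sample frame mirrors the one in the question.

```python
import numpy as np
import pandas as pd

def set_cell(df, row_label, column, value):
    """Store value (for example a list) in one cell, modifying df in place."""
    # A float or int column cannot hold a list, so widen it to object first
    if df[column].dtype != object:
        df[column] = df[column].astype(object)
    # .at always addresses exactly one cell, so the list is not unpacked
    df.at[row_label, column] = value

abc = ["foo", "bar"]
df = pd.DataFrame({"A": [12, 23], "B": [np.nan, np.nan]})
set_cell(df, 1, "B", abc)
print(df)
```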


Insert an element at a specific index in a list and return the updated list

Question: Insert an element at a specific index in a list and return the updated list

I have this:

>>> a = [1, 2, 4]
>>> print a
[1, 2, 4]

>>> print a.insert(2, 3)
None

>>> print a
[1, 2, 3, 4]

>>> b = a.insert(3, 6)
>>> print b
None

>>> print a
[1, 2, 3, 6, 4]

Is there a way I can get the updated list as the result, instead of updating the original list in place?


Answer 0

l.insert(index, obj) doesn’t actually return anything. It just updates the list.

As ATO said, you can do b = a[:index] + [obj] + a[index:]. However, another way is:

a = [1, 2, 4]
b = a[:]
b.insert(2, 3)
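Either form can be wrapped in a function that returns the new list, which is what the question asks for. A minimal sketch, where inserted is a hypothetical helper name:

```python
def inserted(items, index, obj):
    """Return a new list with obj placed before index; items is left untouched."""
    return items[:index] + [obj] + items[index:]

a = [1, 2, 4]
b = inserted(a, 2, 3)
print(a)  # [1, 2, 4]
print(b)  # [1, 2, 3, 4]
```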

Answer 1

Most performance efficient approach

You may also insert the element using the slice indexing in the list. For example:

>>> a = [1, 2, 4]
>>> insert_at = 2  # Index at which you want to insert item

>>> b = a[:]   # Created copy of list "a" as "b".
               # Skip this step if you are ok with modifying the original list

>>> b[insert_at:insert_at] = [3]  # Insert "3" within "b"
>>> b
[1, 2, 3, 4]

For inserting multiple elements together at a given index, all you need to do is to use a list of multiple elements that you want to insert. For example:

>>> a = [1, 2, 4]
>>> insert_at = 2   # Index starting from which multiple elements will be inserted

# List of elements that you want to insert together at "insert_at" (above) position
>>> insert_elements = [3, 5, 6]

>>> a[insert_at:insert_at] = insert_elements
>>> a   # [3, 5, 6] are inserted together in `a` starting at index "2"
[1, 2, 3, 5, 6, 4]

Alternative using list comprehension (but very slow in terms of performance):

As an alternative, it can be achieved using list comprehension with enumerate too. (But please don’t do it this way. It is just for illustration):

>>> a = [1, 2, 4]
>>> insert_at = 2

>>> b = [y for i, x in enumerate(a) for y in ((3, x) if i == insert_at else (x, ))]
>>> b
[1, 2, 3, 4]

Performance comparison of all solutions

Here’s the timeit comparison of all the answers with a list of 1000 elements on Python 3.4.5:

  • My answer using sliced insertion – Fastest (3.08 µsec per loop)

     mquadri$ python3 -m timeit -s "a = list(range(1000))" "b = a[:]; b[500:500] = [3]"
     100000 loops, best of 3: 3.08 µsec per loop

  • ATOzTOA’s accepted answer based on merging sliced lists – Second (6.71 µsec per loop)

     mquadri$ python3 -m timeit -s "a = list(range(1000))" "b = a[:500] + [3] + a[500:]"
     100000 loops, best of 3: 6.71 µsec per loop

  • Rushy Panchal’s answer with the most votes, using list.insert(...) – Third (26.5 µsec per loop)

     python3 -m timeit -s "a = list(range(1000))" "b = a[:]; b.insert(500, 3)"
     10000 loops, best of 3: 26.5 µsec per loop

  • My answer with list comprehension and enumerate – Fourth (very slow with 168 µsec per loop)

     mquadri$ python3 -m timeit -s "a = list(range(1000))" "[y for i, x in enumerate(a) for y in ((3, x) if i == 500 else (x, )) ]"
     10000 loops, best of 3: 168 µsec per loop
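Timings like these depend on the interpreter version and the machine, so it may be worth re-running them locally. The script below is a rough sketch using the standard timeit module; the function names are made up for this example and no particular ranking is implied.

```python
import timeit

a = list(range(1000))

def slice_assign():
    b = a[:]
    b[500:500] = [3]
    return b

def slice_concat():
    return a[:500] + [3] + a[500:]

def copy_insert():
    b = a[:]
    b.insert(500, 3)
    return b

def comprehension():
    return [y for i, x in enumerate(a) for y in ((3, x) if i == 500 else (x,))]

candidates = [slice_assign, slice_concat, copy_insert, comprehension]
runs = 1000

# Total seconds for `runs` calls of each candidate
timings = {f.__name__: timeit.timeit(f, number=runs) for f in candidates}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds / runs * 1e6:.2f} µs per call")
```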
    

Answer 2

The shortest I got: b = a[:2] + [3] + a[2:]

>>>
>>> a = [1, 2, 4]
>>> print a
[1, 2, 4]
>>> b = a[:2] + [3] + a[2:]
>>> print a
[1, 2, 4]
>>> print b
[1, 2, 3, 4]

Answer 3

The cleanest approach is to copy the list and then insert the object into the copy. On Python 3 this can be done via list.copy:

new = old.copy()
new.insert(index, value)

On Python 2 copying the list can be achieved via new = old[:] (this also works on Python 3).

In terms of performance there is no difference from the other proposed methods:

$ python --version
Python 3.8.1
$ python -m timeit -s "a = list(range(1000))" "b = a.copy(); b.insert(500, 3)"
100000 loops, best of 5: 2.84 µsec per loop
$ python -m timeit -s "a = list(range(1000))" "b = a.copy(); b[500:500] = (3,)"
100000 loops, best of 5: 2.76 µsec per loop

Answer 4

Use the Python list insert() method. Usage:

Syntax

The syntax for the insert() method:

list.insert(index, obj)

Parameters

  • index − The index at which the object obj needs to be inserted.
  • obj − The object to be inserted into the given list.

Return value

This method does not return any value; it inserts the given element at the given index in place.

Example:

a = [1,2,4,5]

a.insert(2,3)

print(a)

Prints [1, 2, 3, 4, 5]