Python 实用宝典

Question 1

How can I insert an element at the first index of a list ? If I use list.insert(0,elem), do elem modify the content of the first index? Or do I have to create a new list with the first elem and then copy the old list inside this new one?

Question 2

Use insert:

In [1]: ls = [1,2,3]

In [2]: ls.insert(0, "new")

In [3]: ls
Out[3]: ['new', 1, 2, 3]

Question 3

From the documentation:

list.insert(i, x)
Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a),x) is equivalent to a.append(x)

http://docs.python.org/2/tutorial/datastructures.html#more-on-lists

Question 4

I have a dataframe:

s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])

df = pd.DataFrame([list(s1), list(s2)],  columns =  ["A", "B", "C"])

   A  B  C
0  5  6  7
1  7  8  9

[2 rows x 3 columns]

and I need to add a first row [2, 3, 4] to get:

I’ve tried append() and concat() functions but can’t find the right way how to do that.

How to add/insert series to dataframe?

Question 5

Just assign row to a particular index, using loc:

 df.loc[-1] = [2, 3, 4]  # adding a row
 df.index = df.index + 1  # shifting index
 df = df.sort_index()  # sorting by index

And you get, as desired:

See in Pandas documentation Indexing: Setting with enlargement.

Question 6

Not sure how you were calling concat() but it should work as long as both objects are of the same type. Maybe the issue is that you need to cast your second vector to a dataframe? Using the df that you defined the following works for me:

df2 = pd.DataFrame([[2,3,4]], columns=['A','B','C'])
pd.concat([df2, df])

Question 7

One way to achieve this is

>>> pd.DataFrame(np.array([[2, 3, 4]]), columns=['A', 'B', 'C']).append(df, ignore_index=True)
Out[330]: 
   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

Generally, it’s easiest to append dataframes, not series. In your case, since you want the new row to be “on top” (with starting id), and there is no function pd.prepend(), I first create the new dataframe and then append your old one.

ignore_index will ignore the old ongoing index in your dataframe and ensure that the first row actually starts with index 1 instead of restarting with index 0.

Typical Disclaimer: Cetero censeo … appending rows is a quite inefficient operation. If you care about performance and can somehow ensure to first create a dataframe with the correct (longer) index and then just inserting the additional row into the dataframe, you should definitely do that. See:

>>> index = np.array([0, 1, 2])
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[0:1] = [list(s1), list(s2)]
>>> df2
Out[336]: 
     A    B    C
0    5    6    7
1    7    8    9
2  NaN  NaN  NaN
>>> df2 = pd.DataFrame(columns=['A', 'B', 'C'], index=index)
>>> df2.loc[1:] = [list(s1), list(s2)]

So far, we have what you had as df:

>>> df2
Out[339]: 
     A    B    C
0  NaN  NaN  NaN
1    5    6    7
2    7    8    9

But now you can easily insert the row as follows. Since the space was preallocated, this is more efficient.

>>> df2.loc[0] = np.array([2, 3, 4])
>>> df2
Out[341]: 
   A  B  C
0  2  3  4
1  5  6  7
2  7  8  9

Question 8

I put together a short function that allows for a little more flexibility when inserting a row:

def insert_row(idx, df, df_insert):
    dfA = df.iloc[:idx, ]
    dfB = df.iloc[idx:, ]

    df = dfA.append(df_insert).append(dfB).reset_index(drop = True)

    return df

which could be further shortened to:

def insert_row(idx, df, df_insert):
    return df.iloc[:idx, ].append(df_insert).append(df.iloc[idx:, ]).reset_index(drop = True)

Then you could use something like:

df = insert_row(2, df, df_new)

where 2 is the index position in df where you want to insert df_new.

Question 9

We can use numpy.insert. This has the advantage of flexibility. You only need to specify the index you want to insert to.

s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])

df = pd.DataFrame([list(s1), list(s2)],  columns =  ["A", "B", "C"])

pd.DataFrame(np.insert(df.values, 0, values=[2, 3, 4], axis=0))

    0   1   2
0   2   3   4
1   5   6   7
2   7   8   9

For np.insert(df.values, 0, values=[2, 3, 4], axis=0), 0 tells the function the place/index you want to place the new values.

Question 10

this might seem overly simple but its incredible that a simple insert new row function isn’t built in. i’ve read a lot about appending a new df to the original, but i’m wondering if this would be faster.

df.loc[0] = [row1data, blah...]
i = len(df) + 1
df.loc[i] = [row2data, blah...]

Question 11

Below would be the best way to insert a row into pandas dataframe without sorting and reseting an index:

import pandas as pd

df = pd.DataFrame(columns=['a','b','c'])

def insert(df, row):
    insert_loc = df.index.max()

    if pd.isna(insert_loc):
        df.loc[0] = row
    else:
        df.loc[insert_loc + 1] = row

insert(df,[2,3,4])
insert(df,[8,9,0])
print(df)

Question 12

concat() seems to be a bit faster than last row insertion and reindexing. In case someone would wonder about the speed of two top approaches:

In [x]: %%timeit
     ...: df = pd.DataFrame(columns=['a','b'])
     ...: for i in range(10000):
     ...:     df.loc[-1] = [1,2]
     ...:     df.index = df.index + 1
     ...:     df = df.sort_index()

17.1 s ± 705 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [y]: %%timeit
     ...: df = pd.DataFrame(columns=['a', 'b'])
     ...: for i in range(10000):
     ...:     df = pd.concat([pd.DataFrame([[1,2]], columns=df.columns), df])

6.53 s ± 127 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Question 13

It is pretty simple to add a row into a pandas DataFrame:

Create a regular Python dictionary with the same columns names as your Dataframe;
Use pandas.append() method and pass in the name of your dictionary, where .append() is a method on DataFrame instances;
Add ignore_index=True right after your dictionary name.

Question 14

You can simply append the row to the end of the DataFrame, and then adjust the index.

For instance:

df = df.append(pd.DataFrame([[2,3,4]],columns=df.columns),ignore_index=True)
df.index = (df.index + 1) % len(df)
df = df.sort_index()

Or use concat as:

df = pd.concat([pd.DataFrame([[1,2,3,4,5,6]],columns=df.columns),df],ignore_index=True)

Question 15

The simplest way add a row in a pandas data frame is:

DataFrame.loc[ location of insertion ]= list( )

Example :

DF.loc[ 9 ] = [ ´Pepe’ , 33, ´Japan’ ]

NB: the length of your list should match that of the data frame.

Question 16

I have a list ‘abc’ and a dataframe ‘df’:

abc = ['foo', 'bar']
df =
    A  B
0  12  NaN
1  23  NaN

I want to insert the list into cell 1B, so I want this result:

    A  B
0  12  NaN
1  23  ['foo', 'bar']

Ho can I do that?

1) If I use this:

df.ix[1,'B'] = abc

I get the following error message:

ValueError: Must have equal len keys and value when setting with an iterable

because it tries to insert the list (that has two elements) into a row / column but not into a cell.

2) If I use this:

df.ix[1,'B'] = [abc]

then it inserts a list that has only one element that is the ‘abc’ list ( [['foo', 'bar']] ).

3) If I use this:

df.ix[1,'B'] = ', '.join(abc)

then it inserts a string: ( foo, bar ) but not a list.

4) If I use this:

df.ix[1,'B'] = [', '.join(abc)]

then it inserts a list but it has only one element ( ['foo, bar'] ) but not two as I want ( ['foo', 'bar'] ).

Thanks for help!

EDIT

My new dataframe and the old list:

abc = ['foo', 'bar']
df2 =
    A    B         C
0  12  NaN      'bla'
1  23  NaN  'bla bla'

Another dataframe:

df3 =
    A    B         C                    D
0  12  NaN      'bla'  ['item1', 'item2']
1  23  NaN  'bla bla'        [11, 12, 13]

I want insert the ‘abc’ list into df2.loc[1,'B'] and/or df3.loc[1,'B'].

If the dataframe has columns only with integer values and/or NaN values and/or list values then inserting a list into a cell works perfectly. If the dataframe has columns only with string values and/or NaN values and/or list values then inserting a list into a cell works perfectly. But if the dataframe has columns with integer and string values and other columns then the error message appears if I use this: df2.loc[1,'B'] = abc or df3.loc[1,'B'] = abc.

Another dataframe:

df4 =
          A     B
0      'bla'  NaN
1  'bla bla'  NaN

These inserts work perfectly: df.loc[1,'B'] = abc or df4.loc[1,'B'] = abc.

Question 17

Since set_value has been deprecated since version 0.21.0, you should now use at. It can insert a list into a cell without raising a ValueError as loc does. I think this is because at always refers to a single value, while loc can refer to values as well as rows and columns.

df = pd.DataFrame(data={'A': [1, 2, 3], 'B': ['x', 'y', 'z']})

df.at[1, 'B'] = ['m', 'n']

df =
    A   B
0   1   x
1   2   [m, n]
2   3   z

You also need to make sure the column you are inserting into has dtype=object. For example

>>> df = pd.DataFrame(data={'A': [1, 2, 3], 'B': [1,2,3]})
>>> df.dtypes
A    int64
B    int64
dtype: object

>>> df.at[1, 'B'] = [1, 2, 3]
ValueError: setting an array element with a sequence

>>> df['B'] = df['B'].astype('object')
>>> df.at[1, 'B'] = [1, 2, 3]
>>> df
   A          B
0  1          1
1  2  [1, 2, 3]
2  3          3

Question 18

df3.set_value(1, 'B', abc) works for any dataframe. Take care of the data type of column ‘B’. Eg. a list can not be inserted into a float column, at that case df['B'] = df['B'].astype(object) can help.

Question 19

Pandas >= 0.21

set_value has been deprecated. You can now use DataFrame.at to set by label, and DataFrame.iat to set by integer position.

Setting Cell Values with `at`/`iat`

# Setup
df = pd.DataFrame({'A': [12, 23], 'B': [['a', 'b'], ['c', 'd']]})
df

    A       B
0  12  [a, b]
1  23  [c, d]

df.dtypes

A     int64
B    object
dtype: object

If you want to set a value in second row of the “B” to some new list, use DataFrane.at:

df.at[1, 'B'] = ['m', 'n']
df

    A       B
0  12  [a, b]
1  23  [m, n]

You can also set by integer position using DataFrame.iat

df.iat[1, df.columns.get_loc('B')] = ['m', 'n']
df

    A       B
0  12  [a, b]
1  23  [m, n]

What if I get `ValueError: setting an array element with a sequence`?

I’ll try to reproduce this with:

df

    A   B
0  12 NaN
1  23 NaN

df.dtypes

A      int64
B    float64
dtype: object

df.at[1, 'B'] = ['m', 'n']
# ValueError: setting an array element with a sequence.

This is because of a your object is of float64 dtype, whereas lists are objects, so there’s a mismatch there. What you would have to do in this situation is to convert the column to object first.

df['B'] = df['B'].astype(object)
df.dtypes

A     int64
B    object
dtype: object

Then, it works:

df.at[1, 'B'] = ['m', 'n']
df

    A       B
0  12     NaN
1  23  [m, n]

Possible, But Hacky

Even more wacky, I’ve found you can hack through DataFrame.loc to achieve something similar if you pass nested lists.

df.loc[1, 'B'] = [['m'], ['n'], ['o'], ['p']]
df

    A             B
0  12        [a, b]
1  23  [m, n, o, p]

You can read more about why this works here.

Question 20

As mentionned in this post pandas: how to store a list in a dataframe?; the dtypes in the dataframe may influence the results, as well as calling a dataframe or not to be assigned to.

Question 21

Quick work around

Simply enclose the list within a new list, as done for col2 in the data frame below. The reason it works is that python takes the outer list (of lists) and converts it into a column as if it were containing normal scalar items, which is lists in our case and not normal scalars.

mydict={'col1':[1,2,3],'col2':[[1, 4], [2, 5], [3, 6]]}
data=pd.DataFrame(mydict)
data


   col1     col2
0   1       [1, 4]
1   2       [2, 5]
2   3       [3, 6]

Question 22

Also getting

ValueError: Must have equal len keys and value when setting with an iterable,

using .at rather than .loc did not make any difference in my case, but enforcing the datatype of the dataframe column did the trick:

df['B'] = df['B'].astype(object)

Then I could set lists, numpy array and all sorts of things as single cell values in my dataframes.

Question 23

I have this:

>>> a = [1, 2, 4]
>>> print a
[1, 2, 4]

>>> print a.insert(2, 3)
None

>>> print a
[1, 2, 3, 4]

>>> b = a.insert(3, 6)
>>> print b
None

>>> print a
[1, 2, 3, 6, 4]

Is there a way I can get the updated list as the result, instead of updating the original list in place?

Question 24

l.insert(index, obj) doesn’t actually return anything. It just updates the list.

As ATO said, you can do b = a[:index] + [obj] + a[index:]. However, another way is:

a = [1, 2, 4]
b = a[:]
b.insert(2, 3)

Question 25

Most performance efficient approach

You may also insert the element using the slice indexing in the list. For example:

>>> a = [1, 2, 4]
>>> insert_at = 2  # Index at which you want to insert item

>>> b = a[:]   # Created copy of list "a" as "b".
               # Skip this step if you are ok with modifying the original list

>>> b[insert_at:insert_at] = [3]  # Insert "3" within "b"
>>> b
[1, 2, 3, 4]

For inserting multiple elements together at a given index, all you need to do is to use a list of multiple elements that you want to insert. For example:

>>> a = [1, 2, 4]
>>> insert_at = 2   # Index starting from which multiple elements will be inserted

# List of elements that you want to insert together at "index_at" (above) position
>>> insert_elements = [3, 5, 6]

>>> a[insert_at:insert_at] = insert_elements
>>> a   # [3, 5, 6] are inserted together in `a` starting at index "2"
[1, 2, 3, 5, 6, 4]

Alternative using list comprehension (but very slow in terms of performance):

As an alternative, it can be achieved using list comprehension with enumerate too. (But please don’t do it this way. It is just for illustration):

>>> a = [1, 2, 4]
>>> insert_at = 2

>>> b = [y for i, x in enumerate(a) for y in ((3, x) if i == insert_at else (x, ))]
>>> b
[1, 2, 3, 4]

Performance comparison of all solutions

Here’s the timeit comparison of all the answers with list of 1000 elements for Python 3.4.5:

Mine answer using sliced insertion – Fastest (3.08 µsec per loop)

 mquadri$ python3 -m timeit -s "a = list(range(1000))" "b = a[:]; b[500:500] = [3]"
 100000 loops, best of 3: 3.08 µsec per loop

ATOzTOA’s accepted answer based on merge of sliced lists – Second (6.71 µsec per loop)

 mquadri$ python3 -m timeit -s "a = list(range(1000))" "b = a[:500] + [3] + a[500:]"
 100000 loops, best of 3: 6.71 µsec per loop

Rushy Panchal’s answer with most votes using list.insert(...)– Third (26.5 usec per loop)

 python3 -m timeit -s "a = list(range(1000))" "b = a[:]; b.insert(500, 3)"
 10000 loops, best of 3: 26.5 µsec per loop

My answer with List Comprehension and enumerate – Fourth (very slow with 168 µsec per loop)

 mquadri$ python3 -m timeit -s "a = list(range(1000))" "[y for i, x in enumerate(a) for y in ((3, x) if i == 500 else (x, )) ]"
 10000 loops, best of 3: 168 µsec per loop

Question 26

The shortest I got: b = a[:2] + [3] + a[2:]

>>>
>>> a = [1, 2, 4]
>>> print a
[1, 2, 4]
>>> b = a[:2] + [3] + a[2:]
>>> print a
[1, 2, 4]
>>> print b
[1, 2, 3, 4]

Question 27

The cleanest approach is to copy the list and then insert the object into the copy. On Python 3 this can be done via list.copy:

new = old.copy()
new.insert(index, value)

On Python 2 copying the list can be achieved via new = old[:] (this also works on Python 3).

In terms of performance there is no difference to other proposed methods:

$ python --version
Python 3.8.1
$ python -m timeit -s "a = list(range(1000))" "b = a.copy(); b.insert(500, 3)"
100000 loops, best of 5: 2.84 µsec per loop
$ python -m timeit -s "a = list(range(1000))" "b = a.copy(); b[500:500] = (3,)"
100000 loops, best of 5: 2.76 µsec per loop

Question 28

Use the Python list insert() method. Usage:

#Syntax

The syntax for the insert() method −

list.insert(index, obj)

#Parameters

index − This is the Index where the object obj need to be inserted.
obj − This is the Object to be inserted into the given list.

#Return Value This method does not return any value, but it inserts the given element at the given index.

Example:

a = [1,2,4,5]

a.insert(2,3)

print(a)

Returns [1, 2, 3, 4, 5]

Python 实用宝典

标签归档：insert

在熊猫数据框中插入一行

问题：在熊猫数据框中插入一行

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

有趣好用的Python教程

问题：在Python列表的第一位置插入[关闭]

回答 0

回答 1

问题：在熊猫数据框中插入一行

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

问题：Python Pandas将列表插入单元格

编辑

EDIT

回答 0

回答 1

回答 2

熊猫> = 0.21

使用at/ 设置单元格值iat

如果得到了ValueError: setting an array element with a sequence怎么办？

可能，但是哈基

Pandas >= 0.21

Setting Cell Values with at/iat

What if I get ValueError: setting an array element with a sequence?

Possible, But Hacky

回答 3

回答 4

回答 5

问题：在列表中的特定索引处插入元素，然后返回更新后的列表

回答 0

回答 1

最高效的方法

所有解决方案的性能比较

Most performance efficient approach

Performance comparison of all solutions

回答 2

回答 3

回答 4

有趣好用的Python教程

使用`at`/ 设置单元格值`iat`

如果得到了`ValueError: setting an array element with a sequence`怎么办？

Setting Cell Values with `at`/`iat`

What if I get `ValueError: setting an array element with a sequence`?