How do I add an empty column to a DataFrame?

Question 1

What’s the easiest way to add an empty column to a pandas DataFrame object? The best I’ve stumbled upon is something like

df['foo'] = df.apply(lambda _: '', axis=1)

Is there a less perverse method?

Question 2

If I understand correctly, assignment should fill:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
>>> df
   A  B
0  1  2
1  2  3
2  3  4
>>> df["C"] = ""
>>> df["D"] = np.nan
>>> df
   A  B C   D
0  1  2   NaN
1  2  3   NaN
2  3  4   NaN
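
As a minimal sketch of what the two assignments above produce: the scalar is broadcast to every row, and the two columns end up with different types. The helper names `all_empty_strings` and `all_missing` are my own, and the exact dtype reported for the string column may depend on the pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})
df["C"] = ""       # the scalar is broadcast to every row
df["D"] = np.nan   # NaN gives a float column of missing values

# Check the contents rather than relying on the printed layout.
all_empty_strings = bool((df["C"] == "").all())
all_missing = bool(df["D"].isna().all())
print(df.dtypes)
```

Which of the two you want depends on what "empty" should mean later on: `isna()` treats the NaN column as missing, but not the empty-string column.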

Question 3

To add to DSM’s answer and building on this associated question, I’d split the approach into two cases:

Adding a single column: Just assign empty values to the new column, e.g. df['C'] = np.nan
Adding multiple columns: I’d suggest using the .reindex(columns=[...]) method of pandas to add the new columns to the dataframe’s column index. This also works for adding multiple new rows with .reindex(index=[...]). Note that newer versions of pandas (v>0.20) allow you to pass the labels together with an axis keyword rather than naming columns or index explicitly.

Here is an example adding multiple columns:

mydf = mydf.reindex(columns = mydf.columns.tolist() + ['newcol1','newcol2'])

or

mydf = mydf.reindex(mydf.columns.tolist() + ['newcol1','newcol2'], axis=1)  # version > 0.20.0

You can also always concatenate a new (empty) dataframe to the existing dataframe, but that doesn’t feel as pythonic to me :)
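
For completeness, the concatenation route mentioned above might look roughly like this. It is only a sketch: the sample data and the name `empty` are assumptions of mine, and the empty frame is given the same index as `mydf` so that the rows line up.

```python
import pandas as pd

mydf = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

# An all-NaN frame that shares mydf's index, holding only the new columns.
empty = pd.DataFrame(columns=["newcol1", "newcol2"], index=mydf.index)

# axis=1 places the new columns to the right of the existing ones.
result = pd.concat([mydf, empty], axis=1)
print(result)
```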

Question 4

An even simpler solution is:

df = df.reindex(columns = header_list)

where “header_list” is a list of the headers you want to appear.

Any header in the list that is not already in the dataframe will be added as a column filled with NaN. Note that existing columns missing from the list are dropped.

So if

header_list = ['a','b','c', 'd']

and the dataframe only has columns a and b, then c and d will be added as columns filled with NaN.
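
A small sketch of this behaviour, using a made-up frame with an extra column `x` (my own addition) to show that reindexing by a header list both adds and removes columns:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "x": [5, 6]})
header_list = ["a", "b", "c", "d"]

# Columns named in header_list are kept or created; anything else is dropped.
out = df.reindex(columns=header_list)
print(out)
```

If the existing columns must all survive, build the list from `df.columns.tolist()` plus the new names instead of writing it by hand.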

Question 5

Starting with v0.16.0, DF.assign() can be used to assign new columns (single/multiple) to a DF. The new columns are appended at the end of the DF; older versions inserted them in alphabetical order, while pandas 0.23+ on Python 3.6+ keeps the keyword-argument order.

This becomes advantageous compared to simple assignment in cases wherein you want to perform a series of chained operations directly on the returned dataframe.

Consider the same DF sample demonstrated by @DSM:

df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
df
Out[18]:
   A  B
0  1  2
1  2  3
2  3  4

df.assign(C="",D=np.nan)
Out[21]:
   A  B C   D
0  1  2   NaN
1  2  3   NaN
2  3  4   NaN

Note that this returns a copy with all the previous columns along with the newly created ones. To modify the original DF, reassign the result: df = df.assign(...), since assign() does not support in-place operation.
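
To illustrate the chaining advantage mentioned above, here is a hedged sketch. The filter step is only an example of a follow-up operation and is not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

# assign() returns a new frame, so further steps can be chained onto it.
result = (
    df.assign(C="", D=np.nan)
      .loc[lambda d: d["A"] > 1]   # keep rows where A is greater than 1
)

print(result)
print(df.columns.tolist())  # the original frame is left unchanged
```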

Question 6

I like:

df['new'] = pd.Series(dtype='your_required_dtype')

If you have an empty dataframe, this solution makes sure that no new row containing only NaN is added.

Specifying dtype is not strictly necessary; however, some pandas versions produce a DeprecationWarning if it is omitted.
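
A short sketch of this approach on both an empty and a non-empty frame. The dtype `float64` simply stands in for `your_required_dtype`, and the frame contents are invented for illustration.

```python
import pandas as pd

# Empty frame: the new column is added without creating any rows.
empty = pd.DataFrame({"A": pd.Series(dtype="int64")})
empty["new"] = pd.Series(dtype="float64")

# Non-empty frame: the empty Series is aligned on the index, giving NaN.
df = pd.DataFrame({"A": [1, 2, 3]})
df["new"] = pd.Series(dtype="float64")

print(empty.shape)
print(df)
```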

Question 7

If you want to add columns whose names come from a list:

df = pd.DataFrame()
a = ['col1', 'col2', 'col3', 'col4']
for i in a:
    df[i] = np.nan

Question 8

@emunsing’s answer is really cool for adding multiple columns, but I couldn’t get it to work for me in Python 2.7. Instead, I found this works:

mydf = mydf.reindex(columns = np.append(mydf.columns.values, ['newcol1','newcol2']))

Question 9

The code below addresses the question “How do I add n empty columns to my existing dataframe?”. In the interest of keeping solutions to similar problems in one place, I am adding it here.

Approach 1 (to create 64 additional columns with column names from 1-64)

m = list(range(1, 65))  # column names 1..64
dd = pd.DataFrame(columns=m)
df.join(dd).replace(np.nan, '')  # df is the dataframe that already exists

Approach 2 (to create 64 additional columns with column names from 1-64)

df.reindex(df.columns.tolist() + list(range(1,65,1)), axis=1).replace(np.nan,'')

Question 10

You can do

df['column'] = None  # This works: it creates a new column filled with None
df.column = None  # This changes a column only if 'column' already exists; otherwise it just sets a Python attribute on df
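
The difference between the two forms can be checked with a small sketch. The frame and the name `other` are hypothetical, and warnings are silenced only in case the installed pandas version chooses to warn about the attribute assignment.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# Item assignment creates the column.
df["column"] = None

# Attribute assignment with a new name does not create a column.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.other = None

print(df.columns.tolist())
```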

Question 11

One can use df.insert(index_to_insert_at, column_header, init_value) to insert a new column at a specific position.

cost_tbl.insert(1, "col_name", "")

The above statement inserts an empty column after the first column.
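
A runnable sketch of the same call, with made-up contents for `cost_tbl` (the column names `item` and `price` are assumptions). Note that insert() modifies the frame in place and returns None.

```python
import pandas as pd

cost_tbl = pd.DataFrame({"item": ["a", "b"], "price": [1.0, 2.0]})

# Position 1 means "second column", i.e. right after the first one.
returned = cost_tbl.insert(1, "col_name", "")

print(cost_tbl.columns.tolist())
```

Because the method works in place, writing `cost_tbl = cost_tbl.insert(...)` would replace the frame with None.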
