Question: Appending a column to a pandas DataFrame

This is probably easy, but I have the following data:

In data frame 1:

index dat1
0     9
1     5

In data frame 2:

index dat2
0     7
1     6

I want a data frame with the following form:

index dat1  dat2
0     9     7
1     5     6

I’ve tried using the append method, but I get a cross join (i.e. cartesian product).

What’s the right way to do this?


Answer 0

It seems in general you’re just looking for a join:

> import pandas as pd
> dat1 = pd.DataFrame({'dat1': [9,5]})
> dat2 = pd.DataFrame({'dat2': [7,6]})
> dat1.join(dat2)
   dat1  dat2
0     9     7
1     5     6

Answer 1

You can also use:

dat1 = pd.concat([dat1, dat2], axis=1)
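
As a quick sanity check, here is a minimal sketch comparing concat along axis=1 with join. It assumes the same two toy frames as in the question, and the variable names joined and combined are only illustrative:

```python
import pandas as pd

# Same toy frames as in the question; both carry the default 0..n-1 index.
dat1 = pd.DataFrame({'dat1': [9, 5]})
dat2 = pd.DataFrame({'dat2': [7, 6]})

joined = dat1.join(dat2)                    # left join on the index
combined = pd.concat([dat1, dat2], axis=1)  # outer join on the index by default

print(combined)
print(joined.equals(combined))  # the two results match when the indexes are identical
```

The two calls only agree here because both frames share the same index labels. With differing labels, join keeps the left frame's labels while concat keeps the union of both.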

Answer 2

Both join() and concat() solve the problem. One caveat, though: if either DataFrame was built by selecting rows from another DataFrame, reset its index before calling join() or concat(), because both methods align rows on index labels rather than on position.

The example below shows how join and concat behave when the indexes do not match:

dat1 = pd.DataFrame({'dat1': range(4)})
dat2 = pd.DataFrame({'dat2': range(4,8)})
dat1.index = [1,3,5,7]
dat2.index = [2,4,6,8]

# Way 1: join the two DataFrames (keeps only the labels of dat1)
print(dat1.join(dat2))
# output
   dat1  dat2
1     0   NaN
3     1   NaN
5     2   NaN
7     3   NaN

# Way 2: concat the two DataFrames (keeps the union of both sets of labels)
print(pd.concat([dat1,dat2],axis=1))
# output
   dat1  dat2
1   0.0   NaN
2   NaN   4.0
3   1.0   NaN
4   NaN   5.0
5   2.0   NaN
6   NaN   6.0
7   3.0   NaN
8   NaN   7.0

# Reset both indexes to 0..n-1, discarding the old labels
dat1 = dat1.reset_index(drop=True)
dat2 = dat2.reset_index(drop=True)
# Both approaches now give the same result

print(dat1.join(dat2))
   dat1  dat2
0     0     4
1     1     5
2     2     6
3     3     7


print(pd.concat([dat1,dat2],axis=1))
   dat1  dat2
0     0     4
1     1     5
2     2     6
3     3     7
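
The row-selection case mentioned in the warning can be sketched as follows. The frames df, subset and other, and the filter condition, are made up for illustration and are not from the original answer:

```python
import pandas as pd

df = pd.DataFrame({'a': [10, 20, 30, 40], 'b': [1, 2, 3, 4]})

# Selecting rows keeps the original index labels (here 1 and 3).
subset = df[df['a'] % 20 == 0][['a']]

# A freshly built frame gets the default labels 0 and 1.
other = pd.DataFrame({'c': [7, 8]})

# join aligns on labels, so only label 1 finds a partner; label 3 gets NaN.
misaligned = subset.join(other)
print(misaligned)

# After resetting the index, both frames use labels 0 and 1 and line up by position.
aligned = subset.reset_index(drop=True).join(other)
print(aligned)
```

Note that the misaligned result is not only missing a value: the one value it does pick up (c at label 1) comes from the second row of other, which is probably not the row that was intended.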

Answer 3

In short, assign the result of the join:

data_joined = dat1.join(dat2)
print(data_joined)

Answer 4

Just a matter of the right Google search:

# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows the same way
data = pd.concat([dat_1, dat_2])
data = data.groupby(data.index).sum()
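
To see why this works, here is a step-by-step sketch. It assumes dat_1 and dat_2 hold the question's data, uses pd.concat for the row-stacking step (DataFrame.append was removed in pandas 2.0), and introduces the intermediate name stacked only for illustration:

```python
import pandas as pd

dat_1 = pd.DataFrame({'dat1': [9, 5]})
dat_2 = pd.DataFrame({'dat2': [7, 6]})

# Stacking rows gives 4 rows with index 0, 1, 0, 1 and NaN wherever a column is missing.
stacked = pd.concat([dat_1, dat_2])
print(stacked)

# Grouping on the index collapses the duplicate labels; sum() skips the NaN entries.
data = stacked.groupby(stacked.index).sum()
print(data)
```

This approach has side effects that join and concat along axis=1 avoid: the integer columns come back as floats because of the intermediate NaN values, several genuine values under the same label would be added together, and a label with no value in a column ends up as 0 rather than NaN.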
