How do I merge two DataFrames while dropping the rows of the first DataFrame that share an index with the second?

Posted 2024-10-02 12:31:10


Suppose I have two DataFrames, df1 and df2:

subject_id first_name last_name        
1                Alex  Anderson
2                 Amy  Ackerman
3               Allen       Ali
4               Alice      Aoni
5              Ayoung   Atiches

subject_id first_name last_name
4               Billy    Bonder
5               Brian     Black
6                Bran   Balwner
7               Bryce     Brice
8               Betty    Btisan

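Assuming subject_id is the index of both, how can I get the following result, where df2's rows replace the df1 rows with the same subject_id: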

subject_id first_name last_name        
1                Alex  Anderson
2                 Amy  Ackerman
3               Allen       Ali
4               Billy    Bonder
5               Brian     Black
6                Bran   Balwner
7               Bryce     Brice
8               Betty    Btisan

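While I'm at it, how can I get this result instead, where df1's rows are kept for the overlapping subject_id values: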

subject_id first_name last_name        
1                Alex  Anderson
2                 Amy  Ackerman
3               Allen       Ali
4               Alice      Aoni
5              Ayoung   Atiches
6                Bran   Balwner
7               Bryce     Brice
8               Betty    Btisan

2 Answers

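We can use pd.concat followed by drop_duplicates. This works when subject_id is a regular column; keep='first' keeps df1's rows for duplicated ids, and keep='last' keeps df2's rows.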

pd.concat([df1,df2]).drop_duplicates('subject_id',keep='first')

Out[95]:
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Alice      Aoni
4           5     Ayoung   Atiches
2           6       Bran   Balwner
3           7      Bryce     Brice
4           8      Betty    Btisan

pd.concat([df1,df2]).drop_duplicates('subject_id',keep='last')

Out[96]:
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
0           4      Billy    Bonder
1           5      Brian     Black
2           6       Bran   Balwner
3           7      Bryce     Brice
4           8      Betty    Btisan
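The question says subject_id is the index, whereas drop_duplicates('subject_id') expects it to be a column. A minimal sketch of an index-based variant follows, assuming the frames are built as below; the variable names combined, prefer_df2 and prefer_df1 are made up for illustration. It filters the concatenated frame with Index.duplicated rather than drop_duplicates.

```python
import pandas as pd

# Sample frames from the question, with subject_id as the index.
df1 = pd.DataFrame(
    {"first_name": ["Alex", "Amy", "Allen", "Alice", "Ayoung"],
     "last_name": ["Anderson", "Ackerman", "Ali", "Aoni", "Atiches"]},
    index=pd.Index([1, 2, 3, 4, 5], name="subject_id"),
)
df2 = pd.DataFrame(
    {"first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty"],
     "last_name": ["Bonder", "Black", "Balwner", "Brice", "Btisan"]},
    index=pd.Index([4, 5, 6, 7, 8], name="subject_id"),
)

combined = pd.concat([df1, df2])

# keep="last": for a repeated subject_id, the df2 row survives.
prefer_df2 = combined[~combined.index.duplicated(keep="last")]

# keep="first": for a repeated subject_id, the df1 row survives.
prefer_df1 = combined[~combined.index.duplicated(keep="first")]

print(prefer_df2)
print(prefer_df1)
```

If the result should carry subject_id as a column again, reset_index() can be called on either frame.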

Use combine_first, calling set_index first if necessary:

df11 = df1.set_index('subject_id')
df22 = df2.set_index('subject_id')

df3 = df22.combine_first(df11).reset_index()
print(df3)
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Billy    Bonder
4           5      Brian     Black
5           6       Bran   Balwner
6           7      Bryce     Brice
7           8      Betty    Btisan

df3 = df11.combine_first(df22).reset_index()
print(df3)
   subject_id first_name last_name
0           1       Alex  Anderson
1           2        Amy  Ackerman
2           3      Allen       Ali
3           4      Alice      Aoni
4           5     Ayoung   Atiches
5           6       Bran   Balwner
6           7      Bryce     Brice
7           8      Betty    Btisan
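One point worth checking before relying on combine_first: it fills missing values cell by cell, so a NaN in the preferred frame is replaced by the other frame's value instead of the whole row being taken as-is. The sketch below illustrates this with a hypothetical cut-down version of the data in which one last_name is missing; the missing value and the names filled and whole_rows are assumptions made for the example, not part of the original question.

```python
import numpy as np
import pandas as pd

df11 = pd.DataFrame(
    {"first_name": ["Allen", "Alice"], "last_name": ["Ali", "Aoni"]},
    index=pd.Index([3, 4], name="subject_id"),
)
# Hypothetical variant of df22: the last_name for subject_id 4 is missing.
df22 = pd.DataFrame(
    {"first_name": ["Billy", "Bran"], "last_name": [np.nan, "Balwner"]},
    index=pd.Index([4, 6], name="subject_id"),
)

# combine_first patches the NaN in df22 with the value from df11,
# so subject_id 4 ends up mixing columns from both frames.
filled = df22.combine_first(df11)
print(filled)

# Concatenating and dropping duplicated index labels keeps df22's
# row for subject_id 4 unchanged, including its NaN.
stacked = pd.concat([df11, df22])
whole_rows = stacked[~stacked.index.duplicated(keep="last")]
print(whole_rows)
```

When neither frame contains missing values, as in the question's data, the two approaches select the same rows.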
