Merging another DataFrame into existing rows

Posted 2024-10-03 04:27:02


I have two DataFrames, df and subs, shown below:

import numpy as np
import pandas as pd

df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": ["London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})

   scode sname  sub1    sub2
0   11   aa     London  NaN
1   22   bb     NaN     NaN
2   33   cc     Delhi   Sydney
3   44   dd     NaN     NaN

subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})

    0   1               2
0   22  Milford Sound   Oslo
1   44  Queenstown      NaN

How can I merge the two DataFrames so that I end up with the following result?

    scode   sname   sub1            sub2
0   11      aa      London          NaN
1   22      bb      Milford Sound   Oslo
2   33      cc      Delhi           Sydney
3   44      dd      Queenstown      NaN

2 answers

First, make the column names match:

newSub = subs.rename(columns={0:'scode', 1:'sub1', 2:'sub2'})

Next, the DataFrame update method does what you want by matching source and target rows on their shared index. So let's set the index to scode on both frames:

indexedDF     = df.set_index('scode')
indexedNewSub = newSub.set_index('scode')

Finally, call indexedDF's update method to update it in place:

indexedDF.update(indexedNewSub)

indexedDF should now have subs merged into it as requested.
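For reference, the steps in this answer can be put together into one runnable script. This is only a sketch: the snake_case variable names and the final reset_index call (to turn scode back into a regular column) are additions for illustration, not part of the answer itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", np.nan],
})
subs = pd.DataFrame({
    0: [22, 44],
    1: ["Milford Sound", "Queenstown"],
    2: ["Oslo", np.nan],
})

# Give subs the same column names as df and index both frames by scode.
new_sub = subs.rename(columns={0: "scode", 1: "sub1", 2: "sub2"}).set_index("scode")
indexed_df = df.set_index("scode")

# update() modifies indexed_df in place, matching rows by index label.
# By default only non-NaN values from new_sub are copied over.
indexed_df.update(new_sub)

# Assumed extra step: restore scode as an ordinary column.
result = indexed_df.reset_index()
print(result)
```

Because update skips missing values in the incoming frame by default, the NaN in subs for scode 44 leaves the existing sub2 cell untouched rather than overwriting it.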

Pandas aligns on index and columns automatically; you only need to make sure the right index is set. Assuming scode is the key you want to merge on:

In [5]: df = pd.DataFrame({"scode": [11, 22, 33, 44], "sname": ["aa", "bb", "cc", "dd"], "sub1": [ "London", np.nan, "Delhi", np.nan], "sub2": [np.nan, np.nan, "Sydney", np.nan]})

In [6]: df.set_index('scode',inplace=True)

In [7]: subs = pd.DataFrame({0: [22, 44], 1: ["Milford Sound", "Queenstown"], 2: ["Oslo", np.nan]})
    ...:

In [8]: subs.set_index(0, inplace=True)

In [9]: subs.columns=['sub1','sub2']

This gives you the following:

In [10]: df
Out[10]:
      sname    sub1    sub2
scode
11       aa  London     NaN
22       bb     NaN     NaN
33       cc   Delhi  Sydney
44       dd     NaN     NaN

In [11]: subs
Out[11]:
             sub1  sub2
0
22  Milford Sound  Oslo
44     Queenstown   NaN

Now just do an ordinary assignment, selecting the appropriate rows and columns:

In [12]: df.loc[subs.index.values,['sub1', 'sub2']] = subs

In [13]: df
Out[13]:
      sname           sub1    sub2
scode
11       aa         London     NaN
22       bb  Milford Sound    Oslo
33       cc          Delhi  Sydney
44       dd     Queenstown     NaN

You can always reset the index you set earlier:

In [14]: df.reset_index(inplace=True)

In [15]: df
Out[15]:
   scode sname           sub1    sub2
0     11    aa         London     NaN
1     22    bb  Milford Sound    Oslo
2     33    cc          Delhi  Sydney
3     44    dd     Queenstown     NaN
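The IPython session above can also be written as a single script. The sketch below follows the same steps; the names cols and merged are introduced here for illustration, and subs.index is passed to .loc directly instead of subs.index.values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "scode": [11, 22, 33, 44],
    "sname": ["aa", "bb", "cc", "dd"],
    "sub1": ["London", np.nan, "Delhi", np.nan],
    "sub2": [np.nan, np.nan, "Sydney", np.nan],
}).set_index("scode")

subs = pd.DataFrame({
    0: [22, 44],
    1: ["Milford Sound", "Queenstown"],
    2: ["Oslo", np.nan],
}).set_index(0)
subs.columns = ["sub1", "sub2"]

# Assign into the rows whose labels appear in subs; pandas aligns the
# right-hand side on both row labels and column names.
cols = ["sub1", "sub2"]
df.loc[subs.index, cols] = subs[cols]

# Turn scode back into a regular column.
merged = df.reset_index()
print(merged)
```

One difference from the update approach is worth keeping in mind: this assignment writes every selected cell, so a NaN in subs would replace an existing value in df. With the sample data that makes no difference, because the only NaN in subs lands on a cell that is already NaN.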
