Creating a pairwise DataFrame that matches more than two columns

Posted 2024-10-03 21:34:52


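I'm currently using the following code to match every value in a column of one DataFrame against every value in a column of another DataFrame: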

from itertools import product
import pandas as pd

new_df = pd.DataFrame(product(df1['CompanyA'], df2['CompanyB']), columns=["CompanyA","CompanyB"])

new_df_address1 = pd.DataFrame(product(df1['Address1A'], df2['Address1B']), columns=["Address1A","Address1B"]) 

new_df_postcode = pd.DataFrame(product(df1['PostcodeA'], df2['PostcodeB']), columns=["PostcodeA","PostcodeB"]) 

(and a few more pairs with the same code)

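What I'd like to do is have the initial CompanyA/CompanyB pairing also carry Address1A, Address1B, PostcodeA, PostcodeB, etc., so that I end up with a single DataFrame containing all of the information.

I'd like to avoid computing each pair of columns separately and then attaching the results to one another, in case the row order gets mixed up between computations.

Thanks

Edit: sample data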

df1:
CompanyA      Address1A        Address2A   PostcodeA ...
Trees inc.    1 Hill Street    London      FH5 8YB

df2:

CompanyB      Address1B        Address2B   PostcodeB ...
Boxes inc.    4 High Street    York        AK5 FJ6
Hats inc.     17 River Lane    Bolton      YT5 9NB

Paired df:

CompanyA      Address1A        Address2A   PostcodeA  CompanyB      Address1B        Address2B   PostcodeB ...
Trees inc.    1 Hill Street    London      FH5 8YB    Boxes inc.    4 High Street    York        AK5 FJ6
Trees inc.    1 Hill Street    London      FH5 8YB    Hats inc.     17 River Lane    Bolton      YT5 9NB 
etc

The desired output is every row of df1 mapped to every row of df2.

Thanks


1 answer
User
#1 · Posted 2024-10-03 21:34:52

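IIUC, you just need the Cartesian product of the two DataFrames, which you can get by merging them on a constant helper key: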

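# Give both frames the same constant key so every row of df1 matches every row of df2,
# then drop the helper column. (Positional axis in drop("key", 1) no longer works in pandas 2.x.)
df = pd.merge(df1.assign(key="var1"), df2.assign(key="var1"), on="key", how="right").drop(
    columns="key"
)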

print(df)

     CompanyA      Address1A Address2A PostcodeA    CompanyB      Address1B  \
0  Trees inc.  1 Hill Street    London   FH5 8YB  Boxes inc.  4 High Street   
1  Trees inc.  1 Hill Street    London   FH5 8YB   Hats inc.  17 River Lane   

  Address2B PostcodeB  
0      York   AK5 FJ6  
1    Bolton   YT5 9NB  
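
As a possible alternative, and assuming pandas 1.2 or later is available, the same row-by-row pairing can be written without the helper key by using how="cross". The sketch below rebuilds df1 and df2 from the sample data in the question (only the columns shown there; the elided "..." columns are left out), so treat it as an illustration and not as the asker's real data:

```python
import pandas as pd

# Sample frames rebuilt from the question's data (only the columns shown there).
df1 = pd.DataFrame({
    "CompanyA": ["Trees inc."],
    "Address1A": ["1 Hill Street"],
    "Address2A": ["London"],
    "PostcodeA": ["FH5 8YB"],
})
df2 = pd.DataFrame({
    "CompanyB": ["Boxes inc.", "Hats inc."],
    "Address1B": ["4 High Street", "17 River Lane"],
    "Address2B": ["York", "Bolton"],
    "PostcodeB": ["AK5 FJ6", "YT5 9NB"],
})

# how="cross" pairs every row of df1 with every row of df2 (requires pandas >= 1.2).
# Whole rows are paired at once, so the columns of each row stay together.
paired = df1.merge(df2, how="cross")
print(paired)
```

Because whole rows are paired in a single step, there is no need to build each column pair separately and stitch them back together afterwards.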
