Compare a master DataFrame with a child DataFrame and extract new rows based on only two column values

Posted 2024-05-18 06:10:46

I have two DataFrames.

Master DataFrame:

Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,110.0,1.25,26.0,105308.9,81.05,1.2,2200
JETAIRWAYS,120.0,1.0,32.0,96156.9,81.05,1.15,2200
PCJEWELLER,77.5,0.95,27.0,171217.0,56.95,1.3,6500
PCJEWELLER,80.0,0.8,29.0,161207.0,56.95,0.95,6500
PCJEWELLER,82.5,0.55,31.0,154772.0,56.95,0.95,6500
PCJEWELLER,85.0,0.6,33.0,147882.0,56.95,0.7,6500
PCJEWELLER,90.0,0.5,37.0,138977.0,56.95,0.55,6500

and the child DataFrame:

Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,110.0,1.25,26.0,105308.9,81.05,1.2,2200
JETAIRWAYS,150.0,1.3,22.0,44156.9,81.05,1.05,2200
PCJEWELLER,77.5,0.95,27.0,171217.0,56.95,1.3,6500
PCJEWELLER,100.0,1.8,29.0,441207.0,46.95,4.95,6500

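I want to compare child_DF against master_DF based on the columns (Symbol, Strike_Price) only, i.e. if a Symbol & Strike_Price combination is already present in master_DF, that row should not be treated as new data.

The new rows are: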

Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,150.0,1.3,22.0,44156.9,81.05,1.05,2200
PCJEWELLER,100.0,1.8,29.0,441207.0,46.95,4.95,6500

2 answers
  1. First, merge the DataFrames on Symbol and Strike_Price, setting indicator=True and how='right':

result = pd.merge(master_df[['Symbol','Strike_Price']],child_df,on=['Symbol','Strike_Price'],indicator=True,how='right')

  2. Then keep only the rows whose _merge column is right_only to get the desired result:

    result = result[result['_merge']=='right_only']
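
The two steps above can be put together as a minimal runnable sketch. It loads the question's sample data from inline CSV strings; the names master_csv, child_csv and new_rows are illustrative assumptions, not part of the original answer, and dropping the _merge helper column at the end is an optional extra step.

```python
import io

import pandas as pd

# Sample data copied from the question (inline CSV is an assumption for the demo).
master_csv = """Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,110.0,1.25,26.0,105308.9,81.05,1.2,2200
JETAIRWAYS,120.0,1.0,32.0,96156.9,81.05,1.15,2200
PCJEWELLER,77.5,0.95,27.0,171217.0,56.95,1.3,6500
PCJEWELLER,80.0,0.8,29.0,161207.0,56.95,0.95,6500
PCJEWELLER,82.5,0.55,31.0,154772.0,56.95,0.95,6500
PCJEWELLER,85.0,0.6,33.0,147882.0,56.95,0.7,6500
PCJEWELLER,90.0,0.5,37.0,138977.0,56.95,0.55,6500
"""
child_csv = """Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,110.0,1.25,26.0,105308.9,81.05,1.2,2200
JETAIRWAYS,150.0,1.3,22.0,44156.9,81.05,1.05,2200
PCJEWELLER,77.5,0.95,27.0,171217.0,56.95,1.3,6500
PCJEWELLER,100.0,1.8,29.0,441207.0,46.95,4.95,6500
"""

master_df = pd.read_csv(io.StringIO(master_csv))
child_df = pd.read_csv(io.StringIO(child_csv))

keys = ['Symbol', 'Strike_Price']

# Step 1: right merge on the two key columns. Only the key columns of
# master_df are used, so the child's other columns keep their names.
result = pd.merge(master_df[keys], child_df, on=keys,
                  indicator=True, how='right')

# Step 2: rows marked right_only exist in child_df but not in master_df.
new_rows = (result[result['_merge'] == 'right_only']
            .drop(columns='_merge')
            .reset_index(drop=True))

print(new_rows)
```

Because only the key columns of master_df take part in the merge, a child row whose other values differ from the master but whose Symbol and Strike_Price already exist is still treated as known, which is what the question asks for.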

You can use a right merge together with indicator=True, then query for "right_only", and finally reindex to get the columns in the child's order:

(master.merge(child,on=['Symbol','Strike_Price'],how='right',
          suffixes=('_',''),indicator=True)
    .query('_merge=="right_only"')).reindex(child.columns,axis=1)

       Symbol  Strike_Price  C_BidPrice  Pecentage  Margin_Req  Underlay  \
2  JETAIRWAYS         150.0         1.3       22.0     44156.9     81.05   
3  PCJEWELLER         100.0         1.8       29.0    441207.0     46.95   

   C_LTP  LotSize  
2   1.05     2200  
3   4.95     6500  
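
As a hedged illustration of why this second approach passes suffixes=('_', '') and then reindexes, the sketch below wraps the same chain in a small helper. The function name find_new_rows and the three-row toy frames are hypothetical, chosen only to show that a row with a known key but different non-key values is not reported as new.

```python
import pandas as pd


def find_new_rows(master, child, keys):
    """Return rows of `child` whose key combination is absent from `master`.

    Hypothetical helper wrapping the merge/query/reindex chain shown above.
    """
    merged = master.merge(
        child,
        on=keys,
        how='right',
        # Master's overlapping columns get a trailing '_';
        # the child's columns keep their original names.
        suffixes=('_', ''),
        indicator=True,
    )
    # reindex keeps only the child's columns, in the child's order,
    # which also drops the '_'-suffixed master columns and '_merge'.
    return merged.query('_merge == "right_only"').reindex(child.columns, axis=1)


# Toy data (assumed for the demo, not from the question).
master = pd.DataFrame({
    'Symbol': ['A', 'A', 'B'],
    'Strike_Price': [10.0, 20.0, 30.0],
    'C_LTP': [1.0, 2.0, 3.0],
})
child = pd.DataFrame({
    'Symbol': ['A', 'B', 'B'],
    'Strike_Price': [10.0, 30.0, 40.0],
    'C_LTP': [1.0, 3.5, 4.0],  # (B, 30.0) differs in C_LTP but shares its key
})

new = find_new_rows(master, child, ['Symbol', 'Strike_Price'])
print(new)
```

Only the (B, 40.0) row comes back here, since (B, 30.0) already has a matching key in the master frame even though its C_LTP value differs.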