检查一个数据帧的项是否在另一个数据帧中定义的范围内，并且具有相同的索引

'input: df1, df2' df1 = pd.DataFrame([['ABMF', 'ASTECH', 'GPS', '2008-07-15', '2009-10-15', 2008.20, 2009.29], ['ABMF', 'LEICA', 'GPS+GLO', '2009-10-15', '2011-11-15', 2009.29, 2011.32], ['ABMF', 'SEPT', 'GPS+GLO', '2011-11-15', '2015-04-28', 2011.32, 2015.12], ['ABMF', 'TRIMBLE', 'GPS', '2015-04-28', '2019-04-15', 2015.12, 2019.11], ['ZIMM', 'ASTECH', 'GPS', '1993-05-01', '1997-08-06', 1993.12, 1997.22], ['ZIMM', 'SEPT', 'GPS', '1997-08-06', '2003-08-12', 1997.22, 2003.22], ['ZIMM', 'TRIMBLE', 'GPS', '2003-08-12', '2015-04-27', 2003.22, 2015.12]], columns=['Station ID','Receiver type','Satellite system','Date installed', 'Date removed','DOY installed','DOY removed']) df1.set_index(['Station ID','Receiver type'], inplace=True) df2 = pd.DataFrame([['ABMF', 'C1P', 'C2P', 2013.09, 2013.09, -1.25, 0.15], ['ABMF', 'C2W', 'C2X', 2013.10, 2013.10, -1.1, 0.1], ['ABMF', 'C2C', 'C2P', 2013.14, 2013.14, -1.115, 0.123], ['ABMF', 'C2W', 'C2X', 2013.22, 2013.22, -1.23, 0.12], ['ABMF', 'C2W', 'C2X', 2013.42, 2013.42, -1.7, 0.124], ['ZIMM', 'C2W', 'C2X', 2013.10, 2013.10, -1.21, 0.11], ['ZIMM', 'C2W', 'C2X', 2013.12, 2013.12, -1.14, 0.11], ['ZIMM', 'C2W', 'C2X', 2013.14, 2013.14, -1.41, 0.31]], columns=['Station ID','OBS1','OBS2','Bias start','Bias end','Value','Std']) df2.set_index('Station ID', inplace=True) 'desired output: df3' df3 = pd.DataFrame([['ABMF', 'C1P', 'C2P', 2013.09, 2013.09, -1.25, 0.15, 'SEPT'], ['ABMF', 'C2W', 'C2X', 2013.10, 2013.10, -1.1, 0.1, 'SEPT'], ['ABMF', 'C2C', 'C2P', 2013.14, 2013.14, -1.115, 0.123, 'SEPT'], ['ABMF', 'C2W', 'C2X', 2013.22, 2013.22, -1.23, 0.12, 'SEPT'], ['ABMF', 'C2W', 'C2X', 2013.42, 2013.42, -1.7, 0.124, 'SEPT'], ['ZIMM', 'C2W', 'C2X', 2013.10, 2013.10, -1.21, 0.11, 'TRIMBLE'], ['ZIMM', 'C2W', 'C2X', 2013.12, 2013.12, -1.14, 0.11, 'TRIMBLE'], ['ZIMM', 'C2W', 'C2X', 2013.14, 2013.14, -1.41, 0.31, 'TRIMBLE']], columns=['Station ID','OBS1','OBS2','Bias start','Bias end','Value','Std', 'Receiver type']) df3.set_index('Station ID', inplace=True)

1条回答

网友

1楼 · 发布于 2024-06-24 13:39:10

它是一个^{}操作，使用by参数为每个StationID工作。请注意，要执行此操作，需要对使用的列进行排序。其余的都是装饰性的，以满足预期的产出

df_ = (pd.merge_asof(
              df2.reset_index().sort_values(by='Bias start'),
              df1.reset_index().sort_values(by='DOY installed'), 
              by='Station ID',
              left_on='Bias start', right_on='DOY installed', 
              direction='backward'
              )
          [['Station ID'] + df2.columns.tolist() + ['Reciever type']]
          .sort_values(by=['Station ID', 'Bias start'])
          .set_index('Station ID')
      )
print(df_)
           OBS1 OBS2  Bias start  Bias end  Value    Std Reciever type
Station ID                                                            
ABMF        C1P  C2P     2013.09   2013.09 -1.250  0.150          SEPT
ABMF        C2W  C2X     2013.10   2013.10 -1.100  0.100          SEPT
ABMF        C2C  C2P     2013.14   2013.14 -1.115  0.123          SEPT
ABMF        C2W  C2X     2013.22   2013.22 -1.230  0.120          SEPT
ABMF        C2W  C2X     2013.42   2013.42 -1.700  0.124          SEPT
ZIMM        C2W  C2X     2013.10   2013.10 -1.210  0.110       TRIMBLE
ZIMM        C2W  C2X     2013.12   2013.12 -1.140  0.110       TRIMBLE
ZIMM        C2W  C2X     2013.14   2013.14 -1.410  0.310       TRIMBLE

相关问题更多 >

编程相关推荐

热门问题

热门文章