检查一个数据帧的项是否在另一个数据帧中定义的范围内,并且具有相同的索引

2024-06-24 13:39:10 发布

您现在位置:Python中文网/ 问答频道 /正文

我有两个从文件创建的数据帧: enter image description hereenter image description here

我想检查由DOY installedDOY removed列组成的哪个范围是Bias start列中的值,但需要在由两个dfs中匹配的Station ID索引组成的组中进行。之后,我想创建第三个数据帧,它将由第二个df中的所有列和根据范围条件选择的Receiver type组成。以下是期望的输出:

enter image description here

和代码:

'input: df1, df2'
df1 = pd.DataFrame([['ABMF', 'ASTECH', 'GPS', '2008-07-15', '2009-10-15', 2008.20, 2009.29],
                    ['ABMF', 'LEICA', 'GPS+GLO', '2009-10-15', '2011-11-15', 2009.29, 2011.32],
                    ['ABMF', 'SEPT', 'GPS+GLO', '2011-11-15', '2015-04-28', 2011.32, 2015.12],
                    ['ABMF', 'TRIMBLE', 'GPS', '2015-04-28', '2019-04-15', 2015.12, 2019.11],
                    ['ZIMM', 'ASTECH', 'GPS', '1993-05-01', '1997-08-06', 1993.12, 1997.22],
                    ['ZIMM', 'SEPT', 'GPS', '1997-08-06', '2003-08-12', 1997.22, 2003.22],
                    ['ZIMM', 'TRIMBLE', 'GPS', '2003-08-12', '2015-04-27', 2003.22, 2015.12]],
                    columns=['Station ID','Receiver type','Satellite system','Date installed', 
                    'Date removed','DOY installed','DOY removed'])
df1.set_index(['Station ID','Receiver type'], inplace=True)

df2 = pd.DataFrame([['ABMF', 'C1P', 'C2P', 2013.09, 2013.09, -1.25, 0.15],
                    ['ABMF', 'C2W', 'C2X', 2013.10, 2013.10, -1.1, 0.1],
                    ['ABMF', 'C2C', 'C2P', 2013.14, 2013.14, -1.115, 0.123],
                    ['ABMF', 'C2W', 'C2X', 2013.22, 2013.22, -1.23, 0.12],
                    ['ABMF', 'C2W', 'C2X', 2013.42, 2013.42, -1.7, 0.124],
                    ['ZIMM', 'C2W', 'C2X', 2013.10, 2013.10, -1.21, 0.11],
                    ['ZIMM', 'C2W', 'C2X', 2013.12, 2013.12, -1.14, 0.11],
                    ['ZIMM', 'C2W', 'C2X', 2013.14, 2013.14, -1.41, 0.31]],
                    columns=['Station ID','OBS1','OBS2','Bias start','Bias end','Value','Std'])
df2.set_index('Station ID', inplace=True)

'desired output: df3'
df3 = pd.DataFrame([['ABMF', 'C1P', 'C2P', 2013.09, 2013.09, -1.25, 0.15, 'SEPT'],
                    ['ABMF', 'C2W', 'C2X', 2013.10, 2013.10, -1.1, 0.1, 'SEPT'],
                    ['ABMF', 'C2C', 'C2P', 2013.14, 2013.14, -1.115, 0.123, 'SEPT'],
                    ['ABMF', 'C2W', 'C2X', 2013.22, 2013.22, -1.23, 0.12, 'SEPT'],
                    ['ABMF', 'C2W', 'C2X', 2013.42, 2013.42, -1.7, 0.124, 'SEPT'],
                    ['ZIMM', 'C2W', 'C2X', 2013.10, 2013.10, -1.21, 0.11, 'TRIMBLE'],
                    ['ZIMM', 'C2W', 'C2X', 2013.12, 2013.12, -1.14, 0.11, 'TRIMBLE'],
                    ['ZIMM', 'C2W', 'C2X', 2013.14, 2013.14, -1.41, 0.31, 'TRIMBLE']],
                    columns=['Station ID','OBS1','OBS2','Bias start','Bias end','Value','Std', 'Receiver type'])
df3.set_index('Station ID', inplace=True)

Tags: installedidtypegpsreceiverstationbiasc2p
1条回答
网友
1楼 · 发布于 2024-06-24 13:39:10

它是一个^{}操作,使用by参数为每个StationID工作。请注意,要执行此操作,需要对使用的列进行排序。其余的都是装饰性的,以满足预期的产出

df_ = (pd.merge_asof(
              df2.reset_index().sort_values(by='Bias start'),
              df1.reset_index().sort_values(by='DOY installed'), 
              by='Station ID',
              left_on='Bias start', right_on='DOY installed', 
              direction='backward'
              )
          [['Station ID'] + df2.columns.tolist() + ['Reciever type']]
          .sort_values(by=['Station ID', 'Bias start'])
          .set_index('Station ID')
      )
print(df_)
           OBS1 OBS2  Bias start  Bias end  Value    Std Reciever type
Station ID                                                            
ABMF        C1P  C2P     2013.09   2013.09 -1.250  0.150          SEPT
ABMF        C2W  C2X     2013.10   2013.10 -1.100  0.100          SEPT
ABMF        C2C  C2P     2013.14   2013.14 -1.115  0.123          SEPT
ABMF        C2W  C2X     2013.22   2013.22 -1.230  0.120          SEPT
ABMF        C2W  C2X     2013.42   2013.42 -1.700  0.124          SEPT
ZIMM        C2W  C2X     2013.10   2013.10 -1.210  0.110       TRIMBLE
ZIMM        C2W  C2X     2013.12   2013.12 -1.140  0.110       TRIMBLE
ZIMM        C2W  C2X     2013.14   2013.14 -1.410  0.310       TRIMBLE

相关问题 更多 >