Python pandas: replace multiple values that match multiple columns in another DataFrame

chr  snp          x  pos    a1  a2
1    rs376643643  0  10040  G   A
1    rs373328635  0  10066  C   G
1    rs62651026   0  10208  C   G
1    rs376007522  0  10209  C   G
3    rs368469931  0  30247  C   T

df1.loc[(df1.chr==df2.OCHR) & (df1.pos==df2.OSTOP),["snp", "chr", "pos"]] = df2.loc[df2[['OCHR', 'OSTOP']] == df1.loc[(df1.chr==df2.OCHR) & (df1.pos==df2.OSTOP),["chr", "pos"]],['ID', 'CHR', 'STOP']].values

2 Answers

User

Answer 1 · edited 2024-10-01 13:27:32

First, rename the columns in df2 that you want to merge on:

df2.rename(columns={'OCHR':'chr','OSTOP':'pos'},inplace=True)

Now merge on those columns:

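# assumption: the original snippet was lost in extraction; an inner merge on the renamed key columns is the likely step
df_merged = pd.merge(df1, df2, how='inner', on=['chr', 'pos'])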

Next, build the frame that holds the replacement values:

df1_updater = df_merged[['ID','CHR','STOP']]  # this will be your update frame
df1_updater = df1_updater.rename(columns={'ID':'snp','CHR':'chr','STOP':'pos'})  # rename columns to match the original

Finally, update (see the bottom of this link):

df1.update( df1_updater) #updates in place
#  chr          snp  x    pos a1 a2
#0   1  rs376643643  0  10040  G  A
#1   1  rs373328635  0  10066  C  G
#2   1   rs62651026  0  10208  C  G
#3   1  rs376007522  0  10209  C  G
#4   3  rs368469931  0  30247  C  T

update works by matching on index and columns, so you may need to carry df1's index along through the whole process and then call df1_updater.reindex(... before df1.update(df1_updater).
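The steps in this answer can be put together as one runnable sketch. The sample frames below are modeled on the data shown in this thread, and the way df1's index is carried through the merge (`reset_index()` followed by `set_index("index")`) is one possible approach and an assumption, not necessarily what the answer's author had in mind.

```python
import pandas as pd

# Sample frames modeled on the data in this thread (illustrative only).
df1 = pd.DataFrame({
    "chr": [1, 1, 1, 1, 1],
    "snp": ["1-10020", "1-10056", "1-10108", "1-10109", "1-10139"],
    "x":   [0, 0, 0, 0, 0],
    "pos": [10010, 10056, 10108, 10109, 10139],
    "a1":  ["G", "C", "C", "C", "C"],
    "a2":  ["A", "G", "G", "G", "T"],
})
df2 = pd.DataFrame({
    "ID":    ["rs376643643", "rs373328635", "rs62651026",
              "rs376007522", "rs368469931"],
    "CHR":   [1, 1, 1, 1, 3],
    "STOP":  [10040, 10066, 10208, 10209, 30247],
    "OCHR":  [1, 1, 1, 1, 1],
    "OSTOP": [10020, 10056, 10108, 10109, 10139],
})

# Rename the key columns on a copy so df2 itself is left unchanged.
keys = df2.rename(columns={"OCHR": "chr", "OSTOP": "pos"})

# reset_index() stores df1's index in a column named "index",
# so each merged row still knows which df1 row it came from.
df_merged = df1.reset_index().merge(keys, how="inner", on=["chr", "pos"])

# Restore df1's index labels and rename the new values to df1's column names.
df1_updater = (
    df_merged.set_index("index")[["ID", "CHR", "STOP"]]
    .rename(columns={"ID": "snp", "CHR": "chr", "STOP": "pos"})
)

# update() aligns on index and columns and modifies df1 in place.
df1.update(df1_updater)
print(df1)
```

Row 0 has no matching key pair in df2, so it is left as it was, while the other rows take their snp, chr and pos from df2. In some pandas versions update may upcast the integer columns to float, so check the dtypes afterwards if that matters.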

User

Answer 2 · edited 2024-10-01 13:27:32

You can use the update function (the matching keys need to be set as the index). I modified your sample data to allow for some mismatches.

# your data
# =====================
# df1 pos is modified from 10020 to 10010
print(df1)

   chr      snp  x    pos a1 a2
0    1  1-10020  0  10010  G  A
1    1  1-10056  0  10056  C  G
2    1  1-10108  0  10108  C  G
3    1  1-10109  0  10109  C  G
4    1  1-10139  0  10139  C  T

print(df2)

            ID  CHR   STOP  OCHR  OSTOP
0  rs376643643    1  10040     1  10020
1  rs373328635    1  10066     1  10056
2   rs62651026    1  10208     1  10108
3  rs376007522    1  10209     1  10109
4  rs368469931    3  30247     1  10139

# processing
# ==========================
# set matching columns to multi-level index
x1 = df1.set_index(['chr', 'pos'])['snp']
x2 = df2.set_index(['OCHR', 'OSTOP'])['ID']
# call update function, this is inplace
x1.update(x2)
# replace the values in original df1
df1['snp'] = x1.values
print(df1)

   chr          snp  x    pos a1 a2
0    1      1-10020  0  10010  G  A
1    1  rs373328635  0  10056  C  G
2    1   rs62651026  0  10108  C  G
3    1  rs376007522  0  10109  C  G
4    1  rs368469931  0  10139  C  T
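As an alternative to the index-based update above, the same replacement of snp can be sketched with a left merge. This is a swapped-in technique, not the answer's own method, and it assumes each (OCHR, OSTOP) pair occurs at most once in df2; `validate="many_to_one"` makes the merge raise an error if that assumption does not hold. The name `lookup` is introduced here for illustration.

```python
import pandas as pd

# Same modified sample data as in the answer above.
df1 = pd.DataFrame({
    "chr": [1, 1, 1, 1, 1],
    "snp": ["1-10020", "1-10056", "1-10108", "1-10109", "1-10139"],
    "x":   [0, 0, 0, 0, 0],
    "pos": [10010, 10056, 10108, 10109, 10139],
    "a1":  ["G", "C", "C", "C", "C"],
    "a2":  ["A", "G", "G", "G", "T"],
})
df2 = pd.DataFrame({
    "ID":    ["rs376643643", "rs373328635", "rs62651026",
              "rs376007522", "rs368469931"],
    "CHR":   [1, 1, 1, 1, 3],
    "STOP":  [10040, 10066, 10208, 10209, 30247],
    "OCHR":  [1, 1, 1, 1, 1],
    "OSTOP": [10020, 10056, 10108, 10109, 10139],
})

# Keep only the key columns and the replacement value, renamed to df1's keys.
lookup = df2.rename(columns={"OCHR": "chr", "OSTOP": "pos"})[["chr", "pos", "ID"]]

# A left merge keeps every df1 row in its original order;
# rows without a match get NaN in the ID column.
merged = df1.merge(lookup, how="left", on=["chr", "pos"], validate="many_to_one")

# Use the new ID where one was found, otherwise keep the existing snp.
df1["snp"] = merged["ID"].fillna(merged["snp"]).to_numpy()
print(df1)
```

Because the merge result has one row per df1 row in the same order, the values can be assigned back positionally, which is the same idea as `df1['snp'] = x1.values` in the answer.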
