Find some values in a column

Posted 2024-09-29 02:23:19

I have excel1:

member_id   panel_ank_id    panel_mm_id
20759   14bc1a5dee9ccb37d120e118f84def7c    32e5e5874b5f8ef06d653c3bb8a28483
33853   91d8723b691a7297984ff1621ca6ee59    b23f6b2511edc3688a3da861ca9cd209
36554   0                                   fb4dcaaffa9e6ae0d01cae8aebc3c189
38639   683476470d39b0644a9bb4936a14fcd1    db69040be32b7a53fa884c6d8ff689fa
85992   245c2ee8839c274ec1b536ce6afe5ec8    9be78f6429882309862731c834202991

I have excel2:

00102b98bd9e71da3cf23fd1f599408d
00108f5c5de701ac4386e717a4d07d5b
0012ea90a6deb4eeb2924fb13e844136
001342afb153e2775649dc5ae0460605
00443c1fed7a99ac7a33a670af5a20c1

I want to check whether each value from excel2 appears in excel1 and, if it does, print the corresponding member_id.


1 answer
User
#1 · Posted 2024-09-29 02:23:19

IIUC, you need merge with an inner join on column panel_mm_id (how='inner' can be omitted because it is the default):

df2.columns = ['panel_mm_id']

df = (pd.merge(df1, df2, on='panel_mm_id'))
print (df)
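
To illustrate the point about the default, here is a minimal sketch on two tiny hypothetical frames (`left` and `right` are made-up names and values, not the data from the question). It checks that omitting `how` gives the same result as passing `how='inner'`:

```python
import pandas as pd

# Hypothetical miniature frames, used only to illustrate the default join type
left = pd.DataFrame({'member_id': [1, 2, 3], 'panel_mm_id': ['a', 'b', 'c']})
right = pd.DataFrame({'panel_mm_id': ['b', 'c', 'd']})

# how is omitted here, so pandas falls back to its default
implicit = pd.merge(left, right, on='panel_mm_id')
# the same merge with the join type spelled out
explicit = pd.merge(left, right, on='panel_mm_id', how='inner')

print(implicit.equals(explicit))  # True
print(implicit)
```

Only rows whose panel_mm_id occurs in both frames survive, which is what the question asks for.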

Sample (a value in column panel_mm_id of df1 was changed so that there is a match):

import pandas as pd

df1 = pd.DataFrame({'panel_mm_id': {0: '00102b98bd9e71da3cf23fd1f599408d', 1: 'b23f6b2511edc3688a3da861ca9cd209', 2: 'fb4dcaaffa9e6ae0d01cae8aebc3c189', 3: 'db69040be32b7a53fa884c6d8ff689fa', 4: '9be78f6429882309862731c834202991'}, 'member_id': {0: 20759, 1: 33853, 2: 36554, 3: 38639, 4: 85992}, 'panel_ank_id': {0: '14bc1a5dee9ccb37d120e118f84def7c', 1: '91d8723b691a7297984ff1621ca6ee59', 2: '0', 3: '683476470d39b0644a9bb4936a14fcd1', 4: '245c2ee8839c274ec1b536ce6afe5ec8'}})
df2 = pd.DataFrame({0: {0: '00102b98bd9e71da3cf23fd1f599408d', 1: '00108f5c5de701ac4386e717a4d07d5b', 2: '0012ea90a6deb4eeb2924fb13e844136', 3: '001342afb153e2775649dc5ae0460605', 4: '00443c1fed7a99ac7a33a670af5a20c1'}})  
print (df1)
   member_id                      panel_ank_id  \
0      20759  14bc1a5dee9ccb37d120e118f84def7c   
1      33853  91d8723b691a7297984ff1621ca6ee59   
2      36554                                 0   
3      38639  683476470d39b0644a9bb4936a14fcd1   
4      85992  245c2ee8839c274ec1b536ce6afe5ec8   

                        panel_mm_id  
0  00102b98bd9e71da3cf23fd1f599408d  
1  b23f6b2511edc3688a3da861ca9cd209  
2  fb4dcaaffa9e6ae0d01cae8aebc3c189  
3  db69040be32b7a53fa884c6d8ff689fa  
4  9be78f6429882309862731c834202991  

print (df2)
                                  0
0  00102b98bd9e71da3cf23fd1f599408d
1  00108f5c5de701ac4386e717a4d07d5b
2  0012ea90a6deb4eeb2924fb13e844136
3  001342afb153e2775649dc5ae0460605
4  00443c1fed7a99ac7a33a670af5a20c1

df2.columns = ['panel_mm_id']

df = (pd.merge(df1, df2, on='panel_mm_id'))
print (df)
   member_id                      panel_ank_id  \
0      20759  14bc1a5dee9ccb37d120e118f84def7c   

                        panel_mm_id  
0  00102b98bd9e71da3cf23fd1f599408d  
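
If only the member_id values are wanted, a boolean mask built with `Series.isin` may be a simpler alternative to the merge. This is a sketch under assumptions, not part of the original answer: the frames below are a trimmed copy of the sample data, and `mask` and `matched_ids` are illustrative names.

```python
import pandas as pd

# Trimmed copy of the sample data (first three rows of df1, first two of df2)
df1 = pd.DataFrame({
    'member_id': [20759, 33853, 36554],
    'panel_ank_id': ['14bc1a5dee9ccb37d120e118f84def7c',
                     '91d8723b691a7297984ff1621ca6ee59',
                     '0'],
    'panel_mm_id': ['00102b98bd9e71da3cf23fd1f599408d',
                    'b23f6b2511edc3688a3da861ca9cd209',
                    'fb4dcaaffa9e6ae0d01cae8aebc3c189'],
})
df2 = pd.DataFrame({0: ['00102b98bd9e71da3cf23fd1f599408d',
                        '00108f5c5de701ac4386e717a4d07d5b']})

# True for every row of df1 whose panel_mm_id occurs anywhere in df2
mask = df1['panel_mm_id'].isin(df2[0])

# Keep only the member_id of the matching rows
matched_ids = df1.loc[mask, 'member_id'].tolist()
print(matched_ids)  # [20759]
```

One difference from the merge: if df2 contained the same value twice, merge would return the matching df1 row twice, while the isin mask keeps each df1 row at most once.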

If you need to compare against both columns panel_mm_id and panel_ank_id, and df1 has only 3 columns, use melt:

df2.columns = ['a']

df1 = pd.melt(df1, id_vars='member_id', value_name='a').drop('variable', axis=1)
print (df1)
   member_id                                 a
0      20759  14bc1a5dee9ccb37d120e118f84def7c
1      33853  91d8723b691a7297984ff1621ca6ee59
2      36554                                 0
3      38639  683476470d39b0644a9bb4936a14fcd1
4      85992  245c2ee8839c274ec1b536ce6afe5ec8
5      20759  00102b98bd9e71da3cf23fd1f599408d
6      33853  b23f6b2511edc3688a3da861ca9cd209
7      36554  fb4dcaaffa9e6ae0d01cae8aebc3c189
8      38639  db69040be32b7a53fa884c6d8ff689fa
9      85992  9be78f6429882309862731c834202991

df = (pd.merge(df1, df2, on='a'))
print (df)
   member_id                                 a
0      20759  00102b98bd9e71da3cf23fd1f599408d
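
A variation on the melt approach, sketched under assumptions: keeping the column that melt generates (instead of dropping it) shows which of the two original columns each match came from. The name `source_column` is illustrative, the data is a trimmed copy of the sample, and the second value in df2 was added hypothetically so that panel_ank_id also produces a match.

```python
import pandas as pd

# Trimmed copy of the sample df1
df1 = pd.DataFrame({
    'member_id': [20759, 33853, 36554],
    'panel_ank_id': ['14bc1a5dee9ccb37d120e118f84def7c',
                     '91d8723b691a7297984ff1621ca6ee59',
                     '0'],
    'panel_mm_id': ['00102b98bd9e71da3cf23fd1f599408d',
                    'b23f6b2511edc3688a3da861ca9cd209',
                    'fb4dcaaffa9e6ae0d01cae8aebc3c189'],
})
# The second value is a hypothetical addition so that panel_ank_id matches too
df2 = pd.DataFrame({'a': ['00102b98bd9e71da3cf23fd1f599408d',
                          '91d8723b691a7297984ff1621ca6ee59']})

# Reshape to long form; var_name records the original column of each value
long_df = pd.melt(df1, id_vars='member_id',
                  var_name='source_column', value_name='a')

# Inner join keeps only the values that also occur in df2
result = pd.merge(long_df, df2, on='a')
print(result)
```

Each row of `result` then carries the member_id, the matched value, and the name of the column in which the value was found.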
