How to determine the cut-off or threshold when using Fuzzymatcher in Python

Posted 2024-09-27 22:19:03


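Please see the screenshot of my output and code. How do I use the best match score? I need to filter by the returned "precision score". That column only appears after the merge (i.e., I just want to return everything with a best match score below -1.06).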

import fuzzymatcher
import pandas as pd

# pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

REDCAP = pd.read_csv(r"C:\Users\Selamola\Desktop\PythonThings\FuzzyMatching\REDCAP Form A v1 and v2 23 Feb 211.csv")
covidSheet = pd.read_csv(r"C:\Users\Selamola\Desktop\PythonThings\FuzzyMatching\Cases missing REC ID 23 Feb 211.csv")

Data_merge = fuzzymatcher.fuzzy_left_join(covidSheet, REDCAP,
                                          left_on=['Participant Name', 'Particfipant Surname', 'Screening Date',
                                                   'Screening Date', 'Hospital Number', 'Alternative Hospital Number'],
                                          right_on=['Patient Name', 'Patient Surname', 'Date Of Admission',
                                                    'Date Of Sample Collection', 'Hospital Number', 'Hospital Number'])

# Merged_data = pd.merge(REDCAP, covidSheet, how='left',
#                        left_on=['Patient Name', 'Patient Surname'],
#                        right_on=['Participant Name', 'Particfipant Surname'])

# Data_merge.to_csv(r'C:\Users\Selamola\Desktop\PythonThings\FuzzyMatching\DataMacth.csv')

print(Data_merge)
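
The filter described in the question can be sketched with ordinary pandas boolean indexing once the merge has run. This is a minimal sketch, not a definitive answer: it assumes the merged frame exposes the score in a column named `best_match_score`, and it uses a small made-up frame in place of the real `Data_merge`, so the names and values in it are purely illustrative. The -1.06 cut-off is the one from the question; a suitable threshold for other data would need to be chosen by inspecting the scores.

```python
import pandas as pd

# Stand-in for the frame returned by fuzzymatcher.fuzzy_left_join.
# ASSUMPTION: the score column is called "best_match_score";
# all values below are invented for illustration only.
Data_merge = pd.DataFrame({
    "best_match_score": [0.42, -1.30, -1.06, float("nan"), -2.15],
    "Participant Name": ["Ann", "Ben", "Cara", "Dan", "Eve"],
})

THRESHOLD = -1.06  # cut-off taken from the question

# Rows scoring strictly below the cut-off. Rows without a match
# (NaN score) are excluded, because comparisons with NaN are False.
below = Data_merge[Data_merge["best_match_score"] < THRESHOLD]

# The complementary set: rows scoring at or above the cut-off.
at_or_above = Data_merge[Data_merge["best_match_score"] >= THRESHOLD]

print(below)
```

Sorting the merged frame by the score column first (`Data_merge.sort_values("best_match_score")`) and reading through the rows near the boundary is one way to judge whether a given cut-off separates good matches from bad ones.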

Image of WorkSpace


Tags: csv, name, import, number, date, on, merge, surname
