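I have been struggling with this problem for the past two weeks, and I am almost there.
Case: overall depiction of what I am trying to do
Example: suppose I have cell X1 and I match it against every cell in Y (1, 2, 3); X1 matches Y3 best.
Update:
This code can compare values with SequenceMatcher and print a match, but I only get a single match as output instead of a list of the best matches: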
import pandas as pd
from difflib import SequenceMatcher

data1 = {'Fruit': ['Apple', 'Pear', 'mango', 'Pinapple'],
         'nr1': [22000, 25000, 27000, 35000],
         'nr2': [1, 2, 3, 4]}
data2 = {'Fruit': ['Apple', 'Pear', 'mango', 'Pinapple'],
         'nr1': [22000, 25000, 27000, 35000],
         'nr2': [1, 2, 3, 4]}
df1 = pd.DataFrame(data1, columns=['Fruit', 'nr1', 'nr2'])
df2 = pd.DataFrame(data2, columns=['nr1', 'Fruit', 'nr2'])

# Single out specific columns to match
col1 = df1.iloc[:, [0]]
col2 = df2.iloc[:, [1]]

# Function to measure the similarity of two values
def similar(a, b):
    ratio = SequenceMatcher(None, a, b).ratio()
    matches = a, b
    return ratio, matches

for i in col1:
    print(max(similar(i, j) for j in col2))
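Output: (1.0, ('Fruit', 'Fruit'))
How can I fix this so that it gives me all of the best matches, and how can I extract the corresponding rows where the matches occur?
This should work: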
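The single `('Fruit', 'Fruit')` result appears because iterating over a DataFrame yields its column labels, not its cell values, so the loop compares the header `'Fruit'` with itself. The following is a minimal sketch of one way around this, not necessarily the approach the answer had in mind: it iterates over the values themselves and records the position of the best match for each one. The helper name `best_matches` and the plain lists `fruits1` and `fruits2` are assumptions made for illustration; only the standard library is used so the block stands on its own.

```python
from difflib import SequenceMatcher


def similar(a, b):
    """Return the SequenceMatcher similarity ratio (0.0 to 1.0) of two strings."""
    return SequenceMatcher(None, a, b).ratio()


def best_matches(left, right):
    """For each value in `left`, find the most similar value in `right`.

    Hypothetical helper: returns one (left_pos, right_pos, ratio) tuple
    per value in `left`, where the positions are 0-based.
    """
    right = list(right)
    results = []
    for i, a in enumerate(left):
        # Position of the candidate with the highest ratio
        # (on a tie, the first such candidate wins).
        j = max(range(len(right)), key=lambda k: similar(a, right[k]))
        results.append((i, j, similar(a, right[j])))
    return results


# Values taken from the 'Fruit' columns in the question.
fruits1 = ['Apple', 'Pear', 'mango', 'Pinapple']
fruits2 = ['Apple', 'Pear', 'mango', 'Pinapple']

matches = best_matches(fruits1, fruits2)
for i, j, ratio in matches:
    print(fruits1[i], '->', fruits2[j], round(ratio, 2))
```

With the DataFrames from the question, passing `df1['Fruit']` and `df2['Fruit']` instead of the lists should behave the same way, since iterating a Series yields its values. Each returned pair of positions can then be used to pull the full rows with `df1.iloc[i]` and `df2.iloc[j]`.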