Applying the Levenshtein algorithm to a column in Excel

Posted 2024-10-17 06:16:50


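I am trying to use the Levenshtein algorithm to find similarity between records. I have the columns Item#, Description, LookUp, Similarity(%), and ReturnSimilarItem. The Description column holds a handful of different descriptions, and the LookUp column holds values that are similar to values in the Description column. Using a Levenshtein function, I want to compute the similarity and, if it is over 90%, return the Item# value that belongs to the similar description. Please see the picture for a better illustration. [image] Below is the Levenshtein code I am using: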

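Function Levenshtein3(ByVal string1 As String, ByVal string2 As String) As Long

    Dim i As Long, j As Long, string1_length As Long, string2_length As Long
    Dim distance() As Long, smStr1() As Long, smStr2() As Long
    Dim min1 As Long, min2 As Long, min3 As Long, minmin As Long, MaxL As Long

    string1_length = Len(string1): string2_length = Len(string2)

    ' Two empty strings are identical; returning here also avoids dividing by zero below.
    If string1_length = 0 And string2_length = 0 Then
        Levenshtein3 = 100
        Exit Function
    End If

    ' Size the arrays to the inputs so strings longer than 60/50 characters do not overflow.
    ReDim distance(0 To string1_length, 0 To string2_length)
    If string1_length > 0 Then ReDim smStr1(1 To string1_length)
    If string2_length > 0 Then ReDim smStr2(1 To string2_length)

    ' First column and first row hold the distance from the empty string.
    ' Character codes are cached in lower case, so the comparison ignores case.
    distance(0, 0) = 0
    For i = 1 To string1_length
        distance(i, 0) = i
        smStr1(i) = Asc(LCase$(Mid$(string1, i, 1)))
    Next i
    For j = 1 To string2_length
        distance(0, j) = j
        smStr2(j) = Asc(LCase$(Mid$(string2, j, 1)))
    Next j

    For i = 1 To string1_length
        For j = 1 To string2_length
            If smStr1(i) = smStr2(j) Then
                ' Same character: no extra edit needed.
                distance(i, j) = distance(i - 1, j - 1)
            Else
                min1 = distance(i - 1, j) + 1       ' deletion
                min2 = distance(i, j - 1) + 1       ' insertion
                min3 = distance(i - 1, j - 1) + 1   ' substitution
                If min2 < min1 Then
                    If min2 < min3 Then minmin = min2 Else minmin = min3
                Else
                    If min1 < min3 Then minmin = min1 Else minmin = min3
                End If
                distance(i, j) = minmin
            End If
        Next j
    Next i

    ' Levenshtein3 returns a percent match (100 = exact) based on the
    ' edit distance relative to the length of the longer string.
    MaxL = string1_length: If string2_length > MaxL Then MaxL = string2_length
    Levenshtein3 = 100 - CLng((distance(string1_length, string2_length) * 100) / MaxL)

End Function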

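How can I take the lookup value in C3, scan the whole of column B, and, as soon as a description is found that is more than 90% similar, return the item attached to that description?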

Would this be easy to do in Python or R?
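For the Python side, a minimal sketch of the lookup could look like the following. It assumes the sheet has already been loaded as (item, description) pairs; the names `levenshtein`, `similarity`, and `find_similar_item`, and the sample rows, are hypothetical and not taken from the workbook. The percentage is meant to mirror what Levenshtein3 computes: 100 minus the edit distance scaled by the length of the longer string.

```python
from typing import Iterable, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, compared case-insensitively."""
    a, b = a.lower(), b.lower()
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> int:
    """Percent match (100 = identical), scaled by the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100  # two empty strings are treated as identical
    return 100 - round(levenshtein(a, b) * 100 / longest)


def find_similar_item(lookup: str,
                      rows: Iterable[Tuple[str, str]],
                      threshold: int = 90) -> Optional[str]:
    """Return the item of the first description more than `threshold` % similar."""
    for item, description in rows:
        if similarity(lookup, description) > threshold:
            return item
    return None


# Hypothetical sample data standing in for columns A (Item#) and B (Description).
rows = [
    ("A-100", "Stainless steel bolt M8x40"),
    ("B-200", "Rubber gasket 20mm"),
]
print(find_similar_item("Stainless steel bolt M8x45", rows))
```

The function stops at the first description above the threshold, which matches the "as soon as it finds" wording of the question; returning the best match instead would mean tracking the highest score across all rows. The pairs could be read from the worksheet with something like pandas `read_excel`, which is not shown here.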

