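<p>This solution handles the exact-match task (it compares every pair of rows, so the time complexity is very high and it is not recommended for large data):</p>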
<pre><code>#First create a dummy column of Productdetailed which is sorted
df2['dummy'] = df2['Productdetailed'].apply(sorted)
#Create Matching column which stores index of first matched list
df2['Matching'] = np.nan
#Code for finding the exact matches and assigning indices in Matching column
for index1,lst1 in enumerate(df2['dummy']):
for index2,lst2 in enumerate(df2['dummy']):
if index1<index2:
if (lst1 == lst2):
if np.isnan(df2.loc[index2,'Matching']):
df2.loc[index1,'Matching'] = index1
df2.loc[index2,'Matching'] = index1
#Finding the sum of total exact matches
print(df2['Matching'].notnull().sum())
5
#Deleting the dummy column
del df2['dummy']
#Final Dataframe
print(df2)
   ID                   Productdetailed  Matching
0   1               [Phone, Watch, Pen]       0.0
1   2            [Pencil, fork, Eraser]       1.0
2   3            [Apple, Mango, Orange]       NaN
3   4  [Something, Nothing, Everything]       NaN
4   5            [Eraser, fork, Pencil]       1.0
5   6               [Phone, Watch, Pen]       0.0
6   7                    [Apple, Mango]       NaN
7   8               [Pen, Phone, Watch]       0.0
</code></pre>
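<p>Because the nested loop above is quadratic, here is a minimal sketch of a single-pass alternative for the exact-match case. It is not the method used above: it swaps the pairwise comparison for a dictionary lookup on a sorted tuple of each list. The sample <code>df2</code> is an assumption reconstructed from the printed output, and the names <code>keys</code>, <code>counts</code> and <code>first_seen</code> are made up for illustration.</p>

```python
from collections import Counter

import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (an assumption)
df2 = pd.DataFrame({
    "ID": range(1, 9),
    "Productdetailed": [
        ["Phone", "Watch", "Pen"],
        ["Pencil", "fork", "Eraser"],
        ["Apple", "Mango", "Orange"],
        ["Something", "Nothing", "Everything"],
        ["Eraser", "fork", "Pencil"],
        ["Phone", "Watch", "Pen"],
        ["Apple", "Mango"],
        ["Pen", "Phone", "Watch"],
    ],
})

# Hashable, order-independent key for every row
keys = [tuple(sorted(items)) for items in df2["Productdetailed"]]

# How often each key occurs, and the index where it occurs first
counts = Counter(keys)
first_seen = {}
for idx, key in zip(df2.index, keys):
    first_seen.setdefault(key, idx)

# Rows whose key occurs more than once get the index of the first occurrence
df2["Matching"] = [
    float(first_seen[key]) if counts[key] > 1 else np.nan for key in keys
]

print(int(df2["Matching"].notna().sum()))  # 5
```

<p>On this sample the <code>Matching</code> column comes out the same as with the nested loop, but each row is visited a constant number of times. It only covers exact matches, since partial matches still need some form of pairwise comparison.</p>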
<hr/>
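<p>For both exact and partial matches, use the following (two rows count as a partial match when at least 2 values match; this threshold can be changed):</p>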
<pre><code>import numpy as np

# df2 is assumed to already exist with a default 0..n-1 index
# Sorted copy of each list, so element order does not affect the comparison
df2['dummy'] = df2['Productdetailed'].apply(sorted)
# Matching column stores the index of the first matched row
df2['Matching'] = np.nan
# Status column describes the kind of match that was found
df2['Status'] = 'No Match'
# Compare every pair of rows and record exact and partial matches
for index1, lst1 in enumerate(df2['dummy']):
    for index2, lst2 in enumerate(df2['dummy']):
        if index1 < index2:
            if lst1 == lst2:
                if np.isnan(df2.loc[index2, 'Matching']):
                    df2.loc[index1, 'Matching'] = index1
                    df2.loc[index2, 'Matching'] = index1
                    df2.loc[[index1, index2], 'Status'] = 'Fully Matched'
            else:
                # Number of equal value pairs between the two lists
                count = sum(1 for v1 in lst1 for v2 in lst2 if v1 == v2)
                if count >= 2:
                    if np.isnan(df2.loc[index2, 'Matching']):
                        df2.loc[index1, 'Matching'] = index1
                        df2.loc[index2, 'Matching'] = index1
                        df2.loc[[index1, index2], 'Status'] = 'Partially Matched'
# Count the rows that have any match (exact or partial)
print(df2['Matching'].notnull().sum())
7
#Deleting the dummy column
del df2['dummy']
#Final Dataframe
print(df2)
</code></pre>
<hr/>
<pre><code>   ID                   Productdetailed  Matching             Status
0   1               [Phone, Watch, Pen]       0.0      Fully Matched
1   2            [Pencil, fork, Eraser]       1.0      Fully Matched
2   3            [Apple, Mango, Orange]       2.0  Partially Matched
3   4  [Something, Nothing, Everything]       NaN           No Match
4   5            [Eraser, fork, Pencil]       1.0      Fully Matched
5   6               [Phone, Watch, Pen]       0.0      Fully Matched
6   7                    [Apple, Mango]       2.0  Partially Matched
7   8               [Pen, Phone, Watch]       0.0      Fully Matched
</code></pre>
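<p>The pairwise test inside the loop can also be pulled out into a small helper, which makes the threshold easier to change. The sketch below is one possible way to write it, not part of the solution above; <code>shared_items</code>, <code>match_status</code> and the <code>min_shared</code> parameter are hypothetical names. Note one difference: it counts distinct shared values with sets, whereas the loop above counts every equal pair, so the two only agree when the lists contain no repeated values.</p>

```python
def shared_items(first, second):
    """Return how many distinct values appear in both lists."""
    return len(set(first).intersection(second))


def match_status(first, second, min_shared=2):
    """Classify two lists with the same labels as the Status column."""
    # Same items in any order counts as a full match
    if sorted(first) == sorted(second):
        return "Fully Matched"
    # Otherwise require at least min_shared distinct values in common
    if shared_items(first, second) >= min_shared:
        return "Partially Matched"
    return "No Match"


print(match_status(["Phone", "Watch", "Pen"], ["Pen", "Phone", "Watch"]))
print(match_status(["Apple", "Mango", "Orange"], ["Apple", "Mango"]))
print(match_status(["Apple", "Mango"], ["Phone", "Watch", "Pen"]))
```

<p>The three calls print <code>Fully Matched</code>, <code>Partially Matched</code> and <code>No Match</code> respectively. Passing a different <code>min_shared</code> changes how many common values a partial match requires.</p>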