回答此问题可获得 20 贡献值,回答如果被采纳可获得 50 分。
<p>我有这两本字典</p>
<pre><code>dico = {'Name': ['Arthur','Henri','Lisiane','Patrice','Zadig','Sacha'],
"Age": ["20","18","62","73",'21','20'],
"Studies": ['Economics','Maths','Psychology','Medical','Cinema','CS']
}
dico2 = {'Surname': ['Arthur1','Henri2','Lisiane3','Patrice4']}
dico = pd.DataFrame.from_dict(dico)
dico2 = pd.DataFrame.from_dict(dico2)
</code></pre>
<p>在其中,我想匹配,然后将姓氏列附加到Name列,最后将其附加到dico,以获得以下输出:</p>
<pre><code> Name Surname Age Studies
0 Arthur Arthur1 20 Economics
1 Henri Henri2 18 Maths
2 Lisiane Lisiane3 62 Psychology
3 Patrice Nan 73 Medical
4 Zadig Nan 21 Cinema
5 Sacha Nan 20 CS
</code></pre>
<p>并最终删除姓氏为<code>Nan</code>的行</p>
<pre><code> Name Surname Age Studies
0 Arthur Arthur1 20 Economics
1 Henri Henri2 18 Maths
2 Lisiane Lisiane3 62 Psychology
</code></pre>
<pre><code>map_list = []
for name in dico['Name']:
best_ratio = None
for idx, surname in enumerate(dico2['Surname']):
if best_ratio == None:
best_ratio = fuzz.ratio(name, surname)
best_idx = 0
else:
ratio = fuzz.ratio(name, surname)
if ratio > best_ratio:
best_ratio = ratio
best_idx = idx
map_list.append(dico2['Surname'][best_idx]) # obtain surname
dico['Surname'] = pd.Series(map_list) # add column
dico = dico[["Name", "Surname", "Age", "Studies"]] # reorder columns
#if the surname is not a great match, print "Nan"
dico = dico.drop(dico[dico.Surname == "NaN"].index)
</code></pre>
<p>但是当I<code>print(dico)</code>时,输出如下:</p>
<pre><code> Name Surname Age Studies
0 Arthur Arthur1 20 Economics
1 Henri Henri2 18 Maths
2 Lisiane Lisiane3 62 Psychology
3 Patrice Patrice4 73 Medical
4 Zadig Patrice4 21 Cinema
5 Sacha Patrice4 20 CS
</code></pre>
<p>我不明白为什么在Patrice争吵之后会出现不匹配,而我希望它是“Nan”</p>