<p>This approach worked once I made sure all columns were lowercase (for good measure, I also removed hyphens and brackets):</p>
<pre><code>print("All lowercase")
data = data.apply(lambda x: x.astype(str).str.lower())
categories = categories.apply(lambda x: x.astype(str).str.lower())
print("Remove double spacing")
data = data.replace(r'\s+', ' ', regex=True)  # raw string avoids an invalid-escape warning
print('Remove hyphens')
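# regex=False treats '-', '(' and ')' as literal characters rather than regex syntax
data["RepairName"] = data["RepairName"].str.replace('-', '', regex=False)
print('Remove brackets')
data["RepairName"] = data["RepairName"].str.replace('(', '', regex=False)
data["RepairName"] = data["RepairName"].str.replace(')', '', regex=False)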
# For each repair name, take the first category whose keyword occurs in it, else None
data['Category'] = [
    next((c for c, k in categories.values if k in s), None)
    for s in data['RepairName']
]
</code></pre>
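<p>For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The sample rows are made up, the <code>normalize</code> helper is a hypothetical name, and it assumes <code>categories</code> has exactly two columns in the order (category, keyword), which is what the unpacking in the list comprehension relies on. It also folds the hyphen and bracket removal into one regex character class, which is a small variation on the code above rather than the exact same calls.</p>

```python
import pandas as pd

# Hypothetical sample data; only the "RepairName" column name comes from the answer.
data = pd.DataFrame(
    {"RepairName": ["Brake-Pad  Replacement", "Oil Change (Full)", "Wiper Blades"]}
)
# Assumed layout: first column is the category, second is the keyword to look for.
categories = pd.DataFrame(
    {"Category": ["Brakes", "Engine"], "Keyword": ["Brakepad", "oil change"]}
)


def normalize(series):
    """Lowercase, drop hyphens and brackets, then collapse repeated whitespace."""
    return (
        series.astype(str)
        .str.lower()
        .str.replace(r"[-()]", "", regex=True)
        .str.replace(r"\s+", " ", regex=True)
        .str.strip()
    )


data["RepairName"] = normalize(data["RepairName"])
# Lowercase every column of the lookup table so the substring test is case-insensitive.
categories = categories.apply(lambda col: col.astype(str).str.lower())

# First category whose keyword is a substring of the repair name, else None.
data["Category"] = [
    next((c for c, k in categories.values if k in s), None)
    for s in data["RepairName"]
]
print(data)
```

<p>Note that the category labels come out lowercased too, because the whole <code>categories</code> frame is lowercased, just as in the code above. If the original casing matters, lowercasing only the keyword column should be enough.</p>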