<p>In addition to the comment in which Achille Huet links <a href="https://stackoverflow.com/questions/517923/what-is-the-best-way-to-remove-accents-normalize-in-a-python-unicode-string">this question</a>, you can also use the following on a pandas DataFrame column:</p>
<pre><code>import unidecode
df['col A'] = df['col A'].apply(lambda x: unidecode.unidecode(x))
</code></pre>
<p>or, to cover every column:</p>
<pre><code>import unidecode
# Assumes every column holds strings; unidecode raises on non-string values
for col in df.columns:
    df[col] = df[col].apply(lambda x: unidecode.unidecode(x))
</code></pre>
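<p>However, since you have already built a dictionary of special characters, you can use it directly:</p>
<p>Define the dictionary <code>special_chars</code> and call <code>replace</code> on the entire DataFrame with <code>regex=True</code>, so that the keys are matched inside each string rather than against whole cell values. This should also be faster. I don't know whether there is an even faster solution using unicode. It also depends on what you are doing with the data. For example, if you are writing to a .csv file, I believe <code>to_csv()</code> has a parameter for this as well, but I'm not sure whether that is relevant:</p>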
<pre><code>special_chars = {"ä":"a","ç":"c","è":"e","º":"","Ã":"A","Í":"I","í":"i","Ü":"U","â":"a","ò":"o","¿":"",
"ó":"o","á":"a","à":"a","õ":"o","¡":"","Ó":"O","ù":"u","Ú":"U","´":"","Ñ":"N",
"Ò":"O","ï":"i","Ï":"I","Ç":"C","À":"A","É":"E","ë":"e","Á":"A","ã":"a","Ö":"O",
"ú":"u","ñ":"n","é":"e","ê":"e","·":"-","ª":"a","°":"","ü":"u","ô":"o"}
# replace() returns a new DataFrame, so assign the result back
df = df.replace(special_chars, regex=True)
</code></pre>
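<p>As a rough sketch of the two ideas above without any third-party dependency: the snippet below applies a character mapping with <code>str.translate</code> and, separately, strips accents with the standard-library <code>unicodedata</code> module (one of the approaches discussed in the linked question). The helper names <code>clean_with_table</code> and <code>strip_accents</code> and the shortened mapping are illustrative assumptions, not part of the original answer, and I have not benchmarked either against <code>replace</code>.</p>

```python
import unicodedata

# Hypothetical subset of the special_chars mapping above, for illustration only.
special_chars = {"ä": "a", "ç": "c", "ñ": "n", "¿": "", "·": "-"}

# str.maketrans accepts a dict whose keys are single characters.
table = str.maketrans(special_chars)


def clean_with_table(text):
    """Replace characters using the explicit mapping; non-strings pass through."""
    return text.translate(table) if isinstance(text, str) else text


def strip_accents(text):
    """Drop combining marks after NFKD decomposition; non-strings pass through."""
    if not isinstance(text, str):
        return text
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

<p>Assuming a DataFrame <code>df</code> like the one in the question, either helper could then be applied to a column with something like <code>df['col A'].map(clean_with_table)</code>. Note that <code>strip_accents</code> only removes combining marks, so symbols such as <code>¿</code> or <code>·</code> would still need the explicit mapping.</p>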