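<p>If you open the file in a native text editor and it looks fine, the problem is probably that your other program is not detecting the encoding correctly and is <a href="https://en.wikipedia.org/wiki/Mojibake" rel="nofollow noreferrer">mojibaking</a> the text. As mentioned in the comments, the culprit is almost certainly a <a href="https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html" rel="nofollow noreferrer">Unicode quote character</a> that looks like a <code>'</code> but isn't:</p>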
<pre><code>my_string = ('The Knights who say '
             '\N{LEFT SINGLE QUOTATION MARK}'
             'Ni!'
             '\N{RIGHT SINGLE QUOTATION MARK}'
             )

def print_repr_escaped(x):
    # Show non-ASCII characters as \uXXXX escapes so they are visible
    print(repr(x.encode('unicode_escape').decode('ascii')))

print_repr_escaped(my_string)
# 'The Knights who say \\u2018Ni!\\u2019'
</code></pre>
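<p>To see which characters are actually causing trouble, something like the following may help. It is only a minimal sketch: <code>find_non_ascii</code> is a hypothetical helper name, and <code>cp1252</code> is just an assumed example of a codec that another program might wrongly guess when the bytes are really UTF-8.</p>

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, 'UNKNOWN'))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


sample = 'The Knights who say \u2018Ni!\u2019'
for index, char, name in find_non_ascii(sample):
    print(index, repr(char), name)

# Mojibake appears when UTF-8 bytes are decoded with the wrong codec.
# cp1252 is an assumption here, not necessarily what your program uses.
garbled = sample.encode('utf-8').decode('cp1252')
print(repr(garbled))
```

<p>If the garbled output resembles what your other program displays, that suggests an encoding mismatch rather than a corrupted file.</p>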
<p>If you have no control over the encoding used by the other program, you have two options:</p>
<ol>
<li><p>Strip all non-ASCII characters <a href="https://stackoverflow.com/questions/15321138/removing-unicode-u2026-like-characters-in-a-string-in-python2-7">like so</a>:</p>
<pre><code>stripped = my_string.encode('ascii', 'ignore').decode('ascii')
print_repr_escaped(stripped)
# 'The Knights who say Ni!'
</code></pre></li>
<li><p>Try to convert the Unicode characters to their closest ASCII equivalents using something like <a href="https://pypi.python.org/pypi/Unidecode" rel="nofollow noreferrer">Unidecode</a>:</p>
<pre><code>import unidecode
converted = unidecode.unidecode(my_string)
print_repr_escaped(converted)
# "The Knights who say 'Ni!'"
</code></pre></li>
</ol>
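<p>If installing a third-party package is not an option and only a handful of characters are involved, a middle ground between the two options might be a hand-written translation table. This is a rough sketch using only the standard library; <code>QUOTE_MAP</code> and <code>normalize_quotes</code> are hypothetical names, and the mapping covers only the four most common typographic quotes, so extend it to suit your data.</p>

```python
# Hypothetical mapping: add whatever characters actually show up in your data.
QUOTE_MAP = str.maketrans({
    '\u2018': "'",  # LEFT SINGLE QUOTATION MARK
    '\u2019': "'",  # RIGHT SINGLE QUOTATION MARK
    '\u201c': '"',  # LEFT DOUBLE QUOTATION MARK
    '\u201d': '"',  # RIGHT DOUBLE QUOTATION MARK
})


def normalize_quotes(text):
    """Replace typographic quotes with their ASCII look-alikes."""
    return text.translate(QUOTE_MAP)


fixed = normalize_quotes('The Knights who say \u2018Ni!\u2019')
print(fixed)
```

<p>Unlike stripping, this keeps the quotes in place; unlike Unidecode, it leaves any character not listed in the table untouched.</p>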