<p>You have two options:</p>
<ol>
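<li><p>Pick an encoding that can handle the emoji codepoints. You opened the file for writing either with the default codec (which depends on your system) or with an explicit encoding that doesn't support those codepoints.</p>
<p>The UTF encodings can all handle the codepoints just fine; I picked UTF-8 here:</p>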
<pre><code>with open(filename, 'w', encoding='utf8') as outfile:
    outfile.write(yourdata)
</code></pre></li>
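<li><p>Set an error handling mode that replaces the codepoints the codec can't handle with a replacement character or an escape sequence, or that ignores them altogether. See the <code>errors</code> parameter of the <a href="https://docs.python.org/3/library/functions.html#open" rel="nofollow"><code>open()</code> function</a>:</p>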
<blockquote>
<p><em>errors</em> is an optional string that specifies how encoding and decoding errors are to be handled–this cannot be used in binary mode. A variety of standard error handlers are available, though any error handling name that has been registered with <code>codecs.register_error()</code> is also valid. The standard names are:</p>
<ul>
<li><code>'strict'</code> to raise a <code>ValueError</code> exception if there is an encoding error. The default value of <code>None</code> has the same effect.</li>
<li><code>'ignore'</code> ignores errors. Note that ignoring encoding errors can lead to data loss.</li>
<li><code>'replace'</code> causes a replacement marker (such as <code>'?'</code>) to be inserted where there is malformed data.</li>
<li><code>'surrogateescape'</code> will represent any incorrect bytes as code points in the Unicode Private Use Area ranging from U+DC80 to U+DCFF. These private code points will then be turned back into the same bytes when the <code>surrogateescape</code> error handler is used when writing data. This is useful for processing files in an unknown encoding.</li>
<li><code>'xmlcharrefreplace'</code> is only supported when writing to a file. Characters not supported by the encoding are replaced with the appropriate XML character reference <code>&#nnn;</code>.</li>
<li><code>'backslashreplace'</code> (also only supported when writing) replaces unsupported characters with Python’s backslashed escape sequences.</li>
</ul>
</blockquote>
<p>So opening the file with <code>errors='ignore'</code> will <em>not write the emoji codepoints</em>, instead of raising an error:</p>
<pre><code>with open(filename, 'w', errors='ignore') as outfile:
    outfile.write(yourdata)
</code></pre></li>
</ol>
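<p>As a rough sketch of how the two options behave at the file level, the snippet below writes the same string three ways and reads the raw bytes back. The sample text, the <code>write_and_read_bytes</code> helper and the choice of <code>cp1251</code> as the limited codec are assumptions made for illustration, not part of the original question:</p>

```python
import os
import tempfile

# Sample string containing U+1F44C OK HAND SIGN (illustrative only).
text = 'OK hand: \U0001F44C'


def write_and_read_bytes(data, **open_kwargs):
    """Hypothetical helper: write data to a temporary text file using the
    given open() keyword arguments and return the raw bytes on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'w', **open_kwargs) as outfile:
            outfile.write(data)
        with open(path, 'rb') as infile:
            return infile.read()
    finally:
        os.remove(path)


# Option 1: an encoding that supports the emoji.
utf8_bytes = write_and_read_bytes(text, encoding='utf8')

# Option 2: a limited codec plus an error handler.
ignored_bytes = write_and_read_bytes(text, encoding='cp1251', errors='ignore')
escaped_bytes = write_and_read_bytes(
    text, encoding='cp1251', errors='backslashreplace')

print(utf8_bytes)
print(ignored_bytes)
print(escaped_bytes)
```

<p>Only the UTF-8 file preserves the emoji itself; <code>'ignore'</code> silently drops it, while <code>'backslashreplace'</code> keeps a textual trace that could be recovered later.</p>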
<p>Demo:</p>
<pre><code>>>> a_ok = 'The U+1F44C OK HAND SIGN codepoint: \U0001F44C'
>>> print(a_ok)
The U+1F44C OK HAND SIGN codepoint: 👌
>>> a_ok.encode('utf8')
b'The U+1F44C OK HAND SIGN codepoint: \xf0\x9f\x91\x8c'
>>> a_ok.encode('cp1251', errors='ignore')
b'The U+1F44C OK HAND SIGN codepoint: '
>>> a_ok.encode('cp1251', errors='replace')
b'The U+1F44C OK HAND SIGN codepoint: ?'
>>> a_ok.encode('cp1251', errors='xmlcharrefreplace')
b'The U+1F44C OK HAND SIGN codepoint: &#128076;'
>>> a_ok.encode('cp1251', errors='backslashreplace')
b'The U+1F44C OK HAND SIGN codepoint: \\U0001f44c'
</code></pre>
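<p>Note that the <code>'surrogateescape'</code> option has limited space (it only covers the 128 code points U+DC80 to U+DCFF, one per undecodable byte value) and is really only useful when <em>decoding</em> a file in an unknown encoding; it can't handle emoji in any case.</p>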
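<p>A small sketch of that limitation, assuming some made-up input bytes (<code>raw</code> is hypothetical): <code>'surrogateescape'</code> round-trips undecodable bytes, but it raises when asked to encode an emoji.</p>

```python
# Hypothetical input: Latin-1 bytes for "café", which are not valid UTF-8.
raw = b'caf\xe9'

# Decoding maps the bad byte 0xE9 to the lone surrogate U+DCE9.
decoded = raw.decode('utf8', errors='surrogateescape')

# Encoding with the same handler turns the surrogate back into the byte.
round_tripped = decoded.encode('utf8', errors='surrogateescape')

# The handler only knows U+DC80..U+DCFF, so an emoji still fails.
try:
    '\U0001F44C'.encode('cp1251', errors='surrogateescape')
    emoji_encoded = True
except UnicodeEncodeError:
    emoji_encoded = False

print(ascii(decoded), round_tripped, emoji_encoded)
```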