<p>I'm trying to split lines like this one in <a href="http://en.wikipedia.org/wiki/Python_%28programming_language%29" rel="nofollow">Python</a>:</p>
<blockquote>
<p>aiburenshi 爱不忍释 "לא מסוגל להינתק, לא יכול להיפרד מדבר מרוב חיבתו אליו" </p>
</blockquote>
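<p>This line contains Hebrew, Simplified Chinese, and English.</p>
<p>For example, if I have a tuple T, I want the result to be T = (Hebrew string, English string, Chinese string).</p>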
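<p>To make the goal concrete, here is a rough sketch of the kind of split I am after. It assumes the input is already a Unicode string, that fields are separated by whitespace, and that checking the Hebrew block (U+0590 to U+05FF) and the CJK Unified Ideographs block (U+4E00 to U+9FFF) is enough to tell the scripts apart. The name <code>split_by_script</code> and both patterns are made up for illustration.</p>

```python
# -*- coding: utf-8 -*-
import re

# Assumed script ranges (not exhaustive): Hebrew block and CJK Unified Ideographs.
HEBREW = re.compile(u"[\u0590-\u05FF]")
CJK = re.compile(u"[\u4E00-\u9FFF]")


def split_by_script(line):
    """Return a (hebrew, latin, chinese) tuple for one whitespace-separated line."""
    hebrew, latin, chinese = [], [], []
    for token in line.split():
        if HEBREW.search(token):
            hebrew.append(token)
        elif CJK.search(token):
            chinese.append(token)
        else:
            # Anything without Hebrew or CJK characters is treated as Latin text.
            latin.append(token)
    # Drop the double quotes that surround the Hebrew definition in the sample line.
    return (u" ".join(hebrew).strip(u'"'), u" ".join(latin), u" ".join(chinese))


# Shortened sample line written with escapes, so it does not depend on the file encoding.
sample = u'aiburenshi \u7231\u4e0d\u5fcd\u91ca "\u05dc\u05d0 \u05de\u05e1\u05d5\u05d2\u05dc"'
T = split_by_script(sample)
```

<p>A token that mixes scripts is filed under the first matching pattern, so this only works for lines laid out like the example above.</p>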
<p>The problem is that I can't figure out how to get the Unicode values of the Chinese or Hebrew characters. Neither of these lines works:</p>
<pre><code>print ((unicode("释","utf-8")).encode("utf-8"))
print ((unicode("א","utf-8")).encode("utf-8"))
</code></pre>
<p>I get this error:</p>
<blockquote>
<p>SyntaxError: Non-ASCII character '\xe9' in file split_or.py on line 9, but no encoding declared; see <a href="http://www.python.org/peps/pep-0263.html" rel="nofollow">http://www.python.org/peps/pep-0263.html</a> for details</p>
</blockquote>
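<p>Going by the PEP linked in the error message, the source file apparently needs an encoding declaration on its first or second line before it may contain non-ASCII literals. A minimal sketch of what that might look like, assuming the file itself is saved as UTF-8 (the helper name <code>code_point</code> is hypothetical):</p>

```python
# -*- coding: utf-8 -*-
# The declaration above has to be on line 1 or 2 of the file (PEP 263).
# It is only correct if the editor really saves the file as UTF-8.


def code_point(ch):
    """Format the Unicode code point of a single character as U+XXXX."""
    return u"U+%04X" % ord(ch)


# u"..." literals are decoded using the declared source encoding,
# so ord() sees one character rather than a run of UTF-8 bytes.
print(code_point(u"释"))
print(code_point(u"א"))
```

<p>With the code points available, each character could then be compared against the ranges for its script.</p>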