<p>Use <a href="http://www.amk.ca/python/howto/unicode" rel="noreferrer">unicode</a> strings, and compile the pattern with the <a href="http://docs.python.org/library/re.html#re.UNICODE" rel="noreferrer">re.UNICODE</a> flag.</p>
<pre><code>>>> import re
>>> myre = re.compile(ur'[\u064B-\u0652\u06D4\u0670\u0674\u06D5-\u06ED]+',
...                   re.UNICODE)
>>> myre
<_sre.SRE_Pattern object at 0xb20b378>
>>> mystr = u'بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ'
>>> result = myre.sub('', mystr)
>>> len(mystr), len(result)
(38, 22)
>>> print result
بسم الله الرحمن الرحيم
</code></pre>
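<p>The session above is Python 2. For Python 3, a minimal sketch of the same idea might look like the following. It assumes Python 3.3 or later, where every <code>str</code> is already Unicode, the <code>ur''</code> prefix no longer exists, and the <code>re</code> module itself understands <code>\uXXXX</code> escapes inside a raw pattern. The names <code>ARABIC_MARKS</code> and <code>strip_arabic_marks</code> are made up for illustration, and the character class is copied unchanged from the snippet above.</p>

```python
import re

# Same character class as in the Python 2 snippet above. For str patterns in
# Python 3, Unicode matching is the default, so re.UNICODE is not needed.
ARABIC_MARKS = re.compile(r'[\u064B-\u0652\u06D4\u0670\u0674\u06D5-\u06ED]+')


def strip_arabic_marks(text):
    """Return text with every character matched by ARABIC_MARKS removed."""
    return ARABIC_MARKS.sub('', text)


# "bismi" written with its vowel marks: ba + kasra, sin + sukun, mim + kasra.
sample = '\u0628\u0650\u0633\u0652\u0645\u0650'
stripped = strip_arabic_marks(sample)
print(len(sample), len(stripped))  # 6 3
print(stripped)
```

<p>Compiling the pattern once at module level keeps repeated calls cheap. Text that contains none of the listed code points comes back unchanged.</p>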
<p>Also read Joel Spolsky's article <a href="http://www.joelonsoftware.com/articles/Unicode.html" rel="noreferrer">The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)</a>.</p>