<p>This is more complicated than a simple regex can handle, <a href="http://en.wikipedia.org/wiki/Ubbi_dubbi" rel="nofollow noreferrer">e.g.,</a></p>
<pre><code>"Hi, how are you?" → "Hubi, hubow ubare yubou?"
</code></pre>
<p>A simple regex won't catch that the <code>e</code> in <code>are</code> is silent.</p>
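<p>To make the problem concrete, here is a small sketch of a spelling-based attempt. The pattern and the name <code>naive_ubbi</code> are illustrative assumptions, not part of any library: the function just puts "ub" in front of every run of written vowels.</p>

```python
import re


def naive_ubbi(text):
    """Insert "ub" before every run of written vowels (spelling only)."""
    return re.sub(r"[aeiou]+", r"ub\g<0>", text)


print(naive_ubbi("Hi, how are you?"))
# → Hubi, hubow ubarube yubou?
```

<p>The silent <code>e</code> in <code>are</code> also gets an "ub", giving <code>ubarube</code> instead of <code>ubare</code>, because spelling alone does not say which letters are pronounced.</p>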
<p>You need a library that provides a pronunciation dictionary, such as <code>nltk.corpus.cmudict</code>:</p>
^{pr2}$
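<p>A minimal sketch of what a <code>spubeak</code> function could look like, assuming CMUdict-style entries in which each word maps to a list of pronunciations and each vowel phoneme ends in a stress digit. This is one possible implementation, not necessarily the code the example was written against. <code>SAMPLE_PRONUNCIATIONS</code> is a hand-written stand-in so the snippet runs without NLTK; with NLTK installed and the corpus downloaded, <code>nltk.corpus.cmudict.dict()</code> could be passed instead.</p>

```python
# Hand-written sample entries in the CMUdict (ARPAbet) style; an assumption
# made so the sketch runs without NLTK. Vowels carry a trailing stress digit.
SAMPLE_PRONUNCIATIONS = {
    "hi": [["HH", "AY1"]],
    "how": [["HH", "AW1"]],
    "are": [["AA1", "R"]],
    "you": [["Y", "UW1"]],
}


def spubeak(word, pronunciations=SAMPLE_PRONUNCIATIONS):
    """Return word with "ub" inserted before each vowel sound."""
    variants = pronunciations.get(word.lower())
    if not variants:
        return word  # unknown word or punctuation: leave unchanged
    parts = []
    for phoneme in variants[0]:  # use the first listed pronunciation
        if phoneme[:1] == phoneme[1:2]:
            phoneme = phoneme[1:]  # collapse doubled letters: "HH" -> "H"
        if phoneme[-1].isdigit():  # a stress digit marks a vowel sound
            parts.append("ub" + phoneme[:-1])
        else:
            parts.append(phoneme)
    result = "".join(parts).lower()
    return result.title() if word.istitle() else result


print(spubeak("Hi"), spubeak("are"))
# → Hubay ubar
```

<p>Words missing from the dictionary are returned unchanged, so punctuation and empty strings produced by splitting pass through untouched.</p>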
<p>Example:</p>
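<pre><code>#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

# spubeak() converts a single word; it must be defined before this runs.
sent = "Hi, how are you?"
# Split each whitespace-separated chunk into words and punctuation
# (the capturing group keeps the punctuation), convert, then rejoin.
subent = " ".join("".join(map(spubeak, re.split(r"(\W+)", nonblank)))
                  for nonblank in sent.split())
print('"{}" → "{}"'.format(sent, subent))
</code></pre>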
<h3>Output</h3>
<pre>"Hi, how are you?" → "Hubay, hubaw ubar yubuw?"</pre>
<p>Note: this differs from the first example: each word is replaced with its syllables as pronounced, not as spelled.</p>