<p>When processing text, regex is an obvious tool to consider.</p>
<pre><code>import re
text = ('<p><span class="newStyle0" '
'style="left: 291px; '
'top: 258px">000</span></p> <p>'
'<span class="newStyle1" '
'style="left: 85px; '
'top: 200px">001</span></p> <p>'
'<span class="newStyle2" '
'style="left: 580px; '
'top: 400px; width: 167px; '
'height: 97px">002</span></p> <p>'
'<span class="newStyle3" '
'style="left: 375px; top: 165px">'
'003</span></p>')
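# Replacement strings, indexed by the number found inside each span
words = ['XXX-%04d-YYY' % a for a in range(1000)]
# Match a run of digits preceded by '>' and followed by '</span>'
regx = re.compile(r'(?<=>)\d+(?=</span>)')
def gv(m, words=words):
    # Use the matched digits as an index into words
    return words[int(m.group())]
print(regx.sub(gv, text))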
</code></pre>