<p>I strongly recommend against using regex to parse HTML, because <a href="https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags">HTML is not regular.</a> Use an HTML/XML parser such as <a href="http://www.crummy.com/software/BeautifulSoup/" rel="nofollow noreferrer">BeautifulSoup</a> instead. Here is an example of what you are trying to do, using BeautifulSoup:</p>
<pre><code>from bs4 import BeautifulSoup
html = '<p><span class="step_leadin">Step1</span>Carefully transfer the biscuits to a rimmed baking sheet, spacing them an inch or so apart</p>'
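# Name the parser explicitly; 'html.parser' ships with Python
bs = BeautifulSoup(html, 'html.parser')
# .text joins the text of the paragraph and everything nested in it
for p in bs.find_all('p'):
    print(p.text)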
</code></pre>
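<p>If installing BeautifulSoup is not an option, the same idea can probably be sketched with the standard library's <code>html.parser</code> module. This is only a minimal sketch: the <code>ParagraphText</code> class and the <code>paragraph_texts</code> helper are names made up for this illustration, and it assumes reasonably well-formed markup with explicit <code>&lt;/p&gt;</code> closing tags.</p>

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Hypothetical helper: collect the text of every <p> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # number of <p> elements currently open
        self.paragraphs = []  # text of each finished paragraph
        self._chunks = []     # text pieces of the paragraph being read

    def handle_starttag(self, tag, attrs):
        if tag == 'p':
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == 'p' and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Outermost <p> closed: store its accumulated text
                self.paragraphs.append(''.join(self._chunks))
                self._chunks = []

    def handle_data(self, data):
        # Keep text only while inside a <p>, including nested tags
        if self.depth:
            self._chunks.append(data)


def paragraph_texts(html):
    """Return a list with the text content of each <p> in html."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


html = '<p><span class="step_leadin">Step1</span>Carefully transfer the biscuits to a rimmed baking sheet, spacing them an inch or so apart</p>'
for text in paragraph_texts(html):
    print(text)
```

<p>For anything beyond simple extraction like this, BeautifulSoup is likely the more comfortable choice, since it copes with messy markup and offers searching by tag, class and attribute.</p>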