<p>I <em>strongly suggest</em> using a <strong>DOM parser</strong> library such as <a href="http://pypi.python.org/pypi/lxml" rel="nofollow">lxml</a>, together with <a href="http://pypi.python.org/pypi/cssselect" rel="nofollow">cssselect</a> for CSS-style selectors.</p>
<p><strong>Example:</strong></p>
<pre><code>>>> from lxml.html import fromstring
>>> html = """<p class="drug-subtitle"><b>Generic Name:</b> albuterol inhalation (al BYOO ter all)<br><b>Brand Names:</b> <i>Accuneb, ProAir HFA, Proventil, Proventil HFA, ReliOn Ventolin HFA, Ventolin HFA</i></p>"""
>>> doc = fromstring(html)
>>> "".join(filter(None, (e.text for e in doc.cssselect(".drug-subtitle")[0])))
'Generic Name:Brand Names:Accuneb, ProAir HFA, Proventil, Proventil HFA, ReliOn Ventolin HFA, Ventolin HFA'
</code></pre>