擅长:python、mysql、java
<p>对于更复杂的html/xml解析,您应该看看<code>xpath</code>,它允许非常强大的选择器规则</p>
<p>在python中,它在<code>parsel</code>包中提供</p>
<pre><code>from parsel import Selector
html = '...'
sel = Selector(html)
name = sel.xpath('//strong[1]/text()').get().strip()
email = sel.xpath("//text()[re:test(., 'Email')]/following-sibling::a/text()").get().strip()
name_additional = sel.xpath("//text()[re:test(., 'Additional Contact')]").re("Additional Contact: (.+)")[0]
email_additional = sel.xpath("//text()[re:test(., 'Additional Contact')]/following-sibling::a/text()").get().strip()
print(name, email, name_additional, email_additional)
# Robert Romanoff robert@absol.com William Locantro bill@absol.com
</code></pre>