<p>I would use <strong>XPath</strong>: <code>td/a[not(contains(.,"spam"))]/@href | td[not(a)]/text()</code></p>
<pre><code>$ python3
>>> import lxml.html
>>> doc = lxml.html.parse('data.xml')
>>> [[j for j in i.xpath('td/a[not(contains(.,"spam"))]/@href | td[not(a)]/text()')] for i in doc.xpath('//tr')]
[['website1.com', 'info1', 'info2'], ['website2.com', 'info1', 'info2']]
</code></pre>
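<p>To make the expression easier to try out, here is a minimal self-contained sketch. The original <code>data.xml</code> is not shown, so the <code>SAMPLE</code> table, the <code>ROW_XPATH</code> constant, and the <code>extract_rows</code> helper are assumptions made for illustration. The first branch of the union keeps the <code>href</code> of every link whose text does not contain "spam"; the second branch keeps the text of cells that contain no link at all.</p>

```python
import lxml.html

# Hypothetical input: the real data.xml is not shown in the answer,
# so this table is only a guess at its shape.
SAMPLE = """
<table>
  <tr>
    <td><a href="website1.com">website1</a></td>
    <td>info1</td>
    <td>info2</td>
    <td><a href="spam1.com">spam</a></td>
  </tr>
  <tr>
    <td><a href="website2.com">website2</a></td>
    <td>info1</td>
    <td>info2</td>
    <td><a href="spam2.com">spam</a></td>
  </tr>
</table>
"""

# Same expression as in the answer, evaluated relative to each <tr>.
# contains(., "spam") tests the link's text content, not its href.
ROW_XPATH = 'td/a[not(contains(., "spam"))]/@href | td[not(a)]/text()'


def extract_rows(html):
    """Return one list of strings per table row."""
    doc = lxml.html.fromstring(html)
    return [[str(value) for value in tr.xpath(ROW_XPATH)]
            for tr in doc.xpath('//tr')]


rows = extract_rows(SAMPLE)
print(rows)
```

<p>On this sample input the result should match the lists shown in the session above. Note that the filter looks at the link text, so a link whose <code>href</code> contains "spam" but whose text does not would still be kept.</p>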