<p>You may want to use the powerful XPath query language together with the faster <a href="http://lxml.de/" rel="nofollow"><code>lxml</code></a> module. It is as simple as this:</p>
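<pre><code>from urllib.request import urlopen  # Python 3 replacement for urllib2
from lxml import etree

url = 'http://www.thehindu.com/archive/web/2010/06/19/'
# Download the page and parse it into an element tree
html = etree.HTML(urlopen(url).read())
# Select every link directly inside a list item of the Business section
for link in html.xpath("//li[@data-section='Business']/a"):
    print('{} ({})'.format(link.text, link.attrib['href']))
</code></pre>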
<p><strong>Update for @data-section='Chennai'</strong></p>
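<p>As a rough sketch of how the same query could be pointed at another section such as Chennai, the section name can be passed in as an XPath variable. The helper name <code>section_links</code> and the sample markup below are hypothetical and not part of the original answer; the sketch assumes the real page marks its list items with <code>data-section</code> attributes in the same way as the Business example.</p>

```python
from lxml import etree

# Hypothetical sample markup standing in for the archive page; the real page
# is assumed to use the same data-section attributes on its <li> elements.
SAMPLE_HTML = """
<ul>
  <li data-section="Business"><a href="/business/a.ece">Markets rally</a></li>
  <li data-section="Chennai"><a href="/chennai/b.ece">Metro work begins</a></li>
  <li data-section="Chennai"><a href="/chennai/c.ece">Rain forecast</a></li>
</ul>
"""


def section_links(html_text, section):
    """Return (text, href) pairs for links in list items of one section."""
    tree = etree.HTML(html_text)
    # Pass the section name as an XPath variable instead of formatting it
    # into the expression, so quotes in the name cannot break the query.
    links = tree.xpath("//li[@data-section=$section]/a", section=section)
    return [(link.text, link.get("href")) for link in links]


for text, href in section_links(SAMPLE_HTML, "Chennai"):
    print("{} ({})".format(text, href))
```

<p>To run this against the live page, the downloaded page content would be passed in place of <code>SAMPLE_HTML</code>. A section name that does not occur on the page simply yields an empty list.</p>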