擅长:python、mysql、java
<p>输入来自Xurasky的解决方案以避免403错误</p>
<pre><code>import lxml.html
import lxml.etree
from urllib.request import Request, urlopen
req = Request('http://www.inquirer.net/', headers={'User-Agent': 'Mozilla/5.0'})
r = urlopen(req).read()
html_content = lxml.html.fromstring(r)
root = html_content.xpath('//*[@id="tgs3_info"]/h2')
for a in root:
print(a.text_content())
</code></pre>
<p>输出</p>
^{pr2}$