How do I get specific text from a website using Python?

response = br.open('http://www.opensiteexplorer.org/links?site=' + blog)
tree = html.fromstring(response.read())
authority = int(tree.xpath('//span[@class="metrics-authority"]/text()')[1].strip())
if authority > 1:
    print blog
    print 'This blog is ready to be registered'
    print authority
    f.write(blog + ' ' + str(authority) + '\n')

1 Answer

User

#1 · Posted 2024-10-04 01:31:17

You can get both spans with the metrics-authority class: the first is Domain Authority, the second is Page Authority. You can also get Root Domains from the div with id="metrics-page-link-metrics":

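import urllib2  # Python 2 only; Python 3 moved this to urllib.request
from lxml import html

# Fetch the page and parse it into an lxml element tree.
tree = html.parse(urllib2.urlopen('http://www.opensiteexplorer.org/links?site=www.google.com'))

# Both authority values share the "metrics-authority" class:
# the first span is Domain Authority, the second is Page Authority.
spans = tree.xpath('//span[@class="metrics-authority"]')
data = [item.text.strip() for item in spans]
print "Domain Authority: {0}, Page Authority: {1}".format(*data)

# The second "has-tooltip" div inside the link-metrics block holds Root Domains.
div = tree.xpath('//div[@id="metrics-page-link-metrics"]//div[@class="has-tooltip"]')[1]
print "Root Domains: {0}".format(div.text.strip())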

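This prints the Domain Authority and Page Authority values on one line, followed by the Root Domains value on the next.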
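For anyone on Python 3, where urllib2 is not available, here is a minimal sketch of the same span extraction. It deliberately swaps lxml for the standard library's html.parser so it runs with no extra packages. The names AuthorityParser, extract_authorities and SAMPLE_HTML are hypothetical, and the sample markup is invented to stand in for the real page. It also assumes the matching spans contain no nested span tags.

```python
from html.parser import HTMLParser


class AuthorityParser(HTMLParser):
    """Collect the text of every <span class="metrics-authority">."""

    def __init__(self):
        super().__init__()
        self.values = []       # stripped text of each matching span, in page order
        self._in_span = False  # True while inside a matching span

    def handle_starttag(self, tag, attrs):
        # Exact class match, mirroring the XPath predicate @class="metrics-authority".
        if tag == "span" and dict(attrs).get("class") == "metrics-authority":
            self._in_span = True
            self.values.append("")

    def handle_data(self, data):
        if self._in_span:
            self.values[-1] += data

    def handle_endtag(self, tag):
        # Assumes no nested <span> inside a matching span.
        if tag == "span" and self._in_span:
            self._in_span = False
            self.values[-1] = self.values[-1].strip()


def extract_authorities(page_html):
    """Return (domain_authority, page_authority) as ints."""
    parser = AuthorityParser()
    parser.feed(page_html)
    parser.close()
    domain, page = parser.values[:2]  # first span: domain, second span: page
    return int(domain), int(page)


# Hypothetical sample markup; the real page structure may differ.
SAMPLE_HTML = """
<div>
  <span class="metrics-authority"> 45 </span>
  <span class="metrics-authority"> 38 </span>
</div>
"""

domain_authority, page_authority = extract_authorities(SAMPLE_HTML)
if domain_authority > 1:
    print("Domain Authority: {0}, Page Authority: {1}".format(
        domain_authority, page_authority))
```

To run it against a live page, the HTML text could be fetched with urllib.request.urlopen(url).read().decode() and passed to extract_authorities. Note that index 0 is the Domain Authority, so a check such as authority > 1 should say which of the two values it means.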

Hope that helps.
