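I'm trying to use XPath to collect a set of links, each of which leads to a further page that I need to scrape. However, I keep getting an error saying that only strings can be parsed. I checked the type of lk, and after my conversion it is a string. What is going wrong?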
import unicodedata
import urllib2
from lxml import etree

def unicode_to_string(types):
    # Convert a unicode value to a plain ASCII byte string; return it unchanged on failure.
    try:
        types = unicodedata.normalize("NFKD", types).encode('ascii', 'ignore')
        return types
    except:
        return types

def getData():
    req = "http://analytical360.com/access-points"
    page = urllib2.urlopen(req)
    tree = etree.HTML(page.read())
    i = 0
    for lk in tree.xpath('//a[@class="sabai-file sabai-file-image sabai-file-type-jpg "]//@href'):
        print "Scraping Vendor #" + str(i)
        trees = etree.HTML(urllib2.urlopen(unicode_to_string(lk)))
        for ll in trees.xpath('//table[@id="archived"]//tr//td//a//@href'):
            final = etree.HTML(urllib2.urlopen(unicode_to_string(ll)))
You should pass a string to etree.HTML, not the response object returned by urllib2.urlopen.
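One way to change the code is to call .read() on each response before parsing it, as the first request in getData already does with page.read().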
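A minimal sketch of that change, under a few assumptions: it is written for Python 3, where urllib2.urlopen lives at urllib.request.urlopen, and the helper names fetch_page, parse_links and get_data, along with the injectable fetch parameter, are hypothetical and not part of the original code. The key point is that the response body is read into bytes before it reaches etree.HTML.

```python
from urllib.request import urlopen  # Python 3 home of urllib2.urlopen

from lxml import etree

START_URL = "http://analytical360.com/access-points"
VENDOR_XPATH = '//a[@class="sabai-file sabai-file-image sabai-file-type-jpg "]//@href'
ARCHIVE_XPATH = '//table[@id="archived"]//tr//td//a//@href'


def fetch_page(url):
    """Download a page and return its body as bytes, not the response object."""
    response = urlopen(url)
    try:
        return response.read()
    finally:
        response.close()


def parse_links(markup, xpath):
    """Parse markup that has already been read and return the matching hrefs."""
    tree = etree.HTML(markup)  # etree.HTML expects the markup itself
    return [str(href) for href in tree.xpath(xpath)]


def get_data(start_url=START_URL, fetch=fetch_page):
    """Follow vendor links, then archived links, and return the parsed final pages."""
    results = []
    for i, vendor_url in enumerate(parse_links(fetch(start_url), VENDOR_XPATH)):
        print("Scraping Vendor #" + str(i))
        for test_url in parse_links(fetch(vendor_url), ARCHIVE_XPATH):
            results.append(etree.HTML(fetch(test_url)))
    return results
```

Passing the downloader in as fetch is only a convenience for trying the logic against canned pages without touching the network; calling get_data() with no arguments uses the real URL from the question.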
Also, you never seem to increment i, so every vendor is reported as #0.
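One possible way to keep the counter in step with the loop is to let enumerate supply it rather than managing i by hand. The link list below is a hypothetical stand-in for the result of tree.xpath(...), used only to illustrate the numbering.

```python
# Hypothetical stand-in for the hrefs returned by tree.xpath(...)
links = [
    "http://example.com/vendor-a",
    "http://example.com/vendor-b",
    "http://example.com/vendor-c",
]

messages = []
for i, lk in enumerate(links):
    # enumerate yields (index, item) pairs, so i advances on every iteration
    messages.append("Scraping Vendor #" + str(i) + ": " + lk)

print("\n".join(messages))
```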