Can't get the correct results using XPath with Python Scrapy

import scrapy
from ah_links.items import AhLinksItem


class AhSpider(scrapy.Spider):
    name = "ah_links"
    allowed_domains = ["ah.nl"]
    start_urls = ['http://www.ah.nl/producten/aardappel-groente-fruit', ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = AhLinksItem()
            item['title'] = sel.xpath('a/@href').extract()
            yield item

2 Answers

User

Answer 1 · edited 2024-10-03 13:27:59

As far as I understand, you should search for the list items inside the subcategory block:

for sel in response.css('nav.subcategorynav li'):
    item = AhLinksItem()
    item['title'] = sel.xpath('.//a/@href').extract()
    yield item

Here I'm using a CSS selector, but you can also solve it with XPath:

response.xpath('//nav[contains(@class, "subcategorynav")]//li')
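To illustrate why scoping the query matters, here is a minimal sketch that runs without Scrapy. It uses the standard library's xml.etree.ElementTree (not Scrapy's selectors) on a made-up, well-formed HTML fragment; the markup, the "menu" class, and the link paths are assumptions for illustration, not the real page. ElementTree supports only a subset of XPath, so the sketch matches the class with an exact `[@class='subcategorynav']` predicate instead of `contains()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a site menu plus the subcategory navigation block.
SAMPLE = """
<html><body>
  <ul class="menu">
    <li><a href="/login">Login</a></li>
    <li><a href="/about">About</a></li>
  </ul>
  <nav class="subcategorynav">
    <ul>
      <li><a href="/producten/aardappelen">Aardappelen</a></li>
      <li><a href="/producten/fruit">Fruit</a></li>
    </ul>
  </nav>
</body></html>
"""


def hrefs(items):
    """Collect the href of every direct <a> child of the given <li> elements."""
    links = []
    for li in items:
        for a in li.findall("a"):
            links.append(a.get("href"))
    return links


root = ET.fromstring(SAMPLE)

# Broad query: every <li> under any <ul>, including the site menu.
broad = hrefs(root.findall(".//ul/li"))

# Scoped query: only <li> elements inside the subcategory block.
scoped = hrefs(root.findall(".//nav[@class='subcategorynav']//li"))

print(broad)
print(scoped)
```

In this sketch the broad query also returns the menu links, while the scoped query returns only the two subcategory links, which is the effect the selector above aims for on the real page.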

User

Answer 2 · edited 2024-10-03 13:27:59

Try this:

item['title'] = sel.xpath("./a/@href").extract()

After the edit, this works as expected:

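import requests
from lxml.html import fromstring

# Fetch the page and parse it into an lxml element tree
response = requests.get("http://www.ah.nl/producten/aardappel-groente-fruit")
parsed_response = fromstring(response.text)

# Print the href of each link that is a direct child of a list item
for item in parsed_response.xpath(".//ul/li"):
    print(item.xpath("a/@href"))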
