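I'm new to scraping, and I want to scrape a car dealership's website.
The site uses AJAX infinite scrolling. I found some infinite-scroll examples online, but they all pull their data from a JSON file, and as far as I can tell this site doesn't use JSON (I may be wrong).
I can only scrape titles from ?page=1, but the site currently goes up to ?page=8, and the number of pages can change depending on how many vehicles are in stock.
This is the site: https://www.marlboroughford.com/inventory
Its pagination is in a <ul>: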
<ul class="pagination pagination-sm" data-url="https://www.marlboroughford.com/inventory?page=d">
<li id="il-pagination-element-1" class="">
<a href="https://www.marlboroughford.com/inventory">1</a>
</li>
<li id="il-pagination-element-2" class="">
<a href="https://www.marlboroughford.com/inventory?page=2">2</a>
</li>
<li id="il-pagination-element-3" class="">
<a href="https://www.marlboroughford.com/inventory?page=3">3</a>
</li>
<li><span>...</span></li>
<li id="il-pagination-element-8">
<a href="https://www.marlboroughford.com/inventory?page=8">8</a>
</li>
</ul>
import scrapy


class DealerSpider(scrapy.Spider):
    name = "cars"
    start_urls = [
        'https://www.marlboroughford.com/inventory?page=',
    ]

    def parse(self, response):
        yield {
            'title': response.xpath('/html/body/div[1]/main/div/div/div/div/div/div/div/div/div/div/meta[1]/@content').extract()
        }
The spider above is what I have so far.
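
One direction I am considering is to read the highest page number from the pagination <ul> and then request every page in turn. Below is a minimal sketch of that idea using only the Python standard library. The names `PaginationParser`, `last_page` and `page_urls` are hypothetical helpers of my own, and the sketch assumes that the pagination links always include the last page and that each `?page=N` URL returns server-rendered HTML, neither of which I have verified for this site.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class PaginationParser(HTMLParser):
    """Collect the href of every <a> inside <ul class="pagination ...">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # > 0 while we are inside the pagination <ul>
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul":
            classes = (attrs.get("class") or "").split()
            # Enter the pagination list, or track a <ul> nested inside it.
            if self.depth or "pagination" in classes:
                self.depth += 1
        elif tag == "a" and self.depth and attrs.get("href"):
            self.hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "ul" and self.depth:
            self.depth -= 1


def last_page(html):
    """Return the largest ?page=N value among the pagination links (at least 1)."""
    parser = PaginationParser()
    parser.feed(html)
    pages = [1]  # the first link has no ?page= parameter, so default to 1
    for href in parser.hrefs:
        values = parse_qs(urlparse(href).query).get("page", [])
        pages.extend(int(v) for v in values if v.isdigit())
    return max(pages)


def page_urls(base_url, html):
    """Build one URL per page, from ?page=1 up to the last page found."""
    return [f"{base_url}?page={n}" for n in range(1, last_page(html) + 1)]
```

Inside the spider, `parse` could pass `response.text` to `page_urls` and yield a `scrapy.Request` for each returned URL with a second callback that extracts the titles. Because the page count is read from the markup on every run, it would follow changes in inventory instead of hard-coding 8.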