Need help scraping a hotel list from a website with a fixed URL and dynamically loaded content?

Posted 2024-07-02 12:15:49


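I am trying to scrape details from this hotel listing site. When I click the Next button to go to the next page, the URL stays the same, and Inspect Element shows that the site is sending XHR requests. I tried using Selenium WebDriver with Python; my code is below.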

from time import sleep
import scrapy
from selenium import webdriver
from scrapy.selector import Selector
from scrapy.http import Request
from selenium.common.exceptions import NoSuchElementException


class DineoutRestaurantSpider(scrapy.Spider):
    name = 'dineout_restaurant'
    allowed_domains = ['dineout.co.in/bangalore-restaurants?search_str=']
    start_urls = ['http://dineout.co.in/bangalore-restaurants?search_str=']

    def start_requests(self):
        self.driver = webdriver.Chrome('/Users/macbookpro/Downloads/chromedriver')
        self.driver.get('https://www.dineout.co.in/bangalore-restaurants?search_str=')

        url = 'https://www.dineout.co.in/bangalore-restaurants?search_str='
        yield Request(url, callback=self.parse)
        self.logger.info('Empty message')

        for i in range(1, 4):
            try:
                next_page = self.driver.find_element_by_xpath('//a[text()="Next "]')
                sleep(11)
                self.logger.info('Sleeping for 11 seconds.')
                next_page.click()
                url = 'https://www.dineout.co.in/bangalore-restaurants?search_str='
                yield Request(url, callback=self.parse)

            except NoSuchElementException:
                self.logger.info('No more pages to load.')
                self.driver.quit()
                break

    def parse(self, response):
        self.logger.info('Entered parse method')
        restaurants = response.xpath('//*[@class="cardBg"]')
        for restaurant in restaurants:
            name = restaurant.xpath('.//*[@class="titleDiv"]/h4/a/text()').extract_first()
            location = restaurant.xpath('.//*[@class="location"]/a/text()').extract()
            rating = restaurant.xpath('.//*[@class="rating rating-5"]/a/span/text()').extract_first()
            yield {
                'Name': name,
                'Location': location,
                'Rating': rating,
            }

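In the code above, the yielded requests never seem to reach the parse function. Am I missing something? I get no errors, but the scraped output contains only page 1, even though the browser is iterating through the pages.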
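A likely cause, for what it is worth: every `Request` in the spider points at the same URL, and Scrapy's scheduler drops requests it has already seen unless they are created with `dont_filter=True`. Even with that flag, Scrapy would download the URL itself and get the first page again, because the Next button only changes the DOM inside the Selenium browser. One possible way around this is to read the HTML from the browser rather than from Scrapy's response. The sketch below is untested against the real site. The helper names `iter_page_sources` and `parse_restaurants` are made up for illustration, and the XPath expressions are copied from the question as they are.

```python
from time import sleep

NEXT_LINK_XPATH = '//a[text()="Next "]'  # locator copied from the question


def iter_page_sources(driver, max_pages=4, wait_seconds=11.0):
    """Yield the rendered HTML of each results page, clicking Next in between.

    `driver` is a Selenium WebDriver that has already loaded the first page.
    """
    for page_number in range(1, max_pages + 1):
        # page_source should reflect the DOM after the XHR-driven update,
        # which Scrapy's own downloader never sees.
        yield driver.page_source
        if page_number == max_pages:
            break
        # "xpath" is the value of Selenium's By.XPATH constant.
        next_links = driver.find_elements("xpath", NEXT_LINK_XPATH)
        if not next_links:
            break  # no Next link found: assume this was the last page
        next_links[0].click()
        sleep(wait_seconds)  # crude fixed wait for the new listings to load


def parse_restaurants(html):
    """Extract Name/Location/Rating dicts from one page of rendered HTML."""
    # Imported here so the pagination helper can be tried without Scrapy.
    from scrapy.selector import Selector

    for card in Selector(text=html).xpath('//*[@class="cardBg"]'):
        yield {
            'Name': card.xpath('.//*[@class="titleDiv"]/h4/a/text()').get(),
            'Location': card.xpath('.//*[@class="location"]/a/text()').getall(),
            'Rating': card.xpath('.//*[@class="rating rating-5"]/a/span/text()').get(),
        }
```

With these helpers, `start_requests` would yield a single `Request`, and `parse` could ignore `response` and loop with `for html in iter_page_sources(self.driver): yield from parse_restaurants(html)`, quitting the driver afterwards. The generic `find_elements(by, value)` call is used because newer Selenium releases no longer provide `find_element_by_xpath`.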


0 answers

No answers yet.
