Scrapy redirects to 127.0.0.1

import scrapy

class BboSpider(scrapy.Spider):
    name = "bbo"
    allowed_domains = ["bridgebase.com"]
    start_urls = [
        "http://www.bridgebase.com/vugraph/schedule.php"
    ]

    # rules for parsing main response
    def parse(self, response):
        filename = 'test.html'
        with open(filename, 'wb') as f:
            f.write(response.body)

1 answer

User

#1 · Posted 2024-09-28 17:29:05

You have to send a User-Agent header so that the request looks like it comes from a real browser.

You can do this directly in the spider by passing a headers dictionary when yielding a scrapy.Request from start_requests():

import scrapy

class BboSpider(scrapy.Spider):
    name = "bbo"
    allowed_domains = ["bridgebase.com"]

    def start_requests(self):
        yield scrapy.Request("http://www.bridgebase.com/vugraph/schedule.php", headers={
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36"
        })

    # rules for parsing main response
    def parse(self, response):
        filename = 'test.html'
        with open(filename, 'wb') as f:
            f.write(response.body)

Alternatively, you can set the USER_AGENT project setting.
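
A minimal sketch of the project-setting route, assuming a standard Scrapy project with a settings.py file. The User-Agent string is simply reused from the spider example above, and any current browser string should work equally well:

```python
# settings.py (assumed location: the Scrapy project's settings module)

# Scrapy sends this value as the User-Agent header on every request
# that does not set its own User-Agent explicitly.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/47.0.2526.111 Safari/537.36"
)
```

With this in place the spider can go back to using start_urls instead of start_requests(). If only this one spider needs the header, the same key can probably go into the spider's custom_settings dictionary instead of the project-wide file.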
