Setting a mock location in request headers

Posted 2024-10-01 09:30:24

Male | A programmer who enjoys writing Python code.

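I'm trying to scrape Walmart pages for a project. I can successfully retrieve the large JSON dataset embedded in their pages. However, I noticed that although I'm in the Midwest, the location defaults to California. How can I set a mock location in my code? I've read about the User-Agent header, but that only seems to be about "pretending" to be a Chrome browser. I noticed that once I added Chrome headers, Walmart stopped treating me as a bot and gave me a result based on my physical location. That's a great first step, but I'd like to change the location by latitude and longitude, or by some other method.

I looked through the HTTP header fields but I'm not sure I found a way to set it. I may also be looking at this the wrong way, since I'm new to scraping.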

main.py

import json
import requests
from bs4 import BeautifulSoup

def parseWalmartProductPage(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36',
    }
    # a timeout keeps the call from hanging; raise_for_status surfaces HTTP errors
    r = requests.get(url, headers=headers, timeout=30)
    r.raise_for_status()

    # parse the html
    soup = BeautifulSoup(r.content, 'html.parser')

    # find the script tag used to lazy load the page
    scriptJSON = soup.find('script', attrs={'id': 'item'})
    if scriptJSON is None:
        raise ValueError("no <script id='item'> tag found in the page")

    # get text from the script tag and return it as parsed JSON
    jsonText = scriptJSON.get_text()
    return json.loads(jsonText)

print(parseWalmartProductPage("https://www.walmart.com/ip/Great-Value-Whole-Milk-1-Gallon-128-Fl-Oz/10450114"))
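One possible direction, offered as a sketch rather than a confirmed solution: as far as I know, no standard HTTP request header carries latitude and longitude. Sites usually infer location from the client IP address, or from a cookie saved after the visitor picks a store or ZIP code. The example below assumes (unverified) that the site keeps such a preference in a cookie. The cookie name `preferred_zip`, its value, and the helper `build_request_kwargs` are all hypothetical; the real cookie names would have to be read from the browser's developer tools after choosing a location on the site.

```python
# Hypothetical cookie name and value, for illustration only. Look up the real
# ones in your browser's developer tools after choosing a store on the site.
LOCATION_COOKIES = {'preferred_zip': '60601'}

# Same User-Agent string as in the question's code.
CHROME_UA = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36'
)


def build_request_kwargs(location_cookies, user_agent=CHROME_UA):
    """Build keyword arguments for requests.get() that carry a location cookie."""
    return {
        'headers': {'User-Agent': user_agent},
        'cookies': dict(location_cookies),  # copy so the caller's dict is untouched
        'timeout': 30,
    }


kwargs = build_request_kwargs(LOCATION_COOKIES)
print(kwargs['cookies'])

# Usage with the question's code (this performs a real network request):
# import requests
# r = requests.get(url, **kwargs)
```

If the location turns out to be derived from the IP address rather than a cookie, the `proxies` argument of `requests.get` with a proxy located in the target region would be the analogous thing to try.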


0 answers

No answers yet
