Prices missing from web scrape

Published 2024-10-04 01:36:35


I got some help earlier, but I still need a bit more.

I am able to pull some information with the script below, but the prices are missing. The site asks for a postal code (say B3K 1X2); once I enter it on the site, I can see the product prices. I keep that page open and open a new page, which should then give me the correct product pricing. When I run the code I get the output below; the "None" should be the price. I increased the wait to 60 seconds to allow the page to load all items. Am I missing something?

SKU 025/BR258L40PER100(RH), Brand Cambro, Unit of Measure EA None

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

import requests
from bs4 import BeautifulSoup
import time

driver = webdriver.Chrome(executable_path=r"D:\chromedriver\chromedriver.exe")
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get("https://www.russellhendrix.com/category/185/cooking-equipment?pagesize=600")

wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.entity-product-price-wrap.grid-item-price-wrap"))).click()

wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "input#gv-postcalcode"))).send_keys("B3K 1X2")

wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a.gv-red-btn.gv-set-postal"))).click()

wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR, "a.gv-red-btn.gv-set-postal")))

time.sleep(60)  # delays start of scrape for 60 seconds so the page can finish loading

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
}

baseurl = 'https://www.russellhendrix.com'
url = 'https://www.russellhendrix.com/category/185/cooking-equipment'

r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, 'lxml')

productlist = soup.find_all('div', class_='entity-product-image-wrap')

productlinks = []

for item in productlist:
    for link in item.find_all('a', href=True):
        productlinks.append(baseurl + link['href'])

for link in productlinks:
    r = requests.get(link, headers=headers)

    soup = BeautifulSoup(r.content, 'lxml')

    skunumber = soup.find('table', class_='product-details-table').text
    pricing = soup.find('div', class_='regPriceValue')

    print(skunumber, pricing)
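For what it's worth, one likely reason the price comes back as None is that `requests.get` starts a brand-new session that never saw the postal code entered in the Selenium browser, so the price element is simply absent from the HTML it receives; parsing `driver.page_source` instead keeps the scrape inside the browser session. A minimal sketch of that idea (the helper name and the sample HTML fragment are mine for illustration; the class names come from the code above):

```python
from bs4 import BeautifulSoup

def extract_product_info(html):
    """Parse the SKU details table and the price out of a rendered product page.

    Guards against missing elements so a product with no price yields None
    instead of raising AttributeError on .text.
    """
    soup = BeautifulSoup(html, "html.parser")
    sku_table = soup.find("table", class_="product-details-table")
    price_div = soup.find("div", class_="regPriceValue")
    sku = sku_table.get_text(" ", strip=True) if sku_table else None
    price = price_div.get_text(strip=True) if price_div else None
    return sku, price

# In the Selenium flow, feed in the browser-rendered HTML (which has the
# postal-code cookie applied) instead of a fresh requests.get response:
#   driver.get(link)
#   sku, price = extract_product_info(driver.page_source)

# Quick check on a made-up snippet:
sample = ('<table class="product-details-table"><tr><td>SKU 123</td></tr></table>'
          '<div class="regPriceValue">$45.99</div>')
print(extract_product_info(sample))
```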


Individual product page: https://www.russellhendrix.com/product/15388/cambro-camrack-base-rack-40p-172107a-br258l40per100rh



1 Answer

Posted 2024-10-04 01:36:35

As I can see from manually opening several of the product links, the same way you do with

for link in productlinks:

there are several products that simply have no price.
No element matches the ('div', class_='gv-price') locator, so soup.find('div', class_='gv-price') returns None because the element cannot be found.
UPD
Here is the Selenium code that enters the "B3K 1X2" postal code so product prices are shown:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(executable_path = driver_path)
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get(url)

wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.entity-product-price-wrap.grid-item-price-wrap"))).click()

wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "input#gv-postcalcode"))).send_keys("B3K 1X2")

wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a.gv-red-btn.gv-set-postal"))).click()

wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR, "a.gv-red-btn.gv-set-postal")))

From here you can carry on with your scraping.
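To keep the postal-code state, the link collection can also run on the browser-rendered HTML (`driver.page_source`) rather than a fresh `requests` fetch. A sketch of that step, with the listing-page selectors from the question; the function name and the sample fragment are illustrative only:

```python
from bs4 import BeautifulSoup

def collect_product_links(html, baseurl="https://www.russellhendrix.com"):
    """Pull absolute product URLs out of a rendered category page."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for item in soup.find_all("div", class_="entity-product-image-wrap"):
        for a in item.find_all("a", href=True):
            links.append(baseurl + a["href"])
    return links

# After the postal-code steps above, parse the page the browser already has:
#   productlinks = collect_product_links(driver.page_source)

# Hypothetical fragment for illustration:
sample = ('<div class="entity-product-image-wrap">'
          '<a href="/product/15388/cambro-camrack">x</a></div>')
print(collect_product_links(sample))
```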
