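This is my complete code. I want to write the output data, such as title and price, to a CSV file, with each field in its own column of the CSV or Excel spreadsheet. My code visits each product's details page and collects the necessary information, such as the product name and price.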
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup
#argument for incognito Chrome
option = Options()
option.add_argument("--incognito")
browser = webdriver.Chrome(options=option)
browser.get("https://www.daraz.com.bd/consumer-electronics/?spm=a2a0e.pdp.breadcrumb.1.4d20110bzkC0bn")
# Wait 20 seconds for page to load
timeout = 20
try:
    WebDriverWait(browser, timeout).until(
        EC.visibility_of_element_located((By.XPATH, "//div[@class='c16H9d']"))
    )
except TimeoutException:
    print("Timed out waiting for page to load")
    browser.quit()
    # Stop here: the browser is closed, so the scraping below cannot run
    raise SystemExit(1)

# Get the link of each product
soup = BeautifulSoup(browser.page_source, "html.parser")
product_items = soup.find_all("div", attrs={"data-qa-locator": "product-item"})
for item in product_items:
    item_url = f"https:{item.find('a')['href']}"
    print(item_url)
    browser.get(item_url)

    # Scrape the details page information
    itm_soup = BeautifulSoup(browser.page_source, "html.parser")
    container_box = itm_soup.find_all("div", {"id": "container"})
    # Use itm_soup to find details about the item from its URL
    for itm in container_box:
        product_title_element = itm.find("span", class_="pdp-mod-product-badge-title")
        product_title = product_title_element.get_text() if product_title_element else "No title"
        print(product_title)

# Close the browser once every product page has been visited
browser.quit()
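How can I get the product title into a CSV or Excel spreadsheet?
You can do this with the writer from Python's built-in csv module (csv.writer).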
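A minimal sketch of that suggestion, assuming the scraped values are collected into a list of (title, price) pairs before writing. The function name `save_products`, the file name `products.csv`, and the sample rows are hypothetical; the question's code does not show how the price is extracted, so the price values here are placeholders.

```python
import csv


def save_products(rows, path="products.csv"):
    """Write (title, price) pairs to a CSV file, one column per field."""
    # newline="" stops the csv module from emitting blank lines on Windows;
    # utf-8-sig adds a BOM, which helps Excel detect the encoding
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "price"])  # header row
        writer.writerows(rows)               # one line per product


# Placeholder data standing in for the scraped values (hypothetical)
sample_rows = [
    ("Example Headphones", "1,250"),
    ("Example Speaker", "3,400"),
]
```

In the scraping loop, something like `rows.append((product_title, product_price))` could replace `print(product_title)`, followed by a single `save_products(rows)` call after `browser.quit()`. The resulting file opens in Excel with the title and price in separate columns.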