How do I export Selenium data to Excel or CSV?

Posted 2024-09-21 03:29:19


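This is my full code. I want to write the scraped data, such as title and price, to a CSV file or Excel spreadsheet, with each field in its own column. My code visits each product's detail page and collects the necessary information, such as the product name and price.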

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup

#argument for incognito Chrome
option = Options()
option.add_argument("--incognito")


browser = webdriver.Chrome(options=option)

browser.get("https://www.daraz.com.bd/consumer-electronics/?spm=a2a0e.pdp.breadcrumb.1.4d20110bzkC0bn")

# Wait up to 20 seconds for the product list to become visible
timeout = 20
try:
    WebDriverWait(browser, timeout).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='c16H9d']")))
except TimeoutException:
    print("Timed out waiting for page to load")
    browser.quit()
    raise SystemExit(1)  # stop here: the browser is closed, so the code below cannot run

# Get the link of each product
soup = BeautifulSoup(browser.page_source, "html.parser")

product_items = soup.find_all("div", attrs={"data-qa-locator": "product-item"})
for item in product_items:
    item_url = f"https:{item.find('a')['href']}"
    print(item_url)

    browser.get(item_url)

    # Scrape the details page information
    itm_soup = BeautifulSoup(browser.page_source, "html.parser")
    container_box = itm_soup.find_all("div",{"id":"container"})
    # Use the itm_soup to find details about the item from its url.
    for itm in container_box:
        product_title_element = itm.find("span",class_="pdp-mod-product-badge-title")
        product_title = product_title_element.get_text() if product_title_element else "No title"
        print(product_title)

browser.quit()

How can I get the product titles into a CSV file or Excel spreadsheet?


1 answer
User
#1 · Posted 2024-09-21 03:29:19

You can do this with the writer from Python's csv module:

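from csv import writer

def AddToCSV(row):
    # Append one row to Output.csv; newline='' lets the csv module handle line endings
    with open("Output.csv", "a+", newline='', encoding="utf-8") as output_file:
        csv_writer = writer(output_file)
        csv_writer.writerow(row)

# Call this inside your for loop, once per product
row_list = [item_url, product_title]  # append price and any other fields you scrape
AddToCSV(row_list)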
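
The helper above reopens the file for every row and never writes a header. If you would rather have named columns, one alternative is to collect the rows in a list during the loop and write them all at once with csv.DictWriter. This is only a sketch: the function name write_products_csv, the file name products.csv, the column names url and title, and the sample rows are all made up for illustration. In the real script you would append {"url": item_url, "title": product_title} inside the loop instead.

```python
import csv


def write_products_csv(path, products, fieldnames=("url", "title")):
    """Write a list of product dicts to `path`, with a header row first.

    Hypothetical helper: the name and the default column names are
    assumptions, not part of the original script.
    """
    # newline="" lets the csv module control line endings.
    # utf-8-sig adds a BOM so Excel detects the file as UTF-8.
    with open(path, "w", newline="", encoding="utf-8-sig") as output_file:
        csv_writer = csv.DictWriter(output_file, fieldnames=fieldnames)
        csv_writer.writeheader()
        csv_writer.writerows(products)
    return len(products)


# Placeholder rows standing in for what the scraping loop would collect.
products = [
    {"url": "https://example.com/item-1", "title": "Sample item 1"},
    {"url": "https://example.com/item-2", "title": "Sample item, with comma"},
]
write_products_csv("products.csv", products)
```

Excel can open the resulting products.csv directly, and each field lands in its own column. Values that contain commas are quoted by the csv module automatically.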
