Scraping multiple pages and putting the results into one CSV

Published 2024-09-28 05:27:15


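I'm new to programming and started with Python. I use it to scrape data from websites and online stores. I want to scrape every page of the paginated results and put the resulting URLs into a single CSV.

This is what I have been trying: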

import selenium
import bs4
from selenium import webdriver
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

myurl = 'https://www.tokopedia.com/p/rumah-tangga/alat-pertukangan/obeng?keyword=obeng&page='
chrome_path = '/home/yoga/Downloads/chromedriver'
driver = webdriver.Chrome(chrome_path)

#opening webpage
for number in range(10):
    buka = driver.get(myurl + str(number))

page_source = driver.page_source
soup_this = soup(page_source, "html.parser")
product_links = soup_this.findAll("div",{"class":"product-summary"})

for number2 in range(10):
    filename = "tokopedia" + str(number2) + ".csv"
f = open(filename, "w")
headers = "Link" + "\n"
f.write(headers)

for product in product_links:
    barang = product.a["ng-href"]
    print(barang + "\n")
    f.write(barang + "\n")

f.close()
driver.close()

The CSV I get contains results from only one page. Can you help me?


1 answer
User
#1 · Posted 2024-09-28 05:27:15
from selenium import webdriver
from bs4 import BeautifulSoup as soup

myurl = 'https://www.tokopedia.com/p/rumah-tangga/alat-pertukangan/obeng?keyword=obeng&page='
chrome_path = '/home/yoga/Downloads/chromedriver'
# Selenium 3 style call; newer Selenium releases expect the driver path via a Service object
driver = webdriver.Chrome(chrome_path)

# Open one output file before the loop so every page goes into the same CSV
filename = "tokopedia.csv"
f = open(filename, "w")
# Write the header once, not once per page
headers = "Link" + "\n"
f.write(headers)

# Load each results page, then parse and save it inside the same iteration
for number in range(10):
    driver.get(myurl + str(number))

    page_source = driver.page_source
    soup_this = soup(page_source, "html.parser")
    product_links = soup_this.findAll("div", {"class": "product-summary"})

    for product in product_links:
        barang = product.a["ng-href"]
        print(barang)
        f.write(barang + "\n")

f.close()
driver.quit()
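
The key change is that the parsing and writing happen inside the page loop and that a single file stays open for the whole run. As a rough sketch of the writing part only, here is the same idea with the standard `csv` module. The function name `save_links` and its parameters are hypothetical and not part of the answer above; `pages` is assumed to be an iterable that yields one list of links per scraped page.

```python
import csv


def save_links(pages, filename):
    """Write the links from every page into one CSV file.

    `pages` is assumed to yield one list of link strings per page.
    Returns the number of links written.
    """
    count = 0
    # newline="" lets the csv module control line endings itself
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Header row is written once, before any page is processed
        writer.writerow(["Link"])
        for links in pages:
            for link in links:
                writer.writerow([link])
                count += 1
    return count
```

With the Selenium loop above, each page's list could be built from `product_links` and passed in, for example through a generator that loads and parses one page per iteration. The `with` block also closes the file even if a page fails to parse.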
