Scraping paginated pages that share the same pagination link

2 answers

User

Answer 1 · edited 2024-09-28 03:17:55

You can use Selenium for this. The script below opens the page and goes to the next page:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()

# Navigate to the page
driver.get('https://www.affarsvarlden.se/bors/kurslistor/stockholm-large/kurs/')

# XPath of the "next page" button
# (find_element_by_xpath was removed in Selenium 4; use find_element with By.XPATH)
next_button = driver.find_element(By.XPATH, '//*[@id="canvas"]/div[2]/div/div[2]/div/div/div[3]/div[2]/div/div/div[2]/ul/li[4]/a')

# Clicking the button throws an error the first time, so retry once
try:
    next_button.click()
except Exception:
    next_button.click()

Edit: you need chromedriver.exe in your working directory to use webdriver.
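
The retry around the click above can be pulled into a small helper. This is only a sketch: `click_with_retry` and `FlakyButton` are hypothetical names made up for illustration, not part of Selenium, and the helper assumes nothing more than an object with a `.click()` method. The stand-in button lets the snippet run without a browser.

```python
import time


def click_with_retry(element, attempts=3, delay=0.5):
    """Call element.click(), retrying if it raises.

    `element` is any object with a .click() method, such as a Selenium
    WebElement. Returns the number of attempts that were needed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            element.click()
            return attempt
        except Exception as error:  # broad on purpose, like the answer above
            last_error = error
            if attempt < attempts:
                time.sleep(delay)  # give the page a moment before retrying
    raise last_error


class FlakyButton:
    """Hypothetical stand-in for a real button: fails on the first click only."""

    def __init__(self):
        self.calls = 0

    def click(self):
        self.calls += 1
        if self.calls == 1:
            raise RuntimeError("element not clickable yet")


button = FlakyButton()
attempts_used = click_with_retry(button, attempts=3, delay=0)
print(attempts_used)  # 2
```

With a real driver you would pass `next_button` instead of the stand-in. If the first click fails because the page is still loading, an explicit wait (Selenium's `WebDriverWait` with `expected_conditions.element_to_be_clickable`) is probably a cleaner fix than retrying blindly.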

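User

Answer 2 · edited 2024-09-28 03:17:55

As far as I can tell, all of the data is already embedded in the page when you request it, so there is no need to click through the pages. You can try this: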

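import json

import requests
from bs4 import BeautifulSoup
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated

url = 'https://www.affarsvarlden.se/bors/kurslistor/stockholm-large/kurs/'
resp = requests.get(url)
soup = BeautifulSoup(resp.text, 'html.parser')

# Look for the <script> tag that holds the embedded page state
for tag in soup.find_all('script'):
    # .string is None for empty or external scripts, so fall back to ''
    content = tag.string or ''

    if '__INITIAL_STATE__' not in content:
        continue

    # The JSON object starts at the first opening brace
    index = content.find('{')
    data = json.loads(content[index:])
    # One row per entry in the 'info' list
    df = json_normalize(data['stocklist']['stockholm-large/kurs/'], 'info')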
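
If `json.loads` fails because something follows the JSON object in the script (a trailing semicolon, for example), `json.JSONDecoder.raw_decode` can parse just the object and ignore the rest. The sketch below assumes the script looks roughly like `window.__INITIAL_STATE__ = {...};`. The sample string and its values are invented, and `extract_initial_state` is a hypothetical helper name; the real page may be laid out differently.

```python
import json

MARKER = "__INITIAL_STATE__"


def extract_initial_state(script_text):
    """Return the first JSON object that follows MARKER, or None if absent."""
    marker_pos = script_text.find(MARKER)
    if marker_pos == -1:
        return None
    start = script_text.find("{", marker_pos)
    if start == -1:
        return None
    # raw_decode parses one JSON value beginning at `start` and ignores
    # whatever comes after it (for example a trailing semicolon).
    data, _end = json.JSONDecoder().raw_decode(script_text, start)
    return data


# Made-up script content, only shaped like the data the answer above reads.
sample = (
    'window.__INITIAL_STATE__ = {"stocklist": {"stockholm-large/kurs/": '
    '{"info": [{"name": "AAA", "last": 10.5}, {"name": "BBB", "last": 20.0}]}}};'
)

state = extract_initial_state(sample)
rows = state["stocklist"]["stockholm-large/kurs/"]["info"]
names = [row["name"] for row in rows]
print(names)  # ['AAA', 'BBB']
```

In the loop above, `extract_initial_state(content)` could stand in for the `find('{')` and `json.loads` pair, with the result passed to `json_normalize` as before.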
