Python BeautifulSoup: extracting a value from a div class

import requests as rq
from bs4 import BeautifulSoup as bs

url = "https://news.guidants.com/#Ticker/Profil/?i=133962&e=74"
response = rq.get(url)
soup = bs(response.text, "lxml")
price = soup.find_all("div", {"class":"left"})[0].find("span")
print(price["data-bg_quotepush_c"])

2 answers

User

#1 · edited 2024-09-30 16:35:00

If you are scraping the value from the div class, try the following example:

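from selenium import webdriver
from bs4 import BeautifulSoup

# Selenium 4.6+ locates a matching driver automatically; on older versions,
# pass the path to your chromedriver instead
driver = webdriver.Chrome()

# URL of the page to scrape
url = 'https://news.guidants.com/#Ticker/Profil/?i=133962&e=74'

driver.get(url)

# parse the page as rendered by the browser
soup = BeautifulSoup(driver.page_source, "html5lib")

# find every <div class="left">
prices = soup.find_all("div", attrs={"class": "left"})

for price in prices:
    total_price = price.find('span')
    # not every matching div contains a span
    if total_price is not None:
        print(total_price.text)

# close the browser and end the driver session
driver.quit()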

If you are using the requests module, try a different parser. You can install one with pip, for example html5lib:

pip install html5lib
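
As a rough sketch of what "trying a different parser" can look like in code, the helper below attempts several parsers in turn and falls back to the one bundled with Python. The name make_soup, the PARSERS tuple, and the sample markup are made up for illustration; they are not part of the question.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Parsers to try, in order of preference; "html.parser" ships with Python.
PARSERS = ("lxml", "html5lib", "html.parser")


def make_soup(html: str) -> BeautifulSoup:
    """Parse html with the first parser in PARSERS that is installed."""
    for parser in PARSERS:
        try:
            return BeautifulSoup(html, parser)
        except FeatureNotFound:
            # This parser is not installed; try the next one.
            continue
    raise RuntimeError("no HTML parser available")


# Hypothetical sample markup, shaped like the element in the question.
soup = make_soup('<div class="left"><span>13.599,24</span></div>')
print(soup.find("div", class_="left").find("span").text)
```

Switching parsers only changes how the downloaded HTML is interpreted; it cannot add elements that were never in the response.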

Thanks

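User

#2 · edited 2024-09-30 16:35:00

If the content is generated dynamically, use Selenium instead of requests

What happened?

Fetching the site with requests only returns the initial content. That content does not include the dynamically generated information, so the element you are looking for is not there.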
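
To illustrate the difference, here is a small sketch that runs the same lookup against two HTML strings. Both strings are hypothetical stand-ins (one for what a plain HTTP request might return, one for the DOM after JavaScript has run), and extract_price is a made-up helper name. With the first string the question's find_all(...)[0] would raise an IndexError, so this version returns None instead.

```python
from typing import Optional

from bs4 import BeautifulSoup

# Hypothetical markup, for illustration only: what a plain HTTP request
# might return before any JavaScript has run ...
INITIAL_HTML = (
    '<html><body><div id="app"></div>'
    '<script src="app.js"></script></body></html>'
)
# ... and what the browser's DOM might contain after rendering.
RENDERED_HTML = (
    '<html><body><div id="app"><div class="left">'
    '<span class="quote">13.599,24</span></div></div></body></html>'
)


def extract_price(html: str) -> Optional[str]:
    """Return the text of the first span in the first div.left, or None."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", {"class": "left"})
    span = div.find("span") if div else None
    return span.text if span else None


print(extract_price(INITIAL_HTML))   # None
print(extract_price(RENDERED_HTML))  # 13.599,24
```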

To wait until the site has fully loaded, use Selenium with sleep() as the simple approach, or Selenium waits as the more advanced one.

Avoiding the error

Use price.text to get the text of the element, which looks like this:

<span class="quote quote_standard" data-bg_quotepush="quote" data-bg_quotepush_c="40" data-bg_quotepush_f="quote" data-bg_quotepush_i="133962:74:bid">13.599,24</span>
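
The snippet below parses exactly that span to show the difference between reading an attribute and reading the text. It uses the built-in "html.parser" only so that it runs without extra packages; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = (
    '<span class="quote quote_standard" data-bg_quotepush="quote" '
    'data-bg_quotepush_c="40" data-bg_quotepush_f="quote" '
    'data-bg_quotepush_i="133962:74:bid">13.599,24</span>'
)

span = BeautifulSoup(html, "html.parser").find("span")

# Indexing the tag reads an attribute, which is what the question's code did.
attribute_value = span["data-bg_quotepush_c"]
# .text reads the content between the opening and closing tags.
text_value = span.text

print(attribute_value)  # 40
print(text_value)       # 13.599,24
```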

Example

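import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup

url = "https://news.guidants.com/#Ticker/Profil/?i=133962&e=74"

# Selenium 4 takes the driver path through a Service object
service = Service(r'C:\Program Files\ChromeDriver\chromedriver.exe')
driver = webdriver.Chrome(service=service)
driver.get(url)
# implicitly_wait() only affects element lookups, so pause explicitly
# to give the page time to render before reading page_source
time.sleep(3)

soup = BeautifulSoup(driver.page_source, "html5lib")
price = soup.find_all("div", {"class": "left"})[0].find("span")
print(price.text)
driver.quit()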

Output

13.599,24
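
The printed value is a string in German number format. If a number is needed, one possible conversion is sketched below; parse_german_number is a hypothetical helper, and it assumes "." is always the thousands separator and "," the decimal separator.

```python
from decimal import Decimal


def parse_german_number(text: str) -> Decimal:
    """Convert a string such as '13.599,24' to Decimal('13599.24')."""
    # Assumes '.' is the thousands separator and ',' the decimal separator.
    normalized = text.strip().replace(".", "").replace(",", ".")
    return Decimal(normalized)


print(parse_german_number("13.599,24"))  # 13599.24
```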
