BeautifulSoup cannot find an href by 'class'

Posted 2024-10-05 13:20:54


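This is the page I am trying to scrape: https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2

I want to scrape the href values from the listbox next to the Token label.

I need to get them from the element with this class: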

class="link-hover d-flex justify-content-between align-items-center"

So my code is:

import requests
from bs4 import BeautifulSoup

page = requests.get('https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2').text
html = BeautifulSoup(page, 'html.parser')

href = html.find(class_ = 'link-hover d-flex justify-content-between align-items-center')['href']

However, nothing is found. Can anyone help me?


1 answer
Answer #1 · posted 2024-10-05 13:20:54

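The element you are interested in is rendered by JavaScript. The HTML that requests receives therefore does not contain it, and you need some browser automation software to run the JavaScript and obtain the complete HTML.

Note: you could use requests-html, which supports JavaScript rendering. However, it uses browser automation software under the hood itself, so in my opinion it is better to cut out the "middleman".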
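A quick way to confirm this diagnosis is to run the same lookup against the HTML that was actually downloaded and handle the case where nothing matches. The sketch below is illustrative only: `find_href` is a hypothetical helper, and the two HTML strings are made-up stand-ins for the static response and the rendered page, not real Etherscan markup.

```python
from bs4 import BeautifulSoup

TARGET_CLASS = "link-hover d-flex justify-content-between align-items-center"


def find_href(html_text, class_string=TARGET_CLASS):
    """Return the href of the first element whose class attribute matches, or None."""
    soup = BeautifulSoup(html_text, "html.parser")
    tag = soup.find(class_=class_string)
    if tag is None:
        # find() returns None when nothing matches; indexing it with
        # ['href'] directly would raise a TypeError.
        return None
    return tag.get("href")


# Hypothetical stand-in for what requests receives: the dropdown is still empty.
static_html = '<div id="availableBalanceDropdown"></div>'

# Hypothetical stand-in for the page after JavaScript has filled the dropdown.
rendered_html = (
    '<div id="availableBalanceDropdown">'
    '<a class="link-hover d-flex justify-content-between align-items-center" '
    'href="/token/0xabc">Token</a>'
    "</div>"
)

print(find_href(static_html))    # None
print(find_href(rendered_html))  # /token/0xabc
```

If the lookup returns None for the text returned by requests but the element is visible in the browser's inspector, the element is most likely being added by JavaScript.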


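Selenium

import bs4
from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Firefox()
browser.get("https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2")
# Open the token dropdown so that its entries are rendered into the DOM
elem = browser.find_element(By.ID, "availableBalanceDropdown")
elem.click()
# page_source holds the HTML after JavaScript has run
soup = bs4.BeautifulSoup(browser.page_source, features="html.parser")
browser.quit()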

Playwright

import bs4
from playwright.sync_api import sync_playwright

with sync_playwright() as play:
    browser = play.chromium.launch()
    page = browser.new_page()
    page.goto("https://etherscan.io/address/0xCcE984c41630878b91E20c416dA3F308855E87E2")
    # Open the token dropdown so that its entries are rendered into the DOM
    page.click("#availableBalanceDropdown")
    soup = bs4.BeautifulSoup(page.content(), features="html.parser")
    # Playwright browsers are closed with close(), not quit()
    browser.close()

Once you have the bs4.BeautifulSoup object, all that remains is to pick out the elements with a CSS selector:

import bs4

soup = bs4.BeautifulSoup(...)    # From above examples
elems = soup.select(".link-hover.d-flex.justify-content-between.align-items-center")
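To get from the selected elements to the links themselves, read each element's href attribute. Here is a minimal sketch, assuming the matches are anchor tags with relative hrefs; `sample_html` and `BASE_URL` are hypothetical placeholders for illustration, not the real page markup.

```python
from urllib.parse import urljoin

import bs4

# Assumed base for resolving relative links (hypothetical for this sketch).
BASE_URL = "https://etherscan.io"

# Made-up markup standing in for the rendered dropdown.
sample_html = """
<ul>
  <li><a class="link-hover d-flex justify-content-between align-items-center"
         href="/token/0x111?a=0xabc">Token A</a></li>
  <li><a class="link-hover d-flex justify-content-between align-items-center"
         href="/token/0x222?a=0xabc">Token B</a></li>
  <li><a class="link-hover" href="/other">Other</a></li>
</ul>
"""

soup = bs4.BeautifulSoup(sample_html, features="html.parser")

# Chaining the classes with dots requires all four to be present, in any order.
elems = soup.select(".link-hover.d-flex.justify-content-between.align-items-center")

# Skip matches without an href and turn relative links into absolute URLs.
hrefs = [urljoin(BASE_URL, e["href"]) for e in elems if e.has_attr("href")]
print(hrefs)
```

The third link is left out because it carries only one of the four classes.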
