Scraping a table from a website with requests and pandas

import requests
import pandas as pd
from bs4 import BeautifulSoup

res = requests.get("https://coinmunity.co/")
soup = BeautifulSoup(res.content, 'lxml')
table = soup.find_all('table')[0]
dfm = pd.read_html(str(table), header=0)
dfm = dfm[0].dropna(axis=0, thresh=4)
dfm.head()

1 Answer

User

#1 · Posted 2024-09-29 19:24:58

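Because the content is loaded dynamically after the initial request is made, you won't be able to get this data with requests alone. Here is what I would do: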

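from bs4 import BeautifulSoup
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Selenium drives a real browser, so the page's JavaScript runs and fills in the table.
driver = webdriver.Firefox()
try:
    driver.get("https://coinmunity.co/")
    # implicitly_wait() only affects element lookups, not page_source, so wait
    # explicitly (up to 10 seconds) until at least one table cell has been rendered.
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "tr td"))
    )
    html = driver.page_source
finally:
    driver.quit()  # close the browser even if the wait times out

soup = BeautifulSoup(html, 'lxml')

results = []
# The first two <tr> elements are skipped (presumably header rows).
for row in soup.find_all('tr')[2:]:
    data = row.find_all('td')
    name = data[1].find('a').text
    value = data[2].find('p').text
    # get the rest of the data you need about each coin here, then add it to the dictionary that you append to results
    results.append({'name': name, 'value': value})

df = pd.DataFrame(results)

df.head()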

   name   value
0  NULS  14,005
1   VEN  84,486
2   EDO  20,052
3  CLUB   1,996
4   HSR   8,433
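
Note that the values in this output are still text, because they were read from the HTML as strings. If you want to sort or do arithmetic on them, they need to be converted first. The following is a minimal sketch, assuming every value is a whole number that uses a comma as the thousands separator (which is all the five rows above show). `parse_value` is a hypothetical helper name, not part of pandas or BeautifulSoup.

```python
def parse_value(text: str) -> int:
    """Convert a string such as '14,005' to the integer 14005.

    Assumption: the text is a whole number with ',' as thousands separator.
    """
    return int(text.strip().replace(",", ""))


# Values copied from the sample output above.
raw_values = ["14,005", "84,486", "20,052", "1,996", "8,433"]
parsed = [parse_value(v) for v in raw_values]
print(parsed)  # [14005, 84486, 20052, 1996, 8433]
```

With the DataFrame built above, the same helper could be applied to the whole column with `df['value'] = df['value'].map(parse_value)`.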

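You need to make sure geckodriver is installed and that it is on your PATH. I only grabbed each coin's name and value as a rough example, but getting the rest of the information should be easy.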
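
One way to get the remaining columns without writing a lookup per cell is to hand the rendered HTML to `pd.read_html`, as the question originally tried to do with the static response. This is only a sketch: `SAMPLE_HTML` is a made-up stand-in for `driver.page_source` (the real page's markup will differ and is not reproduced here), its two rows reuse values from the output above, and `tables_from_html` is a hypothetical helper name. It also assumes pandas can find an HTML parser such as lxml.

```python
from io import StringIO

import pandas as pd

# Made-up stand-in for the rendered page; in practice pass driver.page_source.
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>NULS</td><td>14,005</td></tr>
  <tr><td>VEN</td><td>84,486</td></tr>
</table>
"""


def tables_from_html(html: str) -> list:
    """Return one DataFrame per <table> found in the given HTML string."""
    # header=0 uses the first row of each table as the column names.
    return pd.read_html(StringIO(html), header=0)


df_all = tables_from_html(SAMPLE_HTML)[0]
print(df_all)
```

`pd.read_html` treats `,` as a thousands separator by default, so numeric-looking cells should come back as numbers rather than text. If the real table nests extra markup inside its cells, the row-by-row BeautifulSoup loop gives finer control over what is extracted.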
