Web scraping CoinMarketCap

Posted 2024-09-29 17:13:36


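Because the CoinMarketCap API plans limit access to historical data, I am trying to web-scrape the site instead.

However, despite reading the (poor) documentation on attributes, I am stuck at the first hurdle.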

import json 
import requests 
from bs4 import BeautifulSoup


r = requests.get('https://coinmarketcap.com/historical/20210905/')
soup = BeautifulSoup(r.text, 'lxml')
print(soup)

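The output contains the data I am trying to get, namely:

the market cap, price and circulating supply of BTC on 5 September 2021

The data appears in the output shortly after <script id="__NEXT_DATA__" type="application/json">, so I assumed that using __NEXT_DATA__ as the id attribute would let me access it. Unfortunately it did not.

An example of the data structure that holds the data: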

"listingHistorical":{"data":[{"id":1,"name":"Bitcoin","symbol":"BTC","slug":"bitcoin","num_market_pairs":8848,"date_added":"2013-04-28T00:00:00.000Z","tags":["mineable","pow","sha-256","store-of-value","state-channels","coinbase-ventures-portfolio","three-arrows-capital-portfolio","polychain-capital-portfolio","binance-labs-portfolio","arrington-xrp-capital","blockchain-capital-portfolio","boostvc-portfolio","cms-holdings-portfolio","dcg-portfolio","dragonfly-capital-portfolio","electric-capital-portfolio","fabric-ventures-portfolio","framework-ventures","galaxy-digital-portfolio","huobi-capital","alameda-research-portfolio","a16z-portfolio","1confirmation-portfolio","winklevoss-capital","usv-portfolio","placeholder-ventures-portfolio","pantera-capital-portfolio","multicoin-capital-portfolio","paradigm-xzy-screener"],"max_supply":21000000,"circulating_supply":18807550,"total_supply":18807550,"platform":null,"cmc_rank":1,"last_updated":"2021-09-05T23:00:00.000Z","quote":{"BTC":{"price":1,"volume_24h":585906.8067215424,"percent_change_1h":0,"percent_change_24h":0,"percent_change_7d":0,"market_cap":18807550,"fully_diluted_market_cap":null,"last_updated":"2021-09-05T23:59:03.000Z"},"USD":{"price":51753.41192620951,"volume_24h":30322676318.63,"percent_change_1h":-0.159917099159,"percent_change_24h":3.621580803777,"percent_change_7d":5.987281074996,"market_cap":973354882472.7817,"last_updated":"2021-09-05T23:00:00.000Z"}},"rank":1,"noLazyLoad":true},

Is there a simple way to do this?


2 answers

You can try the following:

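import json
import requests
from bs4 import BeautifulSoup

r = requests.get('https://coinmarketcap.com/historical/20210905/')
soup = BeautifulSoup(r.text, 'lxml')

# The tag's type is application/json (as quoted in the question), not application/ld+json
script = soup.find('script', type='application/json', id='__NEXT_DATA__')
data = json.loads(script.string)

# Adjust this path if listingHistorical is nested deeper in the payload
historical_data = data['listingHistorical']['data']
print(historical_data)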
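
To get from the parsed JSON to the three values asked for (price, market cap and circulating supply of BTC), something like the following might work. It is only a sketch: `find_key` and `btc_summary` are hypothetical helpers, not part of any library, and the sample is a trimmed copy of the structure quoted in the question, wrapped in an outer object so that it parses on its own. Because the question does not show how deeply `listingHistorical` is nested inside the `__NEXT_DATA__` payload, the helper searches for the key recursively instead of assuming a fixed path.

```python
import json

# Trimmed copy of the structure quoted in the question; the outer braces
# are added here only so the fragment parses as standalone JSON.
SAMPLE = '''
{"listingHistorical": {"data": [
  {"id": 1, "name": "Bitcoin", "symbol": "BTC",
   "circulating_supply": 18807550,
   "quote": {"USD": {"price": 51753.41192620951,
                     "market_cap": 973354882472.7817}}}
]}}
'''


def find_key(node, key):
    """Return the first value stored under `key` in nested dicts/lists, or None."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def btc_summary(payload):
    """Pick BTC's USD price, USD market cap and circulating supply."""
    listing = find_key(payload, "listingHistorical")
    if listing is None:
        raise KeyError("listingHistorical not found in payload")
    for coin in listing["data"]:
        if coin["symbol"] == "BTC":
            usd = coin["quote"]["USD"]
            return {
                "price": usd["price"],
                "market_cap": usd["market_cap"],
                "circulating_supply": coin["circulating_supply"],
            }
    raise LookupError("BTC not present in listing")


summary = btc_summary(json.loads(SAMPLE))
print(summary)
```

With a real page, `json.loads(SAMPLE)` would be replaced by the dictionary parsed from the `__NEXT_DATA__` script tag.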

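This only works for the listing table, which is already fully loaded on the page.

In https://coinmarketcap.com/historical/20210905/, the 20210905 part is the date 2021-09-05. Replace it with the date you need, for example https://coinmarketcap.com/historical/20210101/, and the page will show the data for that day; then scrape it and extract the JSON data.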
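
The date substitution described above could be sketched as follows. The helper names `historical_url` and `urls_between` are made up for illustration, and the one-day step is an assumption: the answer does not say for which dates a snapshot page actually exists.

```python
from datetime import date, timedelta

BASE_URL = "https://coinmarketcap.com/historical/{}/"


def historical_url(day):
    """Build the snapshot URL for a date, e.g. 2021-09-05 -> .../20210905/."""
    return BASE_URL.format(day.strftime("%Y%m%d"))


def urls_between(start, end, step_days=1):
    """Yield snapshot URLs from start to end (inclusive), step_days apart."""
    current = start
    while current <= end:
        yield historical_url(current)
        current += timedelta(days=step_days)


print(historical_url(date(2021, 9, 5)))
for url in urls_between(date(2021, 1, 1), date(2021, 1, 3)):
    print(url)
```

Each URL can then be passed to `requests.get` in the same way as in the snippet above.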
