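I am trying to extract the "Next 5 Years (per annum)" value for the stock BABA from the "Analysis" tab on Yahoo Finance: https://finance.yahoo.com/quote/BABA/analysis?p=BABA. (It is in the second row from the bottom: 2.85%.)
I have been trying to follow these questions: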
Scrape Yahoo Finance Financial Ratios
Scrape Yahoo Finance Income Statement with Python
but I cannot even extract the data from the page.
I also tried this site:
https://hackernoon.com/scraping-yahoo-finance-data-using-python-ayu3zyl
This is the code I wrote to fetch the page data.
First, import the packages:
import requests
from bs4 import BeautifulSoup
Then try to extract the data from the page:
url = "https://finance.yahoo.com/quote/BABA/analysis?p=BABA"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, features="lxml")
When I check the types of the "data" and "soup" objects, I see:
type(data)
<class 'str'>
From this string I can extract the data I need for the "Next 5 Years" row using regular expressions,
but when you look at
type(soup)
<class 'bs4.BeautifulSoup'>
for some reason the data in it does not seem related to the page.
It looks like this (only part of the soup object's content is copied here):
soup
<!DOCTYPE html>
<html class="NoJs featurephone" id="atomic" lang="en-US"><head prefix="og:
http://ogp.me/ns#"><script>window.performance && window.performance.mark &&
window.performance.mark('PageStart');</script><meta charset="utf-8"/>
<title>Alibaba Group Holding Limited (BABA) Analyst Ratings, Estimates &
Forecasts - Yahoo Finance</title><meta content="recommendation,analyst,analyst
rating,strong buy,strong
sell,hold,buy,sell,overweight,underweight,upgrade,downgrade,price target,EPS
estimate,revenue estimate,growth estimate,p/e
estimate,recommendation,analyst,analyst rating,strong buy,strong
sell,hold,buy,sell,overweight,underweight,upgrade,downgrade,price target,EPS
estimate,revenue estimate,growth estimate,p/e estimate" name="keywords"/>
<meta content="on" http-equiv="x-dns-prefetch-control"/><meta content="on"
property="twitter:dnt"/><meta content="90376669494" property="fb:app_id"/>
<meta content="#400090" name="theme-color"/><meta content="width=device-width,
Thanks in advance.
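One solution is to use a regular expression to extract the value from the JSON data embedded in the page's JavaScript. The JSON data is located in the following variable:
For example: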
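A minimal sketch of that idea, run against a small hypothetical sample rather than the live page. The variable name root.App.main, the JSON layout, and the helper extract_js_json are all assumptions made for illustration; the real page source would need to be inspected to find the actual variable and keys, and it may not embed the data this way at all. The regular expression only locates the assignment, and json's raw_decode then parses exactly one object starting at that position.

```python
import json
import re

# Hypothetical stand-in for the downloaded page text (`data` in the question).
# The variable name and the JSON layout below are assumptions for illustration.
SAMPLE_HTML = """
<html><body><script>
(function (root) {
root.App.main = {"context": {"trend": [{"period": "+5y", "growth": {"raw": 0.0285, "fmt": "2.85%"}}]}};
}(this));
</script></body></html>
"""


def extract_js_json(html, var_name):
    """Return the JSON object assigned to `var_name` inside an inline script."""
    # Locate "<var_name> = " and consume the whitespace before the value.
    match = re.search(re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        raise ValueError(f"variable {var_name!r} not found")
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (here, the trailing ";").
    obj, _end = json.JSONDecoder().raw_decode(html, match.end())
    return obj


page = extract_js_json(SAMPLE_HTML, "root.App.main")
# Pick the entry for the five-year period (key names are hypothetical).
five_year = next(t for t in page["context"]["trend"] if t["period"] == "+5y")
print(five_year["growth"]["fmt"])  # prints 2.85%
```

With the real page, SAMPLE_HTML would be replaced by the data string from the question, and the keys used to walk the parsed object would have to match whatever the page actually contains.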