Scraping earnings dates from the ycharts.com website

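import requests

url = 'https://ycharts.com/companies/AAPL/events/#/?eventTypes=earnings,&pageNum=1'
page = requests.get(url)
page_content = page.content  # raw response body, as bytes

# page.content is bytes, so the file must be opened in binary mode;
# the with block closes the file, so no explicit f.close() is needed
with open('data.txt', 'wb') as f:
    f.write(page_content)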

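2 Answers

User

#1 · edited 2024-10-03 04:38:32

All your script does is fetch the HTML of the web page. You now need to parse that HTML to get the data you want. You can use the lxml library, BeautifulSoup, or even Scrapy for the scraping.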

from lxml import html
import requests

url = 'https://ycharts.com/companies/AAPL/events/#/?eventTypes=earnings,&pageNum=1'

page = requests.get(url)
page_content = page.content

tree = html.fromstring(page_content)
my_xpath = '//th[@class="colDate ng-binding"]/text()'
dates = tree.xpath(my_xpath)

for date in dates:
    print(date)

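At the end, the dates list should contain the dates.

Edit: running this script returns nothing, because requests.get() retrieves the HTML as served, without the changes made by JavaScript, and the table is created and populated by JavaScript.

My answer does not work for this question; it is only a basic web scraping script.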
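One way to see why the static request cannot contain the table: everything after the # in the URL is a fragment, which HTTP clients keep on the client side and do not send to the server. The sketch below only splits the URL from the question with the standard library; it makes no network request, and the variable names are illustrative.

```python
from urllib.parse import urldefrag, urlsplit

url = 'https://ycharts.com/companies/AAPL/events/#/?eventTypes=earnings,&pageNum=1'

parts = urlsplit(url)
# The '?' appears after '#', so it belongs to the fragment, not the query string.
print("path:    ", parts.path)
print("query:   ", repr(parts.query))
print("fragment:", parts.fragment)

# The URL with the fragment removed, i.e. what is actually requested from the server.
requested = urldefrag(url).url
print("requested:", requested)
```

The eventTypes and pageNum settings therefore never reach the server; they are read by the page's JavaScript in the browser, which is consistent with the table being filled in on the client.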

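User

#2 · edited 2024-10-03 04:38:32

To get data from that page you need to use Selenium with Python, because the data on the page is generated dynamically. To fetch the content, you can do the following: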

from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get("https://ycharts.com/companies/AAPL/events/#/?eventTypes=earnings,&pageNum=1")
soup = BeautifulSoup(driver.page_source, "lxml")
driver.quit()
for item in soup.find_all(class_="colDate"):
    print(item.text)

