Scraping the new Yahoo/Oath earnings calendar

Posted 2024-06-30 05:58:31


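I don't understand why the modified Python script below does not work with the new earnings calendar format. It does not seem to match the href, possibly because the new page differs significantly from the old format (dynamic JavaScript?).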

import datetime
import requests
import bs4
import csv

def get_earning_data(date,date2):
    url = "http://finance.yahoo.com/calendar/earnings?&day={}".format(date)
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0"}
    html = requests.get(url, headers=headers).text
    soup = bs4.BeautifulSoup(html, "html.parser")
    quotes = []
    for tr in soup.find_all("tr"):
        if len(tr.contents) > 3:
            if len(tr.contents[1].contents) > 0:
                if tr.contents[1].contents[0].name == "a":
                    if tr.contents[1].contents[0]["href"].startswith("/quote/"):
                        if "." not in tr.contents[1].contents[0].text: 
                            quotes.append(tr.contents[1].contents[0].text)
                            quotes.append(date2)
    return quotes

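outfile = "EarningsCalendar.csv"
# Create (or truncate) the output file before appending to it below.
open(outfile, "w").close()
# Collect earnings for today and the following six days.
for index in range(7):
    day = datetime.date.today() + datetime.timedelta(index)
    date = day.strftime("%Y-%m-%d")    # format used in the request URL
    date2 = day.strftime("%d/%m/%Y")   # format written to the CSV
    mylist = get_earning_data(date, date2)
    print(mylist)
    # Text mode with newline="" is what the csv module expects on Python 3.
    with open(outfile, "a", newline="") as csvfile:
        writer = csv.writer(csvfile, delimiter=",", quoting=csv.QUOTE_NONE)
        # mylist alternates symbol, date; write one pair per row.
        for i in range(0, len(mylist), 2):
            writer.writerow(mylist[i:i+2])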

Here is a sample row from the page source for May 4, 2017:

<tr class="data-rowKMX9 Bgc($extraLightBlue):h H(36px) Bgc($altRowColor)" data-reactid="490"><td class="data-col0 Ta(start) Pend(15px) Pstart(6px) W(10%)" data-reactid="491"><a href="/quote/KMX?p=KMX" title="Carmax Inc" data-symbol="KMX" class="Fw(b)" data-reactid="492">KMX</a></td><td class="data-col1 Ta(start) Pend(10px) W(20%)" data-reactid="493">Carmax Inc</td><td class="data-col2 Ta(end) Pstart(15px) W(10%)" data-reactid="494">0.79</td><td class="data-col3 Ta(end) Pstart(15px) W(10%)" data-reactid="495">-</td><td class="data-col4 Ta(end) Pstart(15px) W(10%)" data-reactid="496"><span class="" data-reactid="497">-</span></td><td class="data-col5 Ta(end) Pend(6px) Pstart(15px) W(13%)" data-reactid="498"><span data-reactid="499">Before Market Open</span></td></tr>

Here is a sample page showing the old format: http://web.archive.org/web/20170301070135/https://biz.yahoo.com/research/earncal/today.html

The only difference I can see between the old and new formats is the prefix passed to .startswith(). Using "http://finance.yahoo.com/quote/" does not work either.
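
One possible explanation, judging only from the sample row above: there is no whitespace between the `<tr>` tag and its first `<td>`, so the ticker link sits in the row's first child. In that row `tr.contents[1]` is the company-name cell, whose first child is plain text rather than an `<a>` tag, so the `.name == "a"` check never passes. Matching links by their `href` prefix, instead of by position inside the row, avoids depending on that layout. The sketch below shows the idea using only the standard library's `html.parser`, so it runs without extra packages. `SAMPLE_ROW`, `QuoteLinkParser`, and `extract_symbols` are hypothetical names, and the row is a shortened copy of the sample, not live data.

```python
from html.parser import HTMLParser

# Trimmed copy of the sample row from the question (class attributes shortened).
SAMPLE_ROW = (
    '<tr class="data-rowKMX9"><td class="data-col0">'
    '<a href="/quote/KMX?p=KMX" title="Carmax Inc" data-symbol="KMX">KMX</a></td>'
    '<td class="data-col1">Carmax Inc</td>'
    '<td class="data-col2">0.79</td></tr>'
)


class QuoteLinkParser(HTMLParser):
    """Collect the text of every <a> whose href starts with /quote/."""

    def __init__(self):
        super().__init__()
        self.symbols = []
        self._in_quote_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            self._in_quote_link = href.startswith("/quote/")
            if self._in_quote_link:
                self.symbols.append("")  # start a new symbol entry

    def handle_data(self, data):
        if self._in_quote_link:
            self.symbols[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_quote_link = False


def extract_symbols(html):
    """Return ticker symbols found in quote links, wherever they sit in a row."""
    parser = QuoteLinkParser()
    parser.feed(html)
    parser.close()
    # Same filter as the original script: skip tickers containing a dot.
    return [s.strip() for s in parser.symbols if s.strip() and "." not in s]


print(extract_symbols(SAMPLE_ROW))
```

With BeautifulSoup the same idea could probably be written as `soup.select('a[href^="/quote/"]')`, which selects the links by `href` prefix regardless of which cell holds them. Whether the live page still returns these rows in the initial HTML, rather than filling them in with JavaScript, is not something the sample row can confirm.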

