Web scraping Yahoo Finance: getting more than 100 days of data

import urllib.request as web
import bs4 as bs
import pandas as pd  # needed for pd.DataFrame below

def yahooPrice(tkr):
    tkr = tkr.upper()
    url = 'https://finance.yahoo.com/quote/' + tkr + '/history?p=' + tkr
    sauce = web.urlopen(url)
    soup = bs.BeautifulSoup(sauce, 'lxml')
    table = soup.find('table')
    table_rows = table.find_all('tr')
    allrows = []
    for tr in table_rows:
        td = tr.find_all('td')
        row = [i.text for i in td]
        # Keep only rows with all 7 price columns.
        if len(row) == 7:
            allrows.append(row)
    # Drop the last collected row.
    vixdf = pd.DataFrame(allrows).iloc[0:-1]
    vixdf.columns = ['Date', 'Open', 'High', 'Low', 'Close', 'Aclose', 'Volume']
    vixdf.set_index('Date', inplace=True)
    return vixdf

3 Answers

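User

#1 · Edited 2024-09-28 05:40:27

I believe the Yahoo Finance API was deprecated in May 2017. As far as I know, there are now not many options for downloading time series data for free. However, there are always some alternatives. Check the URL below for a tool that downloads historical prices.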

http://investexcel.net/multiple-stock-quote-downloader-for-excel/

Also take a look at this.

https://blog.quandl.com/api-for-stock-data

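User

#2 · Edited 2024-09-28 05:40:27

I don't have an exact answer to your question, but I do have a workaround (I had the same problem and used this approach). Basically, you can use the BDay() offset (from pandas.tseries.offsets import BDay) to look back x business days when collecting data. In my case, I ran a loop three times to get 300 business days of data, knowing that 100 is the maximum I get by default.

Basically, you run the loop three times and set the BDay() offset so that the first iteration fetches the most recent 100 days, the next fetches the 100 days before that (back to 200 days ago), and the last fetches the final 100 days (back to 300 days ago). The whole point is that at any given time you can only fetch 100 days of data. So even if you ask for 300 days in one go, you may not get 300 days of data, which is your original problem (Yahoo probably limits the amount of data pulled in one request). My code is here: https://github.com/ee07kkr/stock_forex_analysis/tree/dataGathering

Note that in my case the CSV file for some reason did not work with the \t separator, but basically you can use the DataFrame. Another issue I currently have is that 'Volume' is a string rather than a float. The workaround is: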
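
The three-pass idea above can be sketched as a helper that only computes the date windows. This is a minimal sketch, not the code from the linked repository: the function name business_day_windows and its parameters are hypothetical, and how each window is then passed to Yahoo is left out because the answer does not show it.

```python
import pandas as pd
from pandas.tseries.offsets import BDay

def business_day_windows(end, chunk=100, n_chunks=3):
    """Return (start, end) pairs, newest first, each covering `chunk` business days."""
    windows = []
    # Roll back to the previous business day if `end` falls on a weekend.
    window_end = BDay().rollback(pd.Timestamp(end).normalize())
    for _ in range(n_chunks):
        # Inclusive window: start and end together span exactly `chunk` business days.
        window_start = window_end - BDay(chunk - 1)
        windows.append((window_start, window_end))
        # The next (older) window ends one business day before this one starts.
        window_end = window_start - BDay(1)
    return windows

windows = business_day_windows("2024-09-27")
for start, end in windows:
    print(start.date(), "->", end.date())
```

Each window can then be fetched separately and the results concatenated, so no single request asks for more than 100 business days. Note that BDay only skips weekends, not exchange holidays.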

apple = pd.read_csv('apple.csv', sep='\t', index_col=0, parse_dates=True)  # DataFrame.from_csv was removed in pandas 1.0
apple['Volume'] = apple['Volume'].str.replace(',', '').astype(float)

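User

#3 · Edited 2024-09-28 05:40:27

First, run the code below to get your 100 days. Then use SQL to insert the data into a small database (sqlite3 is easy to use with Python). Finally, modify the code below to fetch only the daily price, which you can append so that your database grows over time.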

from pandas import DataFrame
import bs4
import requests

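def function():
    url = 'https://uk.finance.yahoo.com/quote/VOD.L/history?p=VOD.L'
    response = requests.get(url)
    soup = bs4.BeautifulSoup(response.text, 'html.parser')
    headers = soup.find_all('th')
    rows = soup.find_all('tr')
    # Cell text for every table row; header rows yield empty lists.
    ts = [[td.getText() for td in row.find_all('td')] for row in rows]
    data = []
    for i in ts:
        # On a 7-column price row, i[:-6] keeps only the date cell.
        if i[:-6]:
            data.append(i[:-6])
    days = 100
    # Walk backwards from the last row, printing up to `days` dates.
    num = len(data) - 1
    while days > 0 and num >= 0:
        now = DataFrame(data[num])
        now = str(now[0][0])
        print(now)
        num = num - 1
        days = days - 1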
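
The database step described in this answer could look roughly like the following. It is a sketch under assumptions rather than the answerer's code: the table name prices, the helper save_prices, and the sample rows are all made up for illustration, and dates are assumed to have been converted to ISO strings before insertion.

```python
import sqlite3

def save_prices(conn, ticker, rows):
    """Insert (date, open, high, low, close, adj_close, volume) rows,
    silently skipping any (ticker, date) pair that is already stored."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS prices (
               ticker    TEXT NOT NULL,
               date      TEXT NOT NULL,
               open      REAL,
               high      REAL,
               low       REAL,
               close     REAL,
               adj_close REAL,
               volume    INTEGER,
               PRIMARY KEY (ticker, date)
           )"""
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO prices VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            [(ticker, *row) for row in rows],
        )

conn = sqlite3.connect(":memory:")  # use a file path to persist between runs

# Placeholder values only, not real market data.
initial_batch = [
    ("2024-09-25", 1.0, 2.0, 0.5, 1.5, 1.5, 1000),
    ("2024-09-26", 1.5, 2.5, 1.0, 2.0, 2.0, 1100),
]
daily_update = [
    ("2024-09-26", 1.5, 2.5, 1.0, 2.0, 2.0, 1100),  # already stored, ignored
    ("2024-09-27", 2.0, 3.0, 1.5, 2.5, 2.5, 1200),  # new row
]

save_prices(conn, "VOD.L", initial_batch)
save_prices(conn, "VOD.L", daily_update)
stored = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
print(stored, "rows stored")
```

Keying the table on (ticker, date) means the daily job can re-insert overlapping rows without creating duplicates, so the history grows past the 100-day page limit one run at a time.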
