urllib.error.HTTPError: HTTP Error 404: Not Found (Yahoo Finance)

Posted 2024-05-19 03:38:43


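For my computing project, I am trying to build a financial forecasting website. One component of the code is a web-scraping API that pulls data from the income statement on Yahoo Finance.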

However, even though the URL is correct, I keep getting a 404 error.

My code

import pandas as pd
import urllib.request as ur
from bs4 import BeautifulSoup
import warnings
import ssl


ssl._create_default_https_context = ssl._create_unverified_context
income_url = 'http://uk.finance.yahoo.com/quote/AAPL/financials?p=AAPL'
read_url = ur.urlopen(income_url).read()
income_soup = BeautifulSoup(read_url, 'lxml')

div_list = []
for div in income_soup.find_all('div'):
    div_list.append(div.string)

    if div.string != div.get('title'):
        div_list.append(div.get('title'))

div_list = [incl for incl in div_list if incl not in
            ('Operating Expenses', 'Non-recurring Events', 'Expand All')]
div_list = list(filter(None, div_list))
div_list = [incl for incl in div_list if not incl.startswith('(function')]
income_list = div_list[13: -5]
income_list.insert(0, 'Breakdown')

income_data = list(zip(*[iter(income_list)]*6))
income_df = pd.DataFrame(income_data)

headers = income_df.iloc[0]
income_df = income_df[1:]
income_df.columns = headers
income_df.set_index('Breakdown', inplace=True, drop=True)

warnings.warn('Amounts are in thousands.')
print(income_df)

I keep getting this error:

urllib.error.HTTPError: HTTP Error 404: Not Found

How can I fix it?


1 answer
User
Answer 1 · Posted 2024-05-19 03:38:43

Making sure a User-Agent header is sent with the request appears to resolve the problem.

Using the "requests" module:

import requests

agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15'
headers = {'User-Agent': agent}
url = 'http://uk.finance.yahoo.com/quote/AAPL/financials?p=AAPL'
response = requests.get(url, headers=headers)
response.raise_for_status()
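
Since the question uses urllib rather than requests, the same idea can be sketched with the standard library alone. This is only a minimal sketch, assuming the missing User-Agent header is what triggers the 404; the helper names build_request and fetch_html and the timeout value are hypothetical and not part of the original code. By default urllib identifies itself as "Python-urllib/<version>", so the header is set explicitly on a Request object:

```python
import urllib.request as ur

# Browser-like User-Agent string, copied from the answer above.
USER_AGENT = (
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 '
    '(KHTML, like Gecko) Version/14.1.2 Safari/605.1.15'
)


def build_request(url, user_agent=USER_AGENT):
    """Hypothetical helper: wrap the URL in a Request carrying a User-Agent."""
    return ur.Request(url, headers={'User-Agent': user_agent})


def fetch_html(url, timeout=10):
    """Hypothetical helper: return the raw response body as bytes.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    so a remaining 404 would still surface here.
    """
    with ur.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()


# Possible usage with the question's code (performs a network request):
# read_url = fetch_html('http://uk.finance.yahoo.com/quote/AAPL/financials?p=AAPL')
```

The returned bytes could then be passed to BeautifulSoup in place of read_url in the question's code. Whether the page layout still matches the question's parsing logic is a separate matter that this sketch does not address.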
