Python Yahoo Finance Web Scraper: "Error: 'int' object is not subscriptable"

Posted 2024-10-01 13:46:13

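I am building a basic scraping application that extracts company financial statement data from Yahoo Finance. It uses Requests and Beautiful Soup 4 to pull the data into a list of the relevant values. I then want to insert items into this list so that the entries fit neatly into a matrix (5 columns for each row of data).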

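However, when I try to insert placeholder numbers at the points where no data was extracted, I repeatedly get the following error: "Error: 'int' object is not subscriptable". Can anyone explain why this happens, or what I am missing?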

Many thanks.

import requests
import bs4

#Obtain full company ticker, and while loop to verify that ticker produces result

while True:
    
    try:
        
        ticker = input("Please insert the Ticker for the relevant company from Yahoo Finance: ")
        exchange = input("Please insert the Exchange for the relevant company from Yahoo Finance: ")
        test_page = f'https://uk.finance.yahoo.com/quote/{ticker}.{exchange}/financials?p={ticker}.{exchange}'
        result = requests.get(test_page)
        soup = bs4.BeautifulSoup(result.text,'lxml')
        title = str(soup.find('title'))
        if 'income statement' in title:
            print('Success. Ticker found.')
            break
            
    except:
        pass
    
    print("Error. Ticker not found. Please try again.")

#Define webpages

income_statement =  f'https://uk.finance.yahoo.com/quote/{ticker}.{exchange}/financials?p={ticker}.{exchange}'
balance_sheet = f'https://uk.finance.yahoo.com/quote/{ticker}.{exchange}/balance-sheet?p={ticker}.{exchange}'
cash_flow = f'https://uk.finance.yahoo.com/quote/{ticker}.{exchange}/cash-flow?p={ticker}.{exchange}'

#Extract Income Statement data

result = requests.get(income_statement)

soup = bs4.BeautifulSoup(result.text,"lxml")

#Calculate Total Number of Div Span Entries Extracted

Counter = 0

for item in soup.select('div span'):
    if soup.select('div span').index(item) > Counter:
        Counter = soup.select('div span').index(item)

#Calculate Range of Relevant Entries and Number of Columns in Table

for a in list(range(1,Counter)):
    if soup.select('div span')[a].text == 'Breakdown':
        Start = a
    if soup.select('div span')[a].text == 'Total revenue':
        ColumnNo = a - Start
    if soup.select('div span')[a].text == 'Net Income Common Stockholders' or soup.select('div span')[a].text == 'Net income available to common shareholders':
        End = a + ColumnNo

#Convert range of relevant entries into list of strings

mylist = []
for item in soup.select('div span')[Start:End]:
    mylist.append(item.text)

#Prepare a balanced list of data to fit in a 5 column table. Inserts 0 into list where blank spaces have not been recognised.

balanced_list = mylist #To preserve the extracted list (mylist) and prepare the list for insertion 

numbers = ['0','1','2','3','4','5','6','7','8','9']
entrycounter = 1
missing_figures_dict = {2:4,3:3,4:2,0:1} #Ratio is entrycounter%5 : missing entries

for item in balanced_list[ColumnNo:-1]:
    if str(item[0]) not in numbers and (entrycounter % 5) != 1:
        missing_figures = missing_figures_dict[entrycounter % 5]
        for missingfigure in range(1,missing_figures):
            balanced_list.insert((entrycounter+missingfigure-1+5), 0)
        entrycounter = entrycounter + missing_figures
    else:
        entrycounter = entrycounter + 1
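
A note on where the error can come from. The only expression in the code above that subscripts something that might be an integer is `item[0]`, and the only integers in play are the `0` values passed to `balanced_list.insert(...)`. Because `balanced_list = mylist` binds a second name to the same list rather than copying it, those integers end up in `mylist` too. The slice `balanced_list[ColumnNo:-1]` is a snapshot taken when the loop starts, so a single fresh run only sees strings; the error would appear when the padding loop runs again over the already padded list (for example when re-running one notebook cell, which is an assumption about how the script was executed). A minimal sketch with made-up data:

```python
# Toy data standing in for the scraped values (made up for illustration).
mylist = ["Breakdown", "2023", "Total revenue", "100"]

balanced_list = mylist        # alias: both names refer to the same list object
balanced_list.insert(2, 0)    # the padding step inserts the integer 0

print(mylist is balanced_list)  # no copy was made
print(mylist)                   # the inserted 0 is visible through mylist too

# A second pass over the already padded list now meets the integer.
message = None
try:
    for item in balanced_list:
        first_char = item[0]    # works for str, raises TypeError for int
except TypeError as err:
    message = str(err)
    print(message)
```

Using `balanced_list = list(mylist)` (or `mylist.copy()`) keeps the extracted list intact, and inserting the string `'0'` instead of the integer `0` keeps every entry subscriptable.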

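One way to get the 5-column layout without inserting into a list while walking it is to build the rows directly. The sketch below is not the original approach: `pad_rows`, its parameters and the sample data are hypothetical, and it reuses the question's rule that an entry starting with a digit is a figure while anything else starts a new row (negative figures such as "-12" would need that rule extended). It also assumes missing figures can be filled at the end of a row.

```python
def pad_rows(flat, n_values=4, filler="0"):
    """Group a flat [label, v1, v2, ..., label, v1, ...] list into rows.

    Hypothetical helper: an entry whose first character is a digit is a
    figure; anything else starts a new row. Short rows are padded with
    `filler`, kept as a string so every entry stays subscriptable.
    """
    rows = []
    for entry in flat:
        text = str(entry)
        if text[:1].isdigit() and rows:
            rows[-1].append(text)   # figure: belongs to the current row
        else:
            rows.append([text])     # label: start a new row
    # Pad each row to one label plus n_values figures; `flat` is not modified.
    return [row + [filler] * (1 + n_values - len(row)) for row in rows]


# Made-up sample: the second row is missing two figures.
sample = ["Total revenue", "100", "90", "80", "70",
          "Cost of revenue", "40", "35",
          "Gross profit", "60", "55", "45", "40"]
table = pad_rows(sample)
for row in table:
    print(row)
```

Because the function returns new lists, the scraped list can be reused or re-processed any number of times without picking up padding from an earlier pass.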