IndexError while iterating

Posted 2024-06-18 11:09:00


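I get an IndexError while iterating. The program runs fine until everything has been processed and there are no more "sub-pages" to visit; then it crashes, and because of the crash the results cannot be saved to a .txt file.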

Traceback (most recent call last):

newUrl = nextpage[counter]['href']
IndexError: list index out of range

Code

from urllib.request import urlopen, Request
from bs4 import BeautifulSoup
import json
class Olx():

    def __init__(self, url):
        self.url = url

    def getPrice(self):
        """Get prices from olx"""
        html = urlopen(self.url)
        bs = BeautifulSoup(html, 'html.parser')
        price = bs.findAll('p', class_='price')
        return price

    def nextPage(self):
        """Go to the next page"""
        html = urlopen(self.url)
        bs = BeautifulSoup(html, 'html.parser')
        pageButton = bs.findAll('a', {'class': 'block br3 brc8 large tdnone lheight24'})
        try:
            return pageButton
        except AttributeError:
            None
        else:
            return pageButton

    

olxprices = Olx('https://www.olx.pl/nieruchomosci/mieszkania/wynajem/olsztyn/').getPrice()
nextpage = Olx('https://www.olx.pl/nieruchomosci/mieszkania/wynajem/olsztyn/').nextPage()
counter = 0

output = []
while len(nextpage) > 0:
    for price in olxprices:
        output.append(price.get_text().strip())
        print(price.get_text().strip())
    newUrl = nextpage[counter]['href']
    olxprices = Olx(newUrl).getPrice()
    counter += 1

print(output)

2 answers

You can try catching the exception:

while len(nextpage) > 0:
    try:
        for price in olxprices:
            output.append(price.get_text().strip())
            print(price.get_text().strip())
        newUrl = nextpage[counter]['href']
        olxprices = Olx(newUrl).getPrice()
        counter += 1
    except IndexError:
        break    

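(Or handle the exception in whatever way suits you.) If that does not answer your question, the cause may be that the length of the page list stays the same, so you may want to iterate over it as well.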
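
To make the failure mode concrete, here is a minimal sketch that runs without any network access. A plain list of dicts stands in for nextpage, and the 'page-2' and 'page-3' values are made up for illustration. Instead of waiting for the IndexError, it bounds the loop by comparing counter against the list length.

```python
# Hypothetical stand-in for the list of pagination links returned by nextPage().
nextpage = [{'href': 'page-2'}, {'href': 'page-3'}]

visited = []
counter = 0
# len(nextpage) never changes inside the loop, so test the counter against it
# rather than testing len(nextpage) > 0.
while counter < len(nextpage):
    visited.append(nextpage[counter]['href'])
    counter += 1

print(visited)
```

With this condition the loop stops once counter reaches the end of the list, so no try/except is needed.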

len(nextpage) never changes, so the while loop never terminates on its own, and eventually the counter index runs past the end of nextpage. Instead, do the following:

for page in nextpage:
    for price in olxprices:
        output.append(price.get_text().strip())
        print(price.get_text().strip())
    newUrl = page['href']
    olxprices = Olx(newUrl).getPrice()
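
One detail to watch in the loop above: the prices fetched for the last link are assigned to olxprices but never appended to output. The sketch below is one possible way to collect every page, not the original author's method. The names collect_prices, get_prices and get_page_links are hypothetical; the two callables stand in for Olx(url).getPrice() and Olx(url).nextPage() and return plain strings here so the example runs offline. Skipping repeated links is an assumption about the pagination markup.

```python
def collect_prices(start_url, get_prices, get_page_links):
    """Collect prices from start_url and from every page it links to.

    get_prices(url) and get_page_links(url) are hypothetical callables
    standing in for Olx(url).getPrice() and Olx(url).nextPage().
    """
    output = list(get_prices(start_url))
    seen = {start_url}
    for url in get_page_links(start_url):
        if url in seen:
            # Assumption: the same link may appear more than once; skip repeats.
            continue
        seen.add(url)
        # Append right after fetching, so the last page is not lost.
        output.extend(get_prices(url))
    return output


# Fake data used only to exercise the function without network access.
fake_prices = {'p1': ['1000'], 'p2': ['2000', '2500'], 'p3': ['3000']}
fake_links = {'p1': ['p2', 'p3', 'p2']}

result = collect_prices(
    'p1',
    lambda url: fake_prices[url],
    lambda url: fake_links.get(url, []),
)
print(result)
```

Because the loop iterates over the links directly, there is no counter to run out of range, and writing the returned list to a .txt file can happen after the function returns.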
