IndexError: list index out of range when scraping ads with BeautifulSoup

from bs4 import BeautifulSoup
from requests import get
import pandas as pd
import itertools
import matplotlib.pyplot as plt

headers = ({'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'})
link = 'https://ogloszenia.trojmiasto.pl/nieruchomosci/wi,100,dw,1d.html?' + str(strona)
r = get(link, headers = headers)
zupa = BeautifulSoup(r.text, 'html.parser')
ogloszenia = zupa.find_all('div', class_="list__item")
n_stron = 0
numer = 0
for strona in range(0,12):
    n_stron += 1
    for ogl in ogloszenia:
        tytul = ogl.find_all('h2', class_ ="list__item__content__title")[0].text
        powierzchnia = ogl.find_all('p', class_ ="list__item__details__icons__element__desc")[0].text
        liczba_pokoi = ogl.find_all('p', class_ ="list__item__details__icons__element__desc")[1].text
        pietro = ogl.find_all('p', class_ ="list__item__details__icons__element__desc")[2].text
        lokalizacja = ogl.find_all('p', class_ = "list__item__content__subtitle")[0].text
        cena = ogl.find_all('p', class_ = "list__item__price__value")[0].text
        cena_m = ogl.find_all('p', class_ = "list__item__details__info details--info--price")[0].text
        numer += 1
        print(numer)
        print(tytul)
        print('Powierzchnia: ' + powierzchnia )
        print('Lokalizacja: ' + lokalizacja )
        print('Liczba pokoi: ' + liczba_pokoi )
        print('Pietro: ' + pietro )
        print('Cena: ' + cena )
        print('Cena za metr kwadratowy: ' + cena_m +'\n')

3 Answers

User

#1 · Edited 2024-09-26 22:55:01

You can catch the IndexError exception and set the variable to None or '':

try:
    powierzchnia = ogl.find_all('p', class_ ="list__item__details__icons__element__desc")[0].text
except IndexError:
    powierzchnia = ''

The other variables can hit the same error, so repeat this pattern for each of them.

User

#2 · Edited 2024-09-26 22:55:01

Try:

data = ogl.find_all('p', class_ ="list__item__details__icons__element__desc")
for idx,entry in enumerate(data):
    if idx == 0:
        print('powierzchnia {}'.format(entry.text))
    elif idx == 1:
        print('liczba_pokoi {}'.format(entry.text))
    else:
        print('pietro {}'.format(entry.text))
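If the values need to be stored rather than printed, the same idea can be sketched with itertools.zip_longest, which pads missing entries instead of raising IndexError. This is only a sketch under assumptions: label_details and LABELS are hypothetical names, and plain strings stand in for the entry.text values taken from data.

```python
from itertools import zip_longest

# Labels in the order the question's code indexes the matched <p> elements.
LABELS = ("powierzchnia", "liczba_pokoi", "pietro")


def label_details(texts, labels=LABELS, missing=""):
    """Pair each label with its text; labels without a match get `missing`."""
    # Drop extra entries so zip_longest never has to pad the label side.
    trimmed = list(texts)[:len(labels)]
    return dict(zip_longest(labels, trimmed, fillvalue=missing))


# Plain strings stand in for [entry.text for entry in data].
print(label_details(["54 m2", "3"]))
# {'powierzchnia': '54 m2', 'liczba_pokoi': '3', 'pietro': ''}
```

With real results, the call would look like label_details([entry.text for entry in data]). Note that this assumes the elements always appear in the same order; an ad that omits the middle field would shift the later values to the wrong label.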

User

#3 · Edited 2024-09-26 22:55:01

I suggest two changes.

First, move the repeated lookup into a function.

# `class` is a reserved word in Python, so the parameter is named class_name.
def findDetail(ogl, tag, class_name, index):
    return ogl.find_all(tag, class_=class_name)[index].text

Then, for the case where the index does not exist, handle it with try/except. This is the standard way to handle errors in Python:

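# `class` is a reserved word in Python, so the parameter is named class_name.
def findDetail(ogl, tag, class_name, index):
    try:
        return ogl.find_all(tag, class_=class_name)[index].text
    except IndexError:
        # find_all returned fewer than index + 1 matches for this ad.
        print(f"Could not find index {index} for {tag} with {class_name}")
        return ""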

Then call it:

for ogl in ogloszenia:
    tytul = findDetail(ogl, "h2", "list__item__content__title", 0)
    powierzchnia = findDetail(ogl, "p", "list__item__details__icons__element__desc", 0)

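And so on for the remaining fields. If the index cannot be found, findDetail returns an empty string, so that field simply prints as blank.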
