Extracting HTML data in a loop with BS4

for i in rows:
    url = "https://www.boerse-stuttgart.de/en/products/investment-products/discount-certificates/stuttgart/{}".format(i)
    r = requests.get(url)
    soup = BeautifulSoup(r.text, "html.parser")
    first_day = soup.find("dt", text="First exchange day").findNext('dd').string

2 Answers

User

#1 · Edited 2024-09-30 07:29:28

You get 'NoneType' object has no attribute 'findNext' because you are calling .findNext() on something that does not exist.

If the page has no dt element with the text First exchange day, this line

soup.find("dt", text="First exchange day")

returns None. Calling .findNext() on that None value is what raises the error.
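The behaviour can be reproduced without any network request. This is a minimal sketch, assuming a made-up HTML snippet (`html_without_dt` is hypothetical, not taken from the real page). It uses `string=` and `find_next`, the current spellings of the `text=` argument and `findNext` method used in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet that has no "First exchange day" entry.
html_without_dt = "<dl><dt>Last trading day</dt><dd>2025-01-31</dd></dl>"
soup = BeautifulSoup(html_without_dt, "html.parser")

# find() returns None when nothing matches.
match = soup.find("dt", string="First exchange day")
print(match is None)  # True

# Calling any method on None raises AttributeError.
try:
    match.find_next("dd")
except AttributeError as exc:
    print(exc)
```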

You need to add a check that the dt element was actually found on the page: if it was, call findNext(); if it was not, skip that step.
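One way to write that check is sketched below. The helper name `first_exchange_day` and the two sample HTML strings are assumptions made for illustration; the real page markup may differ.

```python
from bs4 import BeautifulSoup

def first_exchange_day(html):
    """Return the 'First exchange day' value, or None if it is missing."""
    soup = BeautifulSoup(html, "html.parser")
    dt = soup.find("dt", string="First exchange day")
    if dt is None:
        # No matching dt on this page, so there is nothing to look up.
        return None
    dd = dt.find_next("dd")
    # Guard the second lookup too, in case the dt has no following dd.
    return dd.get_text(strip=True) if dd is not None else None

# Hypothetical sample pages: one with the field, one without.
with_field = "<dl><dt>First exchange day</dt><dd>2019-03-04</dd></dl>"
without_field = "<dl><dt>Last trading day</dt><dd>2025-01-31</dd></dl>"

print(first_exchange_day(with_field))
print(first_exchange_day(without_field))
```

Inside the original loop, the helper would be called as `first_exchange_day(r.text)`, and rows for which it returns None can simply be skipped.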

User

#2 · Edited 2024-09-30 07:29:28

This error occurs when you try to look up data in a row where that data does not exist.

import requests
from bs4 import BeautifulSoup

for i in rows:
    url = "https://www.boerse-stuttgart.de/en/products/investment-products/discount-certificates/stuttgart/{}".format(i)
    r = requests.get(url)
    soup = BeautifulSoup(r.text, "html.parser")
    try:
        first_day = soup.find("dt", text="First exchange day").findNext('dd').string
    except AttributeError:
        # find() returned None: this page has no "First exchange day" entry
        print('The required data does not exist in this row')

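With the change above, the data is extracted when it is available; otherwise the loop only prints a message saying that it is not. You could also use if-else, but this is the simplest approach.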
