Unable to find the required content inside an HTML <section>: Python BS4

Posted 2024-09-29 06:32:19


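I want to scrape this website for its weather forecast. Its HTML is deeply nested, and I cannot retrieve the content I need. The part I want to scrape is shown below: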

[Image: section to be scraped]

To get the tags I need from this section of the HTML, I am using the following:

import requests
from bs4 import BeautifulSoup
import uuid
import csv
import dateutil.parser as parser


class met():
    def __init__(self):
        # Build a unique CSV file name and fetch the home page once.
        self.downloadDir = ""
        uFileName = str(uuid.uuid4())
        self.filname = self.downloadDir + uFileName + ".csv"
        self.homePage = requests.get("https://www.met.ie/")

    def pageHtml(self):
        soup = BeautifulSoup(self.homePage.content, 'html.parser')
        titleList = soup.find_all('title')
        # Walk every <div>, then search inside it for the forecast <section>.
        for div in soup.find_all("div"):
            for section in div.find_all('section', class_="container hourly-forecast mb-5", id="24HourForecast"):
                # print(section)
                for div1 in section.find_all("div"):
                    print(div1)


if __name__ == '__main__':
    objCall = met()
    objCall.pageHtml()

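After running this code I can retrieve tags, but all of them come from the previous container. Can someone point me in the right direction? Thanks.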
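
One thing that may help narrow this down: `find_all` searches all descendants, so looping over every `div` and then searching each one for the `section` finds the same section once per enclosing `div`. Since the section has an id, it can be looked up directly instead. The sketch below shows that approach on a small inline sample. The sample markup (`SAMPLE_HTML`) and the helper name `forecast_divs` are hypothetical, since the real structure of the page is not shown in the question. It also assumes the section is present in the HTML the server returns; if the page fills it in with JavaScript, `requests` will not see that content.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: the real markup is not shown in the
# question, so this structure is an assumption for illustration only.
SAMPLE_HTML = """
<div class="outer">
  <div class="earlier-container"><div>unrelated</div></div>
  <div class="wrapper">
    <section class="container hourly-forecast mb-5" id="24HourForecast">
      <div class="hour">09:00</div>
      <div class="hour">10:00</div>
    </section>
  </div>
</div>
"""


def forecast_divs(html):
    """Return the text of each <div> inside the section with id 24HourForecast."""
    soup = BeautifulSoup(html, "html.parser")
    # Look the section up once by id instead of looping over every <div>.
    section = soup.find("section", id="24HourForecast")
    if section is None:
        # The section is missing from this HTML (for example, rendered later by JavaScript).
        return []
    return [div.get_text(strip=True) for div in section.find_all("div")]


if __name__ == "__main__":
    print(forecast_divs(SAMPLE_HTML))
```

If `forecast_divs(homePage.content)` returns an empty list for the live page, printing `soup.find(id="24HourForecast")` is a quick way to check whether the section exists in the downloaded HTML at all.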

