How do I get HTML tags with Python?

Published 2024-06-28 19:12:54

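I want to send queries to a search engine in order to improve that engine's ranking function. With the help of selenium, I send each request and retrieve the page's HTML. Within that HTML I am interested in the "result-line" tags, and within each result line I only want to retrieve the clicked URL and its rank in the list of answers. Here are my code and the error that is displayed.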

def get_result_line(browser):
    html = browser.page_source
    soup = BeautifulSoup(html)
    h1 = soup.find_all("li", {"class":"result-line"})
    return(h1)

The function above returns the result-line tags.

This is the call to the function:

with open('U:\\Python\\test.csv') as csvDataFile:
    csvReader = csv.reader(csvDataFile)
    next(csvReader ,None)

    for row in csvReader:
        requete=row[0]
        print (requete)
        h=get_result_line(send_requete(browser,requete))
        h2 = h.find("a",{"class":"result-options report-result"})
        with open('requete_url2.csv', 'a') as csvFile:
            writer = csv.writer(csvFile)
            row_write= (requete,h2)
            writer.writerow(row_write)
        csvFile.close()
        print(h2)

This is the output:

  File "U:\Python\script.py", line 30
    soup = BeautifulSoup(html)
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 30 of the file U:\Python\script.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.

Traceback (most recent call last):
  File "U:\Python\appel_fonction.py", line 23, in <module>
    h2 = h.find("a",{"class":"result-options report-result"})
  File "C:\Users\LFXF9956\AppData\Local\Programs\Python\Python36-32\lib\site-packages\bs4\element.py", line 1620, in __getattr__
    "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
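
Both messages point at fixable causes. The warning goes away once the parser is named explicitly, and the AttributeError comes from calling find() on the list-like ResultSet that find_all() returns, rather than on each element in it. A minimal sketch of one possible fix follows. It assumes the clicked URL is stored in the href attribute of the "result-options report-result" link, which the question does not confirm. SAMPLE_HTML and extract_ranked_urls are hypothetical names used only for illustration, and SAMPLE_HTML stands in for browser.page_source.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for browser.page_source.
SAMPLE_HTML = """
<ul>
  <li class="result-line"><a class="result-options report-result" href="https://example.com/a">A</a></li>
  <li class="result-line"><a class="result-options report-result" href="https://example.com/b">B</a></li>
  <li class="result-line"><span>no link here</span></li>
</ul>
"""


def get_result_lines(html):
    # Naming the parser explicitly avoids the UserWarning.
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns a ResultSet, which behaves like a list of tags.
    return soup.find_all("li", {"class": "result-line"})


def extract_ranked_urls(html):
    """Return (rank, url) pairs, where rank is the 1-based position of the result line."""
    ranked = []
    for rank, line in enumerate(get_result_lines(html), start=1):
        # find() is called on each individual tag, not on the ResultSet.
        link = line.find("a", {"class": "result-options report-result"})
        # Assumption: the URL of interest is the link's href attribute.
        if link is not None and link.get("href"):
            ranked.append((rank, link["href"]))
    return ranked


if __name__ == "__main__":
    for rank, url in extract_ranked_urls(SAMPLE_HTML):
        print(rank, url)
```

In the calling loop, each (rank, url) pair could then be written to the CSV file together with requete, one row per result line, instead of writing the whole tag object.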
