How do I use split correctly with BeautifulSoup?

Posted 2024-10-01 17:21:32


from bs4 import BeautifulSoup as bs
import requests
import time
import re


r = requests.get("https://www.crummy.com/software/BeautifulSoup/bs4/doc/")
soup = bs(r.content, "html.parser")
qqrcoisa = soup.find("h1")
print(qqrcoisa)
lista = qqrcoisa.split(" ")
print(lista)

Error:

  File "C:/Users/claud/Desktop/Nova-pasta/scrapando.py", line 13, in <module>
    lista = qqrcoisa.split(" ")
TypeError: 'NoneType' object is not callable


3 answers

To avoid the error, check that an element was found and split its text rather than the element itself:

from bs4 import BeautifulSoup as bs
import requests

r = requests.get("https://www.crummy.com/software/BeautifulSoup/bs4/doc/")
soup = bs(r.content, "html.parser")
qqrcoisa = soup.find("h1")
if qqrcoisa:
    print(f"Found this h1 element: {qqrcoisa.text}")
    lista = qqrcoisa.text.split(" ")
    print(f"Split h1 element: {lista}")
else:
    print("No h1 element found")
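The reason the original call fails can be reproduced without a network request. The sketch below uses a small inline HTML string as a stand-in for the downloaded page (an assumption made for illustration; the variable names are also hypothetical). A bs4 Tag treats an unknown attribute name as a search for a child tag with that name, so `.split` on a Tag looks for a `<split>` element and yields None, and calling None raises the TypeError.

```python
from bs4 import BeautifulSoup

# Inline HTML stands in for the downloaded page (an assumption for this sketch).
html = "<html><body><h1>Beautiful Soup Documentation</h1></body></html>"
soup = BeautifulSoup(html, "html.parser")
heading = soup.find("h1")

# A Tag treats an unknown attribute name as a lookup of a child tag with that
# name, so heading.split searches for a <split> element and returns None.
child_lookup = heading.split
print(child_lookup)

# Calling that None is what raises the TypeError from the question.
error_message = None
try:
    heading.split(" ")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The tag's text is an ordinary str, which does have a split method.
words = heading.text.split(" ")
print(words)
```

Splitting `heading.text` works because `.text` returns a plain Python string, not a Tag.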

You can convert the bs4 element to text by reading its .text attribute:

qqrcoisa = soup.find("h1").text

If you want to keep the whole line, including the <h1></h1> tags, convert the bs4 element to a string instead:

qqrcoisa = str(soup.find("h1"))
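The difference between the two conversions may be easier to see side by side. This is a minimal sketch on an inline HTML string (an assumption for illustration, not the real page), with hypothetical variable names:

```python
from bs4 import BeautifulSoup

# Inline HTML used instead of the live page (an assumption for this sketch).
html = "<h1>Beautiful Soup Documentation</h1>"
heading = BeautifulSoup(html, "html.parser").find("h1")

# .text keeps only the text inside the element.
text_only = heading.text

# str() serializes the whole element, tags included.
with_markup = str(heading)

print(text_only.split(" "))
print(with_markup.split(" "))
```

With `str()`, the tags stay attached to the first and last words after splitting, so `.text` is usually the better fit when only the words are wanted.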

Append .text to the find call:

qqrcoisa = soup.find("h1").text

This gives you:

Beautiful Soup Documentation¶

After splitting:

['Beautiful', 'Soup', 'Documentation¶']
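The trailing ¶ in the last word most likely comes from a permalink anchor nested inside the heading. The sketch below assumes Sphinx-style markup with an `<a class="headerlink">` element (an assumption about the page, reproduced here as an inline string) and shows one way to drop it before splitting:

```python
from bs4 import BeautifulSoup

# Assumed markup: a heading with a nested permalink anchor that holds "¶".
html = '<h1>Beautiful Soup Documentation<a class="headerlink" href="#">¶</a></h1>'
heading = BeautifulSoup(html, "html.parser").find("h1")

# Splitting the raw text keeps the permalink character on the last word.
raw_words = heading.text.split(" ")

# Remove the permalink anchor from the tree, if it is present.
link = heading.find("a", class_="headerlink")
if link is not None:
    link.decompose()

# split() with no argument splits on any run of whitespace.
clean_words = heading.get_text(strip=True).split()

print(raw_words)
print(clean_words)
```

If the page uses different markup for the permalink, the `class_` filter would need to be adjusted accordingly.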
