How do I separate tags that contain child tags from empty tags in BeautifulSoup4?

Posted 2024-09-29 19:31:05


<a id="filepos10190"></a>
<a id="filepos10190">

<font size="6" color="#002984"><b>abashed </b></font> <div width="9"><i> 
<font color="green"> adj.</font></i></div> <div width="18"><font 
color="chocolate"><b>VERBS </b></font></div> <div width="27"><font 
color="gray">▪</font> <font color="darkslateblue"><b>be</b></font>, <font 
color="darkslateblue"><b>look</b></font></div> <div width="18"><font 
color="chocolate"><b>ADVERB </b></font></div> <div width="27"><font 
color="gray">▪</font> <font color="darkslateblue"><b>a little</b></font>, 
<font color="darkslateblue"><b>slightly</b></font>, <font 
color="darkslateblue"><b>etc.</b></font></div> <div width="27"><font 
color="gray">▪</font> <font color="darkslateblue"><b>suitably</b></font> 
</div> <div width="36"><font color="lightgray">▪</font> <span><font 
color="#595959">He glanced at Juliet accusingly and she looked suitably 
<u>~</u>.</font></span></div> 

</a>

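There are two anchor tags here: one is completely empty, and the other contains many children. If I only want the one that has tags inside it, how do I tell the two apart when scraping?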


2 Answers

You can actually do this in one pass:

soup.find_all(lambda tag: tag.name == 'a' and tag.find())

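tag.find() tries to find any element inside tag and stops at the first one. For the empty anchor it returns None, so the lambda rejects that anchor and keeps only the one with content inside.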
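A minimal runnable sketch of this one-pass filter, assuming shortened markup in place of the question's snippet (the `html` string and the `id` values `empty` and `full` are illustrative, not from the original page). It passes `True` to `find()`, which is the documented "match any tag" filter, so the check is explicitly about child tags rather than text:

```python
from bs4 import BeautifulSoup

# Illustrative markup: one empty anchor, one anchor wrapping child tags.
html = '<a id="empty"></a><a id="full"><font><b>abashed</b></font></a>'

soup = BeautifulSoup(html, "html.parser")

# find(True) returns the first descendant tag, or None if there is none.
anchors_with_tags = soup.find_all(
    lambda tag: tag.name == "a" and tag.find(True) is not None
)

matched_ids = [a["id"] for a in anchors_with_tags]
print(matched_ids)  # ['full']
```

Comparing against `None` instead of relying on truthiness keeps the predicate's return value a plain boolean.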

from bs4 import BeautifulSoup

content="""
<a id="filepos10190"></a>
<a id="filepos10190">

<font size="6" color="#002984"><b>abashed </b></font> <div width="9"><i>
<font color="green"> adj.</font></i></div> <div width="18"><font
color="chocolate"><b>VERBS </b></font></div> <div width="27"><font
color="gray">▪</font> <font color="darkslateblue"><b>be</b></font>, <font
color="darkslateblue"><b>look</b></font></div> <div width="18"><font
color="chocolate"><b>ADVERB </b></font></div> <div width="27"><font
color="gray">▪</font> <font color="darkslateblue"><b>a little</b></font>,
<font color="darkslateblue"><b>slightly</b></font>, <font
color="darkslateblue"><b>etc.</b></font></div> <div width="27"><font
color="gray">▪</font> <font color="darkslateblue"><b>suitably</b></font>
</div> <div width="36"><font color="lightgray">▪</font> <span><font
color="#595959">He glanced at Juliet accusingly and she looked suitably
<u>~</u>.</font></span></div>

</a>"""

soup = BeautifulSoup(content, 'html.parser')
tags = soup.find_all('a')  # collect every anchor tag in the document
# Keep anchors that have at least one child node. .children yields text nodes
# as well as tags, so the empty anchor gives an empty list and is dropped.
filtered_tag = [i for i in tags if list(i.children)]
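Note that `.children` iterates over text nodes as well as tags, whereas `find(True)` matches tags only. The two checks agree on the question's markup but can differ for an anchor that holds only text. A small sketch of the difference, assuming illustrative markup (the `id` values are made up for this example):

```python
from bs4 import BeautifulSoup

# Illustrative markup: an empty anchor, a text-only anchor,
# and an anchor that wraps a child tag.
html = (
    '<a id="empty"></a>'
    '<a id="text-only">plain text</a>'
    '<a id="nested"><b>abashed</b></a>'
)
soup = BeautifulSoup(html, "html.parser")
anchors = soup.find_all("a")

# .children includes text nodes, so the text-only anchor passes this check.
non_empty_ids = [a["id"] for a in anchors if list(a.children)]

# find(True) matches tags only, so just the nested anchor passes.
has_child_tag_ids = [a["id"] for a in anchors if a.find(True) is not None]

print(non_empty_ids)      # ['text-only', 'nested']
print(has_child_tag_ids)  # ['nested']
```

Which check is appropriate depends on whether a text-only anchor should count as "non-empty" for the scrape.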
