Extracting the text of an HTML paragraph with BeautifulSoup in Python

 <a name="533660373"></a> Title: Point of Sale Threats Proliferate Severity: Normal Severity Published: Thursday, December 04, 2014 20:27 Several new Point of Sale malware families have emerged recently, to include LusyPOS,.. Analysis: Emboldened by past success and media attention, threat actors .. 

1 Answer

User

#1 · Posted on 2024-10-01 11:25:39

Use find_all() with text=True to find all text nodes, and recursive=False to search only among the direct children of the parent p tag:

from bs4 import BeautifulSoup

data = """
<p>
    <a name="533660373"></a>
    <strong>Title: Point of Sale Threats Proliferate</strong><br />
    <strong>Severity: Normal Severity</strong><br />
    <strong>Published: Thursday, December 04, 2014 20:27</strong><br />
    Several new Point of Sale malware families have emerged recently, to include LusyPOS,..<br />
    <em>Analysis: Emboldened by past success and media attention, threat actors  ..</em>
    <br />
</p>
"""

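soup = BeautifulSoup(data, "html.parser")  # name the parser explicitly
# text=True matches text nodes (the argument is called string= in Beautiful Soup 4.4.0+);
# recursive=False limits the search to direct children of <p>
print(''.join(text.strip() for text in soup.p.find_all(text=True, recursive=False)))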

Prints:

Several new Point of Sale malware families have emerged recently, to include LusyPOS,..
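
To see what recursive=False changes, here is a minimal sketch that runs the same search with and without it. The sample markup and the clean() helper are made up for illustration, and string=True is used as the newer spelling of text=True (assuming Beautiful Soup 4.4.0 or later).

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, shaped like the paragraph in the question.
html = """
<p>
    <strong>Title: Example</strong><br />
    Body text directly inside the paragraph.<br />
    <em>Analysis: nested text</em>
</p>
"""

soup = BeautifulSoup(html, "html.parser")


def clean(nodes):
    """Strip each text node and drop the ones that are only whitespace."""
    return [node.strip() for node in nodes if node.strip()]


# Only text nodes that are direct children of <p>.
direct_only = clean(soup.p.find_all(string=True, recursive=False))

# Default (recursive=True): text nodes at any depth under <p>.
all_text = clean(soup.p.find_all(string=True))

print(direct_only)
print(all_text)
```

The first list should contain only the bare body text, while the second also picks up the text inside the strong and em tags, which is why the answer passes recursive=False.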
