Beautiful Soup 4: Extracting text without tags - Q&A - Python中文网


Published 2024-09-30 20:31:26


<li class="actualPrice price fakeLink " data-automation="actual-price">
       <span class="visuallyhidden">Hello world</span>
Some text I want to extract
</li>

Here is some HTML. I want to extract the text "Some text I want to extract", but not "Hello world".

I tried find('span') and next_sibling, but neither worked.

[code snippet not preserved in the source]

This gives me both "Hello world" and "Some text I want to extract". Is there a way to extract only "Some text I want to extract"?
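The snippet that produced this result is not shown here. As a hypothetical reproduction of the symptom (an assumption, not necessarily the asker's code), calling get_text() on the li element returns the text of every descendant, including the hidden span. The variable names and the choice of the html.parser backend are illustrative.

```python
from bs4 import BeautifulSoup

# The HTML from the question, embedded so the example is self-contained.
html = """
<li class="actualPrice price fakeLink " data-automation="actual-price">
       <span class="visuallyhidden">Hello world</span>
Some text I want to extract
</li>
"""

soup = BeautifulSoup(html, "html.parser")
li = soup.find("li", "actualPrice")

# get_text() walks all descendants, so the span's text is included too.
# strip=True trims each fragment; " " is the separator placed between them.
all_text = li.get_text(" ", strip=True)
print(all_text)
```

Because get_text() has no notion of "direct text only", the span's content has to be skipped explicitly, which is what the answers do.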


2 answers

User

#1 · edited 2024-09-30 20:31:26

As an alternative approach, you can use stripped_strings:

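# stripped_strings yields each text fragment with surrounding whitespace removed;
# here the first fragment is the span's text and the second is the text you want.
for li in soup.find_all('li', 'actualPrice'):
    _, text_you_want = li.stripped_strings
    print(text_you_want)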

Output:

Some text I want to extract
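The answer assumes a soup object already exists and that each li holds exactly two text fragments. A minimal self-contained sketch follows, assuming the question's HTML and the html.parser backend. Taking the last fragment instead of unpacking two values is an illustrative variation, not part of the original answer. It avoids a ValueError if an li happens to contain more or fewer than two strings.

```python
from bs4 import BeautifulSoup

# The HTML from the question, embedded so the example is self-contained.
html = """
<li class="actualPrice price fakeLink " data-automation="actual-price">
       <span class="visuallyhidden">Hello world</span>
Some text I want to extract
</li>
"""

soup = BeautifulSoup(html, "html.parser")

results = []
for li in soup.find_all("li", "actualPrice"):
    # Whitespace-only strings are skipped; the rest are stripped.
    fragments = list(li.stripped_strings)
    if fragments:
        # Assumption: the wanted text is always the last fragment in the li.
        results.append(fragments[-1])

print(results)
```

If the wanted text is not reliably the last fragment, this shortcut will pick the wrong string, so it only fits markup shaped like the example.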

User

#2 · edited 2024-09-30 20:31:26

If you want the element that comes right after the span tag, you can use .next:

>>> for a in soup.find_all('li', 'actualPrice'):
...     print(a.span.next.next)
...
Some text I want to extract
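The chain works because the first .next lands on the span's own text ("Hello world") and the second lands on the text node after the span. A sketch of the same idea using the documented name next_element, assuming the question's HTML and the html.parser backend, is shown below. The next_sibling and find_all(string=True, recursive=False) variants are additions for comparison, not part of the original answer.

```python
from bs4 import BeautifulSoup

# The HTML from the question, embedded so the example is self-contained.
html = """
<li class="actualPrice price fakeLink " data-automation="actual-price">
       <span class="visuallyhidden">Hello world</span>
Some text I want to extract
</li>
"""

soup = BeautifulSoup(html, "html.parser")
li = soup.find("li", "actualPrice")

# span -> its text node -> the text node that follows the span.
via_next_element = li.span.next_element.next_element.strip()

# The same text node is also the span's next sibling inside the li.
via_sibling = li.span.next_sibling.strip()

# Only the li's direct text children, ignoring text inside nested tags.
direct_text = "".join(li.find_all(string=True, recursive=False)).strip()

print(via_next_element)
print(via_sibling)
print(direct_text)
```

The raw text node includes the surrounding newlines, hence the strip() calls. The direct-children variant does not depend on where the span sits inside the li, which may make it the more forgiving choice when the markup varies.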
