Finding a specific tag with BeautifulSoup

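User

Answer 1 · edited 2024-09-28 20:45:33

The text argument will not work in this particular case. This comes down to how an element's .string property is computed: when a tag has more than one child node, .string is None, so there is nothing for text to match against. Instead, I would use a search function, where you can call get_text() and check the element's complete "text", including its child nodes: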

label = thesoup.find(lambda tag: tag and tag.name == "th" and \
                                 "Residential" in tag.get_text())
comres = label.find_next("td").get_text()
print(str(comres))

This prints Commercial.
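To make the .string behaviour concrete, here is a minimal sketch. The HTML is made up for illustration (the asker's real markup is not shown in this thread); it assumes the header cell contains a nested tag, which is the situation where a text match on the tag comes back empty. It uses string=, the current name of the text argument in recent Beautiful Soup releases.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the <th> holds two text nodes split by a <br/>
html = "<table><tr><th>Residential or <br/>Commercial</th><td>Commercial</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")
th = soup.find("th")

# .string is only set when a tag has exactly one child
print(th.string)      # None
# get_text() joins the text of all descendants
print(th.get_text())  # Residential or Commercial

# Matching a tag by string compares against .string, so nothing is found
by_string = soup.find("th", string=re.compile("Residential"))
print(by_string)      # None
```

This is why the search function above checks tag.get_text() rather than relying on the text argument.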

We can go a step further and create a reusable function that gets a value by its label:

[code block missing from the source]
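A minimal sketch of what such a helper might look like follows. It is not the answer's original code: the function name get_value_by_label, the sample HTML, and the "City" label are all assumptions chosen for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the asker's page
html = """
<table>
  <tr><th>Residential or <br/>Commercial</th><td>Commercial</td></tr>
  <tr><th>City</th><td>NYC</td></tr>
</table>
"""
thesoup = BeautifulSoup(html, "html.parser")


def get_value_by_label(soup, label_text):
    """Return the text of the first <td> after the <th> containing label_text."""
    # Compare against get_text() so nested tags inside the <th> do not matter
    label = soup.find(lambda tag: tag.name == "th" and label_text in tag.get_text())
    if label is None:
        return None
    value = label.find_next("td")
    return value.get_text(strip=True) if value is not None else None


print(get_value_by_label(thesoup, "Residential"))
print(get_value_by_label(thesoup, "City"))
```

Returning None for a missing label or cell is a design choice of this sketch; raising an exception would be equally reasonable if a missing label should be treated as an error.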

Prints:

Commercial
NYC

User

Answer 2 · edited 2024-09-28 20:45:33

You were only missing a little housekeeping:

ths = thesoup.find_all("th")
for th in ths:
    # Match on the header text, then read the value from the next <td>
    if 'Residential or' in th.text:
        comres = th.find_next("td").text
        print(comres)  # prints: Commercial

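User

Answer 3 · edited 2024-09-28 20:45:33

You need to pass a regular expression as the text argument, such as re.compile('Residential or'), rather than a plain string.

This worked for me. I had to iterate over the returned results, but if you expect only one result per page you can use find instead of find_all: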

import re

for r in thesoup.find_all(text=re.compile('Residential or')):
    print(r.find_next('td').text)
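As a self-contained illustration of the regex approach, here is a small sketch on made-up HTML (the markup is an assumption, not the asker's page). Without a tag name, the search returns the matching text nodes themselves, so it still works when the header cell contains nested tags. It uses string=, the current name of the text argument.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup with a nested tag inside the header cell
html = "<table><tr><th>Residential or <br/>Commercial</th><td>Commercial</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# Each match is a text node; find_next("td") walks forward to the value cell
matches = soup.find_all(string=re.compile("Residential or"))
values = [m.find_next("td").get_text(strip=True) for m in matches]
print(values)
```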
