How to apply a regular expression with find_all() in BeautifulSoup (Python)

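2 answers

User

Answer 1 · edited 2024-09-26 17:53:29

If the start of the id string is unique to the tables you are interested in, could you not use an attribute=value CSS selector with the starts-with operator (^=)?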

for table in soup.select('table[id^="table"]'):
    print(table)  # do something with each matching table
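
As a minimal, self-contained sketch of the selector above: the sample HTML and the id "other1" are made up for illustration and are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; "other1" is included only to show a non-match.
html = """
<table id="table19"><tr></tr></table>
<table id="table20"><tr></tr></table>
<table id="other1"><tr></tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# [id^="table"] selects elements whose id attribute value starts with "table".
matched_ids = [table["id"] for table in soup.select('table[id^="table"]')]
print(matched_ids)  # ['table19', 'table20']
```

Note that the prefix selector cannot express a numeric range; it only helps when the prefix alone identifies the tables you want.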

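User

Answer 2 · edited 2024-09-26 17:53:29

You can use the regular expression 'table[12]\d' (regex101), which matches ids containing "table" followed by a two-digit number from 10 to 29: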

data = '''<table id='table19'><tr></tr></table>
<table id='table20'><tr></tr></table>
<table id='table21'><tr></tr></table>

<table id='table40'><tr></tr></table>'''

from bs4 import BeautifulSoup
import re

soup = BeautifulSoup(data, 'html.parser')

for table in soup.find_all('table', {'id':re.compile(r'table[12]\d')}):
    print(table)

Prints:

<table id="table19"><tr></tr></table>
<table id="table20"><tr></tr></table>
<table id="table21"><tr></tr></table>
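
To see which ids the pattern 'table[12]\d' accepts without parsing any HTML, it can be checked with the re module alone. The list of candidate ids below is invented for illustration.

```python
import re

pattern = re.compile(r"table[12]\d")

# Hypothetical ids chosen to probe the edges of the 10-29 range.
candidate_ids = ["table9", "table10", "table19", "table29", "table30", "table40"]

# search() is used here because that is how BeautifulSoup applies a compiled pattern.
matching = [i for i in candidate_ids if pattern.search(i)]
print(matching)  # ['table10', 'table19', 'table29']
```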

Edit: to match table19 or table20 through table29, use a non-capturing group (regex101):

for table in soup.find_all('table', {'id':re.compile(r'table(?:19|2\d)')}):
    print(table)
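
One caveat worth sketching: BeautifulSoup applies a compiled pattern with search(), so an unanchored pattern also matches ids that merely contain the text. Assuming you want exact ids only, anchoring with ^ and $ is one option. The ids "table190" and "mytable20" below are hypothetical and added only to show the difference.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup; the last two ids only contain a matching substring.
html = """
<table id="table19"><tr></tr></table>
<table id="table25"><tr></tr></table>
<table id="table190"><tr></tr></table>
<table id="mytable20"><tr></tr></table>
"""
soup = BeautifulSoup(html, "html.parser")

loose = re.compile(r"table(?:19|2\d)")     # matches anywhere inside the id
strict = re.compile(r"^table(?:19|2\d)$")  # the whole id must match

loose_ids = [t["id"] for t in soup.find_all("table", id=loose)]
strict_ids = [t["id"] for t in soup.find_all("table", id=strict)]

print(loose_ids)   # ['table19', 'table25', 'table190', 'mytable20']
print(strict_ids)  # ['table19', 'table25']
```

If the page never contains such look-alike ids, the unanchored pattern from the answer behaves the same.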
