Python - Using regex to find multiple matches and print them out - Q&A

import re

line = 'bla bla bla<form>Form 1</form> some text...<form>Form 2</form> more text?'
matchObj = re.search('<form>(.*?)</form>', line, re.S)
print(matchObj.group(1))
# Output: Form 1
# I need it to output the content of every form it found, not just the first one...

3 Answers

User

#1 · Edited 2024-04-27 15:52:01

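Instead of re.search, use re.findall: it returns all non-overlapping matches in a list. Alternatively, you can use re.finditer (my personal favorite), which returns an iterator of match objects that you can loop over to visit every match found.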

import re

line = 'bla bla bla<form>Form 1</form> some text...<form>Form 2</form> more text?'
# re.S lets "." match newlines, so a form body may span several lines
for match in re.finditer('<form>(.*?)</form>', line, re.S):
    print(match.group(1))
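
Because re.finditer yields match objects rather than plain strings, the position of each match is available as well. The following is a minimal sketch of that idea, not part of the original answer; the names form_re and found are just illustrative.

```python
import re

line = 'bla bla bla<form>Form 1</form> some text...<form>Form 2</form> more text?'

# Compile once; re.S (DOTALL) lets "." match newlines inside a form body.
form_re = re.compile(r'<form>(.*?)</form>', re.S)

# Each item from finditer is a match object, so both the captured text
# and the offsets of group 1 within the string can be read from it.
found = [(m.group(1), m.start(1), m.end(1)) for m in form_re.finditer(line)]

for text, start, end in found:
    print(text, start, end)
```

Slicing line[start:end] gives back the same captured text, which can be handy when the surrounding context matters.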

User

#2 · Edited 2024-04-27 15:52:01

Do not use regular expressions to parse HTML.

However, if you ever need to find all regexp matches in a string, use the re.findall function.

import re
line = 'bla bla bla<form>Form 1</form> some text...<form>Form 2</form> more text?'
matches = re.findall('<form>(.*?)</form>', line, re.DOTALL)
print(matches)

# Output: ['Form 1', 'Form 2']
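
One detail worth keeping in mind: what re.findall returns depends on the number of capturing groups in the pattern. The sketch below uses a made-up input with id attributes (an assumption for illustration, not from the question) to show the three-way difference.

```python
import re

# Hypothetical input: the id attributes are invented for this example.
line = '<form id="a">Form 1</form> text <form id="b">Form 2</form>'

# Two capturing groups: findall returns a list of tuples, one per match.
pairs = re.findall(r'<form id="(\w+)">(.*?)</form>', line, re.DOTALL)
print(pairs)  # [('a', 'Form 1'), ('b', 'Form 2')]

# No capturing groups: findall returns the whole matched text instead.
whole = re.findall(r'<form id="\w+">.*?</form>', line, re.DOTALL)
print(whole)
```

With exactly one group, as in the answer above, the result is a flat list of strings.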

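User

#3 · Edited 2024-04-27 15:52:01

Using regular expressions for this purpose is the wrong approach. Since you are using Python, there is a really great library available for extracting parts of HTML documents: BeautifulSoup.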
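
BeautifulSoup is a third-party package, so as a dependency-free illustration of the same "use a parser, not a regex" idea, here is a rough sketch built on the standard library's html.parser module instead. This is a swapped-in technique, not BeautifulSoup itself; the class name FormTextCollector is hypothetical, and the sketch assumes forms are not nested.

```python
from html.parser import HTMLParser


class FormTextCollector(HTMLParser):
    """Collects the text found inside each <form>...</form> element."""

    def __init__(self):
        super().__init__()
        self.forms = []        # text content of every completed form
        self._in_form = False  # True while the parser is inside a <form>
        self._buf = []         # text chunks of the form being read

    def handle_starttag(self, tag, attrs):
        if tag == 'form':
            self._in_form = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == 'form' and self._in_form:
            self.forms.append(''.join(self._buf))
            self._in_form = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate and join later.
        if self._in_form:
            self._buf.append(data)


line = 'bla bla bla<form>Form 1</form> some text...<form>Form 2</form> more text?'
collector = FormTextCollector()
collector.feed(line)
collector.close()
print(collector.forms)  # ['Form 1', 'Form 2']
```

Unlike the regex versions, this keeps working when the form tag carries attributes or different capitalization, since the parser normalizes tag names to lowercase.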
