Using Python to extract strings from text that start with a symbol and combine them with other strings

'http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Reference http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Informal ACADEMIC type http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#school ACADEMIC type'

2 answers

User

Answer 1 · edited 2024-10-03 11:26:44

Turn the problem around: instead of extracting the fragments, remove the URLs:

re.sub(r'\bhttps?://[^# ]+#?', '', text1)

Demo:

>>> import re
>>> text1 = 'http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Reference http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Informal ACADEMIC type http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#school ACADEMIC type'
>>> re.sub(r'\bhttps?://[^# ]+#?', '', text1)
'Reference Informal ACADEMIC type school ACADEMIC type'

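The expression matches anything that starts with http:// or https://, followed by one or more characters that are neither a hash nor a space, plus an optional trailing hash. Each match is replaced with an empty string, so only the text after the hash remains.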
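As a rough sketch of how this substitution could be packaged for reuse, the snippet below wraps the same pattern in a small function. The names `URL_PREFIX` and `strip_url_prefixes` and the `example.org` URL are made up for illustration and do not come from the question.

```python
import re

# Same pattern as in the answer: scheme, then everything up to a '#' or a
# space, then an optional '#'. The constant and function names are hypothetical.
URL_PREFIX = re.compile(r'\bhttps?://[^# ]+#?')

def strip_url_prefixes(text):
    """Remove each URL up to and including its optional '#', keeping what follows."""
    return URL_PREFIX.sub('', text)

# The second URL is an invented https example to exercise the optional 's'.
sample = ('http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Reference '
          'https://example.org/onto.rdf#Informal ACADEMIC type')
result = strip_url_prefixes(sample)
print(result)
```

Note that a URL without a fragment is removed entirely, which can leave two adjacent spaces behind; whether that matters depends on the input.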

User

Answer 2 · edited 2024-10-03 11:26:44

Use re.findall:

>>> import re
>>> s = 'http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Reference http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Informal ACADEMIC type http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#school ACADEMIC type'
>>> ''.join(re.findall(r'#(.*?)(?=https?:|$)', s))
'Reference Informal ACADEMIC type school ACADEMIC type'

Explanation: http://regex101.com/r/dV5uR2
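To see what the re.findall pattern captures before the pieces are joined, here is a minimal sketch that keeps them as a list. The variable names `pieces` and `labels` and the whitespace stripping are additions for illustration, not part of the original answer.

```python
import re

s = ('http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Reference '
     'http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#Informal ACADEMIC type '
     'http://oaei.ontologymatching.org/2011/benchmarks/101/onto.rdf#school ACADEMIC type')

# The lazy group starts after each '#' and stops just before the next URL
# (the lookahead) or at the end of the string.
pieces = re.findall(r'#(.*?)(?=https?:|$)', s)

# Each piece except the last keeps its trailing space; strip it for a clean list.
labels = [p.strip() for p in pieces]
print(labels)
```

Keeping the list is handy when each fragment should stay paired with the words that follow it; joining the unstripped pieces with `''.join(pieces)` gives the single string shown in the answer.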
