Capturing substrings with a regexp in Python

User

Reply #1 · edited 2024-06-14 21:12:08

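It looks like you want to match some characters between two : symbols. Because .*? can match zero characters, your regular expression also matches ::, which I don't think is what you want. Note also that re.search returns only the first match; to get multiple matches, you would normally use re.findall or re.finditer.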
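A minimal sketch of both points, using a made-up sample string (not the text from the question) and a pattern that requires at least one word character between the colons:

```python
import re

# Hypothetical sample text, chosen only to illustrate the two points above.
sample = "a :: b :smile: c :wave:"

# .*? may match zero characters, so the empty pair "::" counts as a match.
lazy = re.findall(r":.*?:", sample)
print(lazy)           # [':: ', ...] style surprises: here ['::', ':smile:', ':wave:']

# re.search stops at the first match; re.findall collects every match.
first = re.search(r":\w+:", sample)
every = re.findall(r":\w+:", sample)
print(first.group())  # :smile:
print(every)          # [':smile:', ':wave:']
```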

I think you need

set(re.findall(r':[^:]+:', x))

Or, if you only need to match word characters inside :...::

set(re.findall(r':\w+:', x))

Or, if you want to match any non-whitespace characters between the two : symbols:

set(re.findall(r':[^\s:]+:', x))

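re.findall finds all non-overlapping matches, and set removes the duplicates.

The pattern matches a :, then one or more characters other than : ([^:]+) (or, in the \w+ variant, one or more letters, digits, and _), and then another :.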

>>> import re
>>> x = 'Wish she could have told me herself. @NicoleScherzy #nicolescherzinger #OneLove #myfav #MyQueen :heavy_black_heart::heavy_black_heart: some string too :smiling_face:'
>>> print(set(re.findall(r':[^:]+:', x)))
{':smiling_face:', ':heavy_black_heart:'}
>>>
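To see how the three patterns above differ, here is a small comparison on a made-up string (the sample text and the dictionary labels are assumptions for illustration only):

```python
import re

# Hypothetical input with an underscore name, a hyphenated name,
# and a colon-delimited phrase that contains a space.
sample = "ok :thumbs_up: fine :thumbs-up: and :two words: end"

patterns = {
    "anything but a colon": r":[^:]+:",
    "word characters only": r":\w+:",
    "no whitespace, no colon": r":[^\s:]+:",
}

# Run each pattern over the same text and keep the matches by label.
results = {label: re.findall(p, sample) for label, p in patterns.items()}

for label, found in results.items():
    print(label, found)
# anything but a colon [':thumbs_up:', ':thumbs-up:', ':two words:']
# word characters only [':thumbs_up:']
# no whitespace, no colon [':thumbs_up:', ':thumbs-up:']
```

Which one fits depends on what can legally appear between the colons in your data.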

User

Reply #2 · edited 2024-06-14 21:12:08

Try this regular expression:

:([a-z0-9:A-Z_]+):
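One thing worth checking before adopting this pattern: the character class contains :, so two adjacent names may be captured as a single group. The sketch below compares it with a variant that leaves the colon out of the class; the shortened sample string and the second pattern are assumptions added for illustration, not part of the reply.

```python
import re

# Shortened, hypothetical version of the text from the question.
sample = ":heavy_black_heart::heavy_black_heart: text :smiling_face:"

# Pattern as posted: ':' is allowed inside the captured group.
with_colon = re.findall(r":([a-z0-9:A-Z_]+):", sample)

# Illustrative variant: the same class without ':'.
without_colon = re.findall(r":([a-z0-9A-Z_]+):", sample)

print(with_colon)     # ['heavy_black_heart::heavy_black_heart', 'smiling_face']
print(without_colon)  # ['heavy_black_heart', 'heavy_black_heart', 'smiling_face']
```

Because the pattern has one capturing group, re.findall returns the group contents without the surrounding colons.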

User

Reply #3 · edited 2024-06-14 21:12:08

print(re.findall(':.*?:', x)) does the job.

Output:
[':heavy_black_heart:', ':heavy_black_heart:', ':smiling_face:']

But if you want to remove the duplicates:

Use:

res = re.findall(':.*?:', x)
unique = set(res)  # a set keeps only one copy of each match
print(list(unique))

Output (the order of a set is not guaranteed):
[':heavy_black_heart:', ':smiling_face:']
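If the order of first appearance matters, a set is not enough, since it does not preserve order. A possible alternative, assuming Python 3.7 or later where dict keeps insertion order, is dict.fromkeys; the string below is a shortened, hypothetical version of the text from the question:

```python
import re

# Shortened, hypothetical version of the text from the question.
x = ":heavy_black_heart::heavy_black_heart: some string too :smiling_face:"

res = re.findall(r":.*?:", x)

# dict.fromkeys drops duplicates while keeping first-seen order.
unique = list(dict.fromkeys(res))
print(unique)  # [':heavy_black_heart:', ':smiling_face:']
```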
