Why does re.findall return the wrong result in Python?

User

Answer 1 · edited 2024-10-02 20:38:47

When the pattern contains groups, re.findall returns the groups rather than the whole match. So use:

re.findall(r'(?:\d{2}){2}', 'shs111111111')

Just make the group non-capturing.
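To make the difference concrete, here is a small sketch comparing the two forms on the same input. The capturing pattern `(\d{2}){2}` is an assumption about what the question used, since the original pattern is not quoted in this thread.

```python
import re

text = 'shs111111111'

# Capturing group (assumed to be the question's original pattern):
# findall returns only what the group captured, which is the last
# repetition of (\d{2}) inside each match.
captured = re.findall(r'(\d{2}){2}', text)

# Non-capturing group: findall returns the full text of each match.
whole = re.findall(r'(?:\d{2}){2}', text)

print(captured)  # ['11', '11']
print(whole)     # ['1111', '1111']
```

A repeated capturing group keeps only its final iteration, which is why the first result looks truncated.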

Relevant excerpt from the documentation:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
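The three cases in that excerpt (no group, one group, several groups) can be seen side by side in a short sketch. The input string and variable names here are made up for illustration.

```python
import re

sample = 'a1 b2 c3'  # hypothetical input, not from the question

# No groups: a list of whole matches.
no_groups = re.findall(r'[a-z]\d', sample)

# One group: a list of that group's contents.
one_group = re.findall(r'([a-z])\d', sample)

# Two groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'([a-z])(\d)', sample)

print(no_groups)   # ['a1', 'b2', 'c3']
print(one_group)   # ['a', 'b', 'c']
print(two_groups)  # [('a', '1'), ('b', '2'), ('c', '3')]
```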


Use this as well:

import re
x = "aaaaaaaccccctttttttttt"
print([i[0] for i in re.findall(r'(([acgt])\2+)', x)])

User

Answer 2 · edited 2024-10-02 20:38:47

I believe you need this regex:

>>> import re
>>> print(re.findall(r'(?:\d{2}){2,}', 'shs111111111'))
['11111111']

Edit: based on the edited question, you can use:

^{pr2}$

to grab one group from each pair.

Using finditer:

>>> import re
>>> arr = []
>>> for match in re.finditer(r'(([actg\d])\2+)', 'aaaaaaaccccctttttttttt'):
...     arr.append(match.group(1))
...
>>> print(arr)
['aaaaaaa', 'ccccc', 'tttttttttt']
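As a variation on the finditer loop, the outer group can be dropped, because `match.group(0)` always holds the whole match. This is only a sketch: the character class is narrowed to `[acgt]` and the variable names are illustrative.

```python
import re

dna = 'aaaaaaaccccctttttttttt'

# Only one capturing group is needed, for the backreference \1.
# group(0) is the entire run matched by the pattern.
runs = [m.group(0) for m in re.finditer(r'([acgt])\1+', dna)]

print(runs)  # ['aaaaaaa', 'ccccc', 'tttttttttt']
```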

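User

Answer 3 · edited 2024-10-02 20:38:47

You cannot get a plain ['aaaaaaa', 'ccccc', 'tttttttttt'] straight from findall, because you need a capturing group to check for repetition with a backreference. Here the regex has a named group, letter, which holds a single letter such as a, and the (?P=letter)+ backreference then matches every repetition of that letter.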

((?P<letter>[a-zA-Z])(?P=letter)+)

You can only use this regex with finditer, as described in @anubhava's post.
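A minimal sketch of the named-group pattern used with finditer might look like this. The outer parentheses are left out here because `group(0)` already gives the full run, and the variable names are illustrative.

```python
import re

# Named group "letter" captures one character; (?P=letter)+ requires
# that same character to repeat at least once more.
run_pattern = re.compile(r'(?P<letter>[a-zA-Z])(?P=letter)+')

text = 'aaaaaaaccccctttttttttt'

# Pair each repeated letter with the full run it belongs to.
runs = [(m.group('letter'), m.group(0)) for m in run_pattern.finditer(text)]

print(runs)  # [('a', 'aaaaaaa'), ('c', 'ccccc'), ('t', 'tttttttttt')]
```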
