Searching for and capturing a character with regular expressions in Python

DQheAbsaMLjTmAOKmNsLziVMenFxQdATQIjItwtyCHyeMwQTNxbbLXWZnGmDqHhXnLHfEyvzxMhSXzd BEBaxeaPgQPttvqRvxHPEOUtIsttPDeeuGFgmDkKQcEYjuSuiGROGfYpzkQgvcCDBKrcYwHFlvPzDMEk MyuPxvGtgSvWgrybKOnbEGhqHUXHhnyjFwSfTfaiWtAOMBZEScsOSumwPssjCPlLbLsPIGffDLpZzMKz jarrjufhgxdrzywWosrblPRasvRUpZLaUbtDHGZQtvZOvHeVSTBHpitDllUljVvWrwvhpnVzeWVYhMPs kMVcdeHzFZxTWocGvaKhhcnozRSbWsIEhpeNfJaRjLwWCvKfTLhuVsJczIYFPCyrOJxOPkXhVuCqCUgE luwLBCmqPwDvUPuBRrJZhfEXHXSBvljqJVVfEGRUWRSHPeKUJCpMpIsrV.......

3 answers

User

Answer 1 · edited 2024-09-23 06:32:39

import re

# Read the whole file and collect the captured letter from every match.
with open('/Users/Dev/Sometext.txt', 'r') as f:
    tokens = re.findall(r'[a-z][A-Z]{3}([a-z])[A-Z]{3}[a-z]', f.read())

for token in tokens:
    print(token)

What findall does:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
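To make the quoted rule about groups concrete, here is a small sketch. The sample string and the variable names are made up for illustration; only re.findall itself comes from the answer.

```python
import re

# Made-up sample: two places where a lowercase letter is guarded by
# exactly three uppercase letters on each side.
text = "aBBBcDDDe xYYYzWWWq"

# No group in the pattern: each item is the whole matched string.
whole = re.findall(r"[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]", text)

# One group: each item is just the text captured by that group.
middle = re.findall(r"[a-z][A-Z]{3}([a-z])[A-Z]{3}[a-z]", text)

# Two groups: each item is a tuple with one entry per group.
pairs = re.findall(r"([a-z])[A-Z]{3}([a-z])[A-Z]{3}[a-z]", text)

print(whole)   # ['aBBBcDDDe', 'xYYYzWWWq']
print(middle)  # ['c', 'z']
print(pairs)   # [('a', 'c'), ('x', 'z')]
```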

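It is probably the most useful function in the re module.

The read() call reads the whole file into one big string. This is especially useful when you need to match a regular expression against the entire file.

Warning: depending on the size of the file, you may prefer to iterate over the file line by line, as in your first approach.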
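For large files, a line-by-line variant might look like the sketch below. The helper name iter_tokens is hypothetical, and io.StringIO stands in for a real open file so the example is self-contained.

```python
import io
import re

PATTERN = re.compile(r"[a-z][A-Z]{3}([a-z])[A-Z]{3}[a-z]")

def iter_tokens(lines):
    """Yield the captured letter for every match, one line at a time."""
    for line in lines:
        # findall picks up several matches on the same line.
        yield from PATTERN.findall(line)

# io.StringIO is a stand-in for a file object; an open file iterates the same way.
sample = io.StringIO("aBBBcDDDe\nnothing here\nxYYYzWWWq qRRRsTTTu\n")
tokens = list(iter_tokens(sample))
print(tokens)  # ['c', 'z', 's']
```

Only one line is held in memory at a time, which is the point of the warning above.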

User

Answer 2 · edited 2024-09-23 06:32:39

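Change result.groups() to result.group(1) and you will get only the single-letter match.

The second problem with your code is that it will not find more than one result on a line. Instead of a single-match call such as re.search, you need re.findall or re.finditer. findall returns strings or tuples of strings, whereas finditer returns match objects.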
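The difference between a single match, groups(), group(1), and finditer can be sketched as follows. The sample line is invented, and the use of search as the single-match call is an assumption about the asker's code.

```python
import re

pattern = re.compile(r"[a-z][A-Z]{3}([a-z])[A-Z]{3}[a-z]")
line = "aBBBcDDDe--xYYYzWWWq"  # made-up line containing two matches

# A single-match call stops at the first hit.
first = pattern.search(line)
first_groups = first.groups()   # a tuple of all groups: ('c',)
first_letter = first.group(1)   # just the captured letter: 'c'

# finditer yields one match object per hit, so positions are available too.
letters = [m.group(1) for m in pattern.finditer(line)]
starts = [m.start(1) for m in pattern.finditer(line)]

print(first_groups, first_letter)  # ('c',) c
print(letters, starts)             # ['c', 'z'] [4, 15]
```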

Here is what I used for the same problem:

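import re
from urllib.request import urlopen

pat = re.compile(r'[a-z][A-Z]{3}([a-z])[A-Z]{3}[a-z]')
url = "http://www.pythonchallenge.com/pc/def/equality.html"
# Python 3: urlopen lives in urllib.request and returns bytes, so decode first.
html = urlopen(url).read().decode()
print(''.join(pat.findall(html)))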

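Note that re.findall and re.finditer return non-overlapping results. So when re.findall searches the string 'aBBBcDDDeFFFg' with the pattern above, the only match will be 'c', not 'e'. Fortunately, this Python Challenge problem contains no such cases.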
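The non-overlapping behaviour can be checked directly. This is only a demonstration on the example string from the answer; the second search, started by hand at index 4, is added here to show where the missed match sits.

```python
import re

pattern = re.compile(r"[a-z][A-Z]{3}([a-z])[A-Z]{3}[a-z]")
text = "aBBBcDDDeFFFg"

# findall resumes after the end of the previous match.
found = pattern.findall(text)

# The first match consumes 'aBBBcDDDe', including the 'c' and 'e'
# that a second match would need as its left edge.
first_span = pattern.search(text).span()

# Starting a fresh search at index 4 (the 'c') does find the second letter.
second = pattern.search(text, 4).group(1)

print(found)       # ['c']
print(first_span)  # (0, 9)
print(second)      # e
```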

User

Answer 3 · edited 2024-09-23 06:32:39

I suggest using lookarounds:

(?<=[A-Z]{3})(?<![A-Z].{3})([a-z])(?=[A-Z]{3})(?!.{3}[A-Z])

This has no problem with overlapping matches.

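Explanation:

(?<=[A-Z]{3})    # three uppercase letters immediately before this position
(?<![A-Z].{3})   # but no uppercase letter four characters back
([a-z])          # capture one lowercase letter
(?=[A-Z]{3})     # three uppercase letters immediately after it
(?!.{3}[A-Z])    # but no uppercase letter right after those three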
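A quick check of the lookaround pattern, as a sketch: the test strings are invented, apart from 'aBBBcDDDeFFFg', which comes from the previous answer. Because lookarounds consume no characters, the uppercase runs can be shared between neighbouring matches.

```python
import re

lookaround = re.compile(
    r"(?<=[A-Z]{3})(?<![A-Z].{3})([a-z])(?=[A-Z]{3})(?!.{3}[A-Z])"
)

# Overlapping case: 'DDD' guards both 'c' and 'e'.
letters = lookaround.findall("aBBBcDDDeFFFg")

# Four uppercase letters on the left: the negative lookbehind rejects 'c'.
too_many = lookaround.findall("aBBBBcDDDe")

print(letters)   # ['c', 'e']
print(too_many)  # []
```

One difference from the earlier pattern: this one only forbids a fourth uppercase letter on each side, so it does not require a lowercase neighbour and can also match next to the start or end of the string.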
