Python regular expression pattern search

import os, re, sys

t = "LOC_Os01g01010.1 GO:0030234 F enzyme regulator activity IEA TAIR:AT3G59570"
k = ['LOC_Os01g01010']

re_search = re.search(re.escape(k[0] + r'.1 GO:\d{7}'), t, re.M | re.I | re.S)
if re_search is None:
    pass
else:
    print(re_search.group())

3 Answers

User

#1 · Edited 2024-09-30 00:36:55

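If a solution exists at all, there are (provably) infinitely many regular expressions that match a finite set of examples among unboundedly many possible strings.

That is a roundabout way of saying you need to be more specific. Given only one example of what you are trying to match, we can offer many different solutions, each depending on further (unstated) assumptions that we add ourselves.

Here are a few, each with its assumption: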

>>> import re
>>> t = "LOC_Os01g01010.1 GO:0030234  F   enzyme regulator activity   IEA     TAIR:AT3G59570"
>>> re.findall(r'\w+\.\d+', t)  # any run of word characters (letters, digits, underscore), followed by a dot and digits
['LOC_Os01g01010.1']
>>> re.findall(r'[A-Z]+_\w+\.\d+', t)  # forcing the token to start with capitals and an underscore
['LOC_Os01g01010.1']
>>> re.findall(r'[A-Z]+_O[a-z01]+\.\d+', t)  # forcing "O", and the middle part to be only lowercase letters, 0s and 1s
['LOC_Os01g01010.1']
>>> re.findall(r'^[A-Z]+_O[a-z01]+\.\d+', t)  # forcing the pattern to be at the beginning of the string
['LOC_Os01g01010.1']
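To make the point about assumptions concrete, here is a rough sketch that runs the same four patterns against a few more inputs. The extra sample strings are made up for illustration and do not come from the question; they only show where the patterns stop agreeing.

```python
import re

# The four patterns from the session above.
patterns = [
    r'\w+\.\d+',
    r'[A-Z]+_\w+\.\d+',
    r'[A-Z]+_O[a-z01]+\.\d+',
    r'^[A-Z]+_O[a-z01]+\.\d+',
]

# Hypothetical inputs (assumptions, not data from the question).
samples = [
    "LOC_Os01g01010.1 GO:0030234",   # same shape as the original example
    "version2.7 LOC_Os01g01010.1",   # token not at the start, plus another word.digits
    "LOC_Os02g35190.2 GO:0005515",   # middle part contains digits other than 0 and 1
]

# Map each pattern to its findall() result for every sample.
results = {p: [re.findall(p, s) for s in samples] for p in patterns}

for p in patterns:
    print(p, results[p])
```

All four agree on the first sample, but only the first two still find the identifier in the third one, and only the first picks up version2.7. Which behaviour is "right" depends entirely on what the real data looks like.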

User

#2 · Edited 2024-09-30 00:36:55

I believe the following will solve your problem:

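import re

t = "LOC_Os01g01010.1 GO:0030234  F   enzyme regulator activity   IEA     TAIR:AT3G59570"
# Match from "LOC_" at the start of a line up to a GO term with seven digits.
my_regex = re.compile(r'^LOC_(.)*GO:\d{7}', re.M | re.I | re.S)
searches = my_regex.search(t)
if searches:
    print(searches.group())  # group() with no argument returns the whole match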
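If you want to keep building the pattern from the key in k, as the question's snippet does, one possible sketch is below. It escapes only the literal key with re.escape and leaves the regex syntax alone; wrapping the whole pattern in re.escape also escapes \d{7}, so it is searched for as literal text. The \.\d+ part for the version suffix and the \s+ separator are my own assumptions about the data format.

```python
import re

t = "LOC_Os01g01010.1 GO:0030234  F   enzyme regulator activity   IEA     TAIR:AT3G59570"
k = ['LOC_Os01g01010']

# Escaping the whole pattern makes \d{7} literal text, so nothing matches.
escaped_all = re.escape(k[0] + r'.1 GO:\d{7}')
no_match = re.search(escaped_all, t)

# Escape only the literal key; keep the regex syntax unescaped.
# Assumption: the key is followed by ".<digits>", whitespace, then a GO term.
pattern = re.escape(k[0]) + r'\.\d+\s+GO:\d{7}'
m = re.search(pattern, t)
found = m.group() if m else None

print(no_match)
print(found)
```

Here no_match is None, while found holds the identifier together with its GO term.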

User

#3 · Edited 2024-09-30 00:36:55

Considering your example, and the expectation that the stars in LOC_********.* can be anything from the set [a-zA-Z0-9], I suggest:

import re

t = "LOC_Os01g01010.1 GO:0030234  F   enzyme regulator activity   IEA      TAIR:AT3G59570"
k = ['LOC_Os01g01010']

# re.I makes [0-9A-Z] match lowercase letters as well.
re_search = re.search("(LOC_[0-9A-Z]*)", t, re.M | re.I | re.S)
if re_search is None:
    pass
else:
    print(re_search.group())

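Running python regexthing.py under Python 2.7 prints LOC_Os01g01010. (LOC_[0-9A-Z]*) is a capture group that captures whatever matches the expression LOC_[0-9A-Z]*; because the re.I flag makes matching case-insensitive, the class [0-9A-Z] also accepts lowercase letters, so it behaves like [0-9A-Za-z]. The expression matches LOC_, LOC_ABCabc123, LOC_a1B2C, and so on. The match stops at the first character outside the class, which is why the .1 suffix is not included.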
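A small sketch of what the capture group gives you, using the same pattern and the example strings mentioned above. The variable names are only for illustration.

```python
import re

pattern = re.compile(r"(LOC_[0-9A-Z]*)", re.I)

t = "LOC_Os01g01010.1 GO:0030234  F   enzyme regulator activity   IEA      TAIR:AT3G59570"
m = pattern.search(t)

whole_match = m.group(0)  # the entire match
first_group = m.group(1)  # the text captured by the parentheses

# The example strings from the explanation above.
examples = ["LOC_", "LOC_ABCabc123", "LOC_a1B2C"]
matched = [pattern.search(s).group(1) for s in examples]

print(whole_match)
print(first_group)
print(matched)
```

Since the parentheses wrap the whole expression, group(0) and group(1) are the same string here; the group only starts to matter once the pattern contains parts you do not want to capture.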

I hope this answers your question.
