<p><code>\\VariantSimple=((?:\([^\)]+\))*) \\Processed=((?:\([^\)]+\))*) ([\s\S]*?)(?:\n*>|$)</code></p>
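<p>This regular expression captures your amino-acid sequences. Once the <code>\Processed</code> data field ends, it captures every character, across line breaks, until it reaches a <code>></code> character (optionally preceded by newlines) or the end of the input. Note that without <code>re.MULTILINE</code>, <code>$</code> matches only at the end of the string, not at the end of every line. This should fit into your Python code.</p>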
<p><a href="https://regex101.com/r/rJf0Np/1/" rel="nofollow noreferrer">Regex demo</a></p>
<p>A sample script looks like this; it matches every amino-acid string it can find in the file and prints each one.</p>
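<pre class="lang-py prettyprint-override"><code>import re

# Read the whole file at once so the pattern can match across line breaks.
with open('data.txt', 'r') as fil:
    data = fil.read()

# Raw string: \\ matches the literal backslash in front of each field name.
# The only capture group is the amino-acid sequence.
rex = re.compile(r"\\VariantSimple=(?:\([^\)]+\))* \\Processed=(?:\([^\)]+\))* ([\s\S]*?)(?:\n*>|$)")

# With a single capture group, findall returns a list of strings.
for mtch in rex.findall(data):
    print(mtch + "\n")
</code></pre>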
<p>Output:</p>
<pre><code>MLSPDLPDSAWNTRLLCRVMLCLLGAGSVAAGVIQSPRHLIKEKRETATLKCYPIPRHDT VYWYQQGPGQDPQFLISFYEKMQSDKGSIPDRFSAQQFSDYHSELNMSSLELGDSALYFC ASSL
MEDSSLSSGVDVDKGFAIAFVVLLFLFLIVMIFRCAKLVKNPYKASSTTTEPSLS
</code></pre>
<p><a href="https://repl.it/repls/BustlingBlondPhysics" rel="nofollow noreferrer">Python demo</a></p>
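<p>If you also need the two annotation fields, the pattern at the top of this answer keeps them in capture groups, and <code>re.findall</code> then returns one tuple per record instead of a plain string. The following is only a minimal sketch of that variant: the <code>SAMPLE</code> text, its header layout, and the <code>parse_records</code> helper are made up for illustration and are not taken from your file. It also strips the spaces and line breaks that can appear inside a captured sequence.</p>

```python
import re

# Three capture groups: VariantSimple entries, Processed entries, sequence.
PATTERN = re.compile(
    r"\\VariantSimple=((?:\([^\)]+\))*) \\Processed=((?:\([^\)]+\))*) ([\s\S]*?)(?:\n*>|$)"
)

# Hypothetical input, invented for this sketch; real headers may differ.
SAMPLE = (
    ">sp|P00001|TEST1 \\VariantSimple=(10|A)(25|G) \\Processed=(1|20|signal) MLSPDLPDSA\n"
    "WNTRLLCRVM\n"
    ">sp|P00002|TEST2 \\VariantSimple=(5|K) \\Processed=(1|15|chain) MEDSSLSSGV\n"
)


def parse_records(text):
    """Return one dict per record found in text (hypothetical helper)."""
    records = []
    # With several capture groups, findall yields a tuple per match.
    for variants, processed, sequence in PATTERN.findall(text):
        records.append({
            "variants": variants,
            "processed": processed,
            # Drop spaces and line breaks inside the captured sequence.
            "sequence": "".join(sequence.split()),
        })
    return records


records = parse_records(SAMPLE)
for rec in records:
    print(rec["variants"], rec["processed"], rec["sequence"])
```

<p>Keeping the raw capture and cleaning the whitespace afterwards leaves the regular expression itself unchanged, so it still behaves the same way as in the regex demo.</p>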