<p>I'd like to point out two important things:</p>
<ol>
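<li><p>Never reveal file paths from your own computer. This applies especially if you come from the scientific community.</p></li>
<li><p>Your code can be more Pythonic if you use the <code>with...as</code> approach.</p></li>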
</ol>
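<p>As a minimal sketch of the <code>with...as</code> point: the file is closed automatically when the block exits, even if an exception is raised inside it. The temporary directory and the <code>demo.txt</code> file name are hypothetical, chosen only so the example runs anywhere.</p>

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Write one sample line; the file is closed when the block ends
    with open(path, "wt") as f:
        f.write("x H3 PRGYV\n")
    print(f.closed)  # True: no explicit f.close() was needed

    # Read it back the same way
    with open(path, "rt") as f:
        content = f.read()

print(repr(content))
```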
<p>Now for the program:</p>
<pre><code>from collections import Counter

filePath = "C:/CDRH3.txt"
AminoAcidsFirst, AminoAcidsLast = [], []  # important! these should be lists

with open(filePath, 'rt') as f:  # rt not r. Explicit is better than implicit
    for line in f:
        line = line.rstrip()
        left, sep, right = line.partition(" H3 ")
        if sep:  # sep is empty when " H3 " does not occur in the line
            AminoAcidsFirst.append(right[0])  # really no need of extra grab=1 variable
            AminoAcidsLast.append(right[-1])  # better than right[-grab:]

print("first ", Counter(AminoAcidsFirst))
print("last ", Counter(AminoAcidsLast))
</code></pre>
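<p>Don't use <code>line.strip()[-1]</code>: the <code>sep</code> check matters, because it confirms that the line actually contains the <code>" H3 "</code> separator before anything is counted.</p>
<p><strong>Output</strong></p>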
<pre><code>first  Counter({'P': 1, 'N': 1})
last  Counter({'V': 2})
</code></pre>
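<p>To illustrate why the <code>sep</code> check matters, here is a small sketch. The helper name <code>first_last_after_sep</code> and both sample lines are hypothetical. A line without the separator is skipped, whereas <code>line.strip()[-1]</code> would silently return an unrelated character.</p>

```python
def first_last_after_sep(line, sep=" H3 "):
    """Return (first, last) characters after sep, or None if sep is absent."""
    left, found, right = line.rstrip().partition(sep)
    if not found or not right:
        return None
    return right[0], right[-1]

# Hypothetical sample lines, not taken from the real data file
good = "seq1 H3 PRGYV\n"
bad = "header line without the marker\n"

print(first_last_after_sep(good))  # ('P', 'V')
print(first_last_after_sep(bad))   # None: the line is skipped
print(bad.strip()[-1])             # 'r': a character that means nothing here
```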
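<p><strong>Note:</strong> The data file may be very large, in which case you could run into memory problems or a hung machine. May I suggest reading it lazily? A more robust program follows; it reads the file in chunks.</p>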
<pre><code>from collections import Counter

filePath = "C:/CDRH3.txt"
AminoAcidsFirst, AminoAcidsLast = [], []  # important! these should be lists

def chunk_read(fileObj, sizeHint=100):
    # readlines(hint) returns whole lines totalling roughly `hint` characters,
    # so the argument is a size hint, not a line count
    while True:
        lines = fileObj.readlines(sizeHint)
        if not lines:  # an empty list means end of file
            break
        yield lines

with open(filePath, 'rt') as f:  # rt not r. Explicit is better than implicit
    for aChunk in chunk_read(f):
        for line in aChunk:
            line = line.rstrip()
            left, sep, right = line.partition(" H3 ")
            if sep:
                AminoAcidsFirst.append(right[0])  # really no need of extra grab=1 variable
                AminoAcidsLast.append(right[-1])  # better than right[-grab:]

print("first ", Counter(AminoAcidsFirst))
print("last ", Counter(AminoAcidsLast))
</code></pre>
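<p>If memory is the real concern, one further option is to update the <code>Counter</code> objects directly while iterating, so that no per-line lists are kept at all. This is my own suggestion and not part of the program above. The function name <code>count_first_last</code> and the in-memory sample data are hypothetical.</p>

```python
import io
from collections import Counter

def count_first_last(lines, sep=" H3 "):
    """Count first and last characters after sep without storing them in lists."""
    first, last = Counter(), Counter()
    for line in lines:  # iterating a file object reads one line at a time
        left, found, right = line.rstrip().partition(sep)
        if found and right:
            first[right[0]] += 1
            last[right[-1]] += 1
    return first, last

# Hypothetical sample standing in for the real data file
sample = io.StringIO("a H3 PRGYV\nb H3 NQV\nno marker here\n")
first, last = count_first_last(sample)
print("first", first)
print("last", last)
```

<p>With a real file, pass the open file object instead of the <code>io.StringIO</code> sample; memory use then depends on the number of distinct characters, not on the number of lines.</p>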