How do I count the characters that appear in a partition in Python?

f = open("C:/CDRH3.txt", "r")
from collections import Counter
grab = 1
for line in f:
    line=line.rstrip()
    left,sep,right=line.partition(" H3 ")
    if sep:
        AminoAcidsFirst = right[:grab]
        AminoAcidsLast = right[-grab:]
print ("first ",Counter(line[:] for line in AminoAcidsFirst))
print ("last ",Counter(line[:] for line in AminoAcidsLast))
f.close()

3 Answers

User
#1 · edited 2024-09-29 23:18:38

Create two empty lists and append to them on every pass through the loop, like this:

f = open("C:/CDRH3.txt", "r")
from collections import Counter
grab = 1
AminoAcidsFirst = []
AminoAcidsLast = []
for line in f:
    line=line.rstrip()
    left,sep,right=line.partition(" H3 ")
    if sep:
        AminoAcidsFirst.append(right[:grab])
        AminoAcidsLast.append(right[-grab:])
print ("first ",Counter(line[:] for line in AminoAcidsFirst))
print ("last ",Counter(line[:] for line in AminoAcidsLast))
f.close()

Here:
Create the empty lists:

AminoAcidsFirst = []
AminoAcidsLast = []

Append on every iteration:

AminoAcidsFirst.append(right[:grab])
AminoAcidsLast.append(right[-grab:])
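
To see the accumulate-then-count idea without touching a file, here is a minimal sketch on in-memory data. The sample lines are hypothetical (the thread never shows the contents of CDRH3.txt), and the names `sample_lines`, `firsts`, and `lasts` are made up for illustration. Note that a list can be passed to `Counter` directly.

```python
from collections import Counter

# Hypothetical sample lines; the real file contents are not shown in the thread.
sample_lines = [
    "seq1 H3 PGTV\n",
    "seq2 H3 NAAV\n",
    "seq3 L1 QQQQ\n",  # no " H3 " marker, so this line is skipped
]

grab = 1
firsts, lasts = [], []
for line in sample_lines:
    line = line.rstrip()
    left, sep, right = line.partition(" H3 ")
    if sep:  # sep is empty when the marker was not found
        firsts.append(right[:grab])   # accumulate instead of overwriting
        lasts.append(right[-grab:])

# Counter accepts any iterable, so the lists can be passed as they are.
first_counts = Counter(firsts)
last_counts = Counter(lasts)
print("first", first_counts)
print("last", last_counts)
```

Because the lists keep one entry per matching line, the counts cover every line rather than only the last one read.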

User
#2 · edited 2024-09-29 23:18:38

You don't need a Counter: just grab the last token after splitting, then count its first and last characters:

# f is the open file object from the question
first_counter = {}
last_counter = {}
for line in f:
    line=line.split()[-1] # grab the last token
    first_counter[line[0]] = first_counter.get(line[0], 0) + 1
    last_counter[line[-1]] = last_counter.get(line[-1], 0) + 1
print("first ", first_counter)
print("last ", last_counter)

Output

first  {'P': 1, 'N': 1}
last  {'V': 2}

User
#3 · edited 2024-09-29 23:18:38

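I want to point out two important things

Never disclose file paths from your computer; this applies especially if you work in the scientific community
Using with ... as makes your code more Pythonic

Now the program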

from collections import Counter

filePath = "C:/CDRH3.txt"
AminoAcidsFirst, AminoAcidsLast = [], [] # important! these should be lists

with open(filePath, 'rt') as f:  # rt not r. Explicit is better than implicit
    for line in f:
        line = line.rstrip()
        left, sep, right = line.partition(" H3 ")
        if sep:
            AminoAcidsFirst.append( right[0] ) # really no need of extra grab=1 variable
            AminoAcidsLast.append( right[-1] ) # better than right[-grab:]
print ("first ",Counter(AminoAcidsFirst))
print ("last ",Counter(AminoAcidsLast))

Don't do line.strip()[-1], because the sep check is important

Output

first  Counter({'P': 1, 'N': 1})
last  Counter({'V': 2})
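
The remark about the sep check can be illustrated with a small sketch. It compares an approach that always takes the last whitespace-separated token with one that uses `str.partition` and checks the separator. The two sample lines and the helper names `last_token` and `after_marker` are hypothetical.

```python
# Hypothetical lines: one with the " H3 " marker, one without it.
with_marker = "seq1 H3 PGTV"
without_marker = "seq3 L1 QQQQ"

def last_token(line):
    # Takes the last whitespace-separated token whether or not the marker is present.
    return line.split()[-1]

def after_marker(line, marker=" H3 "):
    # str.partition returns an empty separator when the marker is absent,
    # so lines without the marker can be told apart and skipped.
    left, sep, right = line.partition(marker)
    return right if sep else None

print(last_token(with_marker), after_marker(with_marker))
print(last_token(without_marker), after_marker(without_marker))
```

On the line without the marker, the unconditional version still returns a token, so that line would be counted by mistake; the partition version returns `None` and the caller can skip it.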

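Note: the data file may be very large, and you may run into memory problems or a hung machine. So may I suggest reading it lazily? A more robust program follows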

from collections import Counter

filePath = "C:/CDRH3.txt"
AminoAcidsFirst, AminoAcidsLast = [], [] # important! these should be lists

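def chunk_read(fileObj, sizeHint = 100):
    # readlines(hint) takes an approximate size in bytes/characters, not a line count
    while True:
        lines = fileObj.readlines(sizeHint)
        if not lines: # end of file reached
            break
        yield lines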

with open(filePath, 'rt') as f:  # rt not r. Explicit is better than implicit
    for aChunk in chunk_read(f):
        for line in aChunk:
            line = line.rstrip()
            left, sep, right = line.partition(" H3 ")
            if sep:
                AminoAcidsFirst.append( right[0] ) # really no need of extra grab=1 variable
                AminoAcidsLast.append( right[-1] ) # better than right[-grab:]
print ("first ",Counter(AminoAcidsFirst))
print ("last ",Counter(AminoAcidsLast))
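
As an alternative sketch of lazy reading: iterating over a file object already yields one line at a time, so the two lists can be dropped and the counters updated directly. The helper name `count_first_last` and the sample text are hypothetical, and `io.StringIO` stands in for the opened file so the example runs without CDRH3.txt.

```python
from collections import Counter
import io

def count_first_last(lines, marker=" H3 "):
    """Count first and last characters of the text after `marker`, line by line."""
    first_counts, last_counts = Counter(), Counter()
    for line in lines:  # a file object is consumed lazily, one line at a time
        left, sep, right = line.rstrip().partition(marker)
        if sep and right:  # skip lines without the marker or with nothing after it
            first_counts[right[0]] += 1
            last_counts[right[-1]] += 1
    return first_counts, last_counts

# Hypothetical data; io.StringIO behaves like an open text file here.
fake_file = io.StringIO("seq1 H3 PGTV\nseq2 H3 NAAV\nseq3 L1 QQQQ\n")
first_counts, last_counts = count_first_last(fake_file)
print("first", first_counts)
print("last", last_counts)
```

With a real file, the call would look like `with open(filePath, 'rt') as f: count_first_last(f)`. Memory use then depends on the number of distinct characters, not on the number of lines.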
