Python outfile does not write all the content

Posted 2024-06-15 09:31:54


I have code that lets me carry out a fairly complicated task (at least for me):

import csv
import os.path
#open files + readlines
with open("C:/Users/Ivan Wong/Desktop/Placement/Lists of targets/Mouse/UCSC to Ensembl.csv", "r") as f:
    reader = csv.reader(f, delimiter = ',')
    #find files with the name in 1st row
    for row in reader:
        graph_filename = os.path.join("C:/Python27/Scripts/My scripts/Top targets",row[0]+"_nt_counts.txt.png")
        if os.path.exists(graph_filename):
            y = row[0]+'_nt_counts.txt'  
            r = open('C:/Users/Ivan Wong/Desktop/Placement/fp_mesc_nochx/'+y, 'r')
            k = r.readlines()
            r.close()
            del k[:1]
            k = map(lambda s: s.strip(), k)
            interger = map(int, k)   
            import itertools
            #adding the numbers for every 3 rows
            def grouper(n, iterable, fillvalue=None):
                "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
                args = [iter(iterable)] * n
                return itertools.izip_longest(*args, fillvalue=fillvalue)
            result = map(sum, grouper(3, interger, 0))       
            e = row[0]
            print e
            cDNA = open('C:/Users/Ivan Wong/Desktop/Placement/Downloaded seq/Mouse/MOUSE_mRNAs.txt', 'r')
            seq = cDNA.readlines()
            # get all lines that have a gene name
            lineNum = 0;
            lineGenes = []
            for line in seq:
                lineNum = lineNum +1
                if '>' in line:
                    lineGenes.append(str(lineNum))
                if '>'+e in line:
                    lineBegin = lineNum

            cDNA.close()

            # which gene is this
            index1 = lineGenes.index(str(lineBegin))
            lineEnd = lineGenes[index1+1]           
# linebegin and lineEnd now give you, where to look for your sequence, all that 
# you have to do is to read the lines between lineBegin and lineEnd in the file
# and make it into a single string.            
            lineEnd = lineGenes[index1+1]
            Lastline = int(lineEnd) -1

# in your code you have already made a list with all the lines (q), first delete
# \n and other symbols, then combine all lines into a big string of nucleotides (like this)     
            qq = seq[lineBegin:Lastline]
            qq = map(lambda s: s.strip(), qq)
            string  = ''
            for i in range(len(qq)):
                string = string + qq[i]
# now you want to get a list of triplets, again you can use the for loop:
# first get the length of the string
            lenString = len(string);
# this is your list codons
            listCodon = []
            for i in range(0,lenString/3): 
                listCodon.append(string[0+i*3:3+i*3])
            proper_result = '\n'.join('%s, %s' % (nr, codon) for nr, codon in zip(result, listCodon))
            with open(e+'.csv','wb') as outfile:
                outfile.writelines(proper_result)

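This code reads names from a .csv, looks in a folder for files with the same names and, if they exist, goes on to process some data and write it to a .csv. With it, my outfile now looks like this: [image: outfile]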

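It looks fine, but there is one problem: I know from my data (I checked it a different way) that the second column should be longer than what I get. I think it is because the code only writes a row where both result (the numbers) and listCodon (the letters) exist, so I am losing some entries. How can I fix it?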

I tried printing listCodon before writing the file and found that all the triplets are still there, so I am guessing the problem is here:

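proper_result = '\n'.join('%s, %s' % (nr, codon) for nr, codon in zip(result, listCodon))
with open(e+'.csv','wb') as outfile:
    outfile.writelines(proper_result)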

1 answer
User
Answer 1 · posted 2024-06-15 09:31:54

zip() stops as soon as any one of its iterables is exhausted (otherwise it would not know what to fill the gaps with!):

The returned list is truncated in length to the length of the shortest argument sequence.
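
A minimal sketch of that truncation, using made-up values in place of the question's result and listCodon lists (the numbers and codons below are illustrative only, not taken from the real data):

```python
# Hypothetical stand-ins for the question's `result` and `listCodon`.
result = [5, 0, 12]                       # three summed counts
listCodon = ["ATG", "GCC", "TTA", "TGA"]  # four codons

# zip() stops at the shorter input, so the fourth codon is silently dropped.
pairs = list(zip(result, listCodon))

print(len(pairs))   # 3
print(pairs[-1])    # (12, 'TTA')
```

Nothing is wrong with listCodon itself, which matches what the asker saw when printing it; the missing entries disappear only at the pairing step.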

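If you want to pad the shorter iterable out to the length of the longest one, use itertools.izip_longest (named itertools.zip_longest in Python 3); it accepts an optional argument, fillvalue, to use as the padding.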
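
As a sketch of how the padded pairing could look for the code in the question: the example below pairs two lists of unequal length and joins them the same way proper_result is built. The sample values and the fill value of 0 are assumptions for illustration, and the import fallback is there only so the snippet runs on both Python 2 and Python 3.

```python
try:
    from itertools import izip_longest as zip_longest  # Python 2 name
except ImportError:
    from itertools import zip_longest  # Python 3 name

# Hypothetical stand-ins for the question's `result` and `listCodon`.
result = [5, 0, 12]
listCodon = ["ATG", "GCC", "TTA", "TGA"]

# Pad the shorter list with a fill value instead of truncating.
# fillvalue=0 is an assumption; use whatever marks "no data" in your output.
rows = list(zip_longest(result, listCodon, fillvalue=0))

# Same formatting as the question's proper_result line.
proper_result = "\n".join("%s, %s" % (nr, codon) for nr, codon in rows)
print(proper_result)
```

Note that a single fillvalue pads whichever list turns out shorter, so if the two columns need different placeholders it may be simpler to pad each list separately before pairing them.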
