Appending to a DataFrame

Posted 2024-06-16 10:49:23


I want to build a DataFrame with two columns: read_id and score.

I am using the following code:

    reads_array = []
    for x in Bio.SeqIO.parse("inp.fasta", "fasta"):
        reads_array.append(x)

    columns = ["read_id", "score"]
    df = pd.DataFrame(columns=columns)
    df = df.fillna(0)

    for x in reads_array:
        alignments = pairwise2.align.globalms("ACTTGAT", str(x.seq), 2, -1, -.5, -.1)
        sorted_alignments = sorted(alignments, key=operator.itemgetter(2), reverse=True)
        read_id = x.name
        score = sorted_alignments[0][2]
        df['read_id'] = read_id
        df['score'] = score

But this does not work. Can you suggest a way to build the DataFrame df?


2 Answers

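df['read_id'] and df['score'] are Series, that is, whole columns. So if you want to iterate over reads_array, compute a value for each read, and assign it to a single row of those columns, try the following: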

for i, x in enumerate(reads_array):
    ...
    # .ix was removed in pandas 1.0; .loc with a (row, column) pair sets one cell
    df.loc[i, 'read_id'] = read_id
    df.loc[i, 'score'] = score
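
If per-cell assignment is not required, another common pattern is to collect one record per read in a plain list and build the DataFrame once at the end. This is only a minimal sketch: the `results` list is a hypothetical stand-in for the (x.name, best alignment score) pairs computed in the question's loop, so it runs without Biopython.

```python
import pandas as pd

# Hypothetical stand-in for the per-read results; in the question these
# values would come from x.name and the best pairwise2 alignment score.
results = [("read_1", 11.5), ("read_2", 8.0), ("read_3", 14.0)]

rows = []
for read_id, score in results:
    # One dict per read; the keys become the column names.
    rows.append({"read_id": read_id, "score": score})

# Build the frame in a single step instead of growing it row by row.
df = pd.DataFrame(rows, columns=["read_id", "score"])
```

Building the frame once avoids repeatedly enlarging it inside the loop and lets pandas infer a suitable dtype for each column.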

Make sure you have this at the top:

import numpy as np

Then replace the code you shared with:

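reads_array = []
for x in Bio.SeqIO.parse("inp.fasta", "fasta"):
    reads_array.append(x)

# Preallocate one row per read: empty strings for the IDs, zeros for the scores
# (a frame filled only with float zeros cannot cleanly hold string IDs)
df = pd.DataFrame({"read_id": [""] * len(reads_array),
                   "score": np.zeros(len(reads_array))})

for index, x in enumerate(reads_array):
    alignments = pairwise2.align.globalms("ACTTGAT", str(x.seq), 2, -1, -.5, -.1)
    # Sort by alignment score (position 2 of each alignment), best first
    sorted_alignments = sorted(alignments, key=operator.itemgetter(2), reverse=True)
    read_id = x.name
    score = sorted_alignments[0][2]
    # .loc[row, column] writes a single cell
    df.loc[index, 'read_id'] = read_id
    df.loc[index, 'score'] = score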

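The original code has two main problems:

1) Your DataFrame has 0 rows, so there are no cells to write into.

2) df['column_name'] refers to the entire column, not a single cell, so df['column_name'] = value sets every cell in that column to that value.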
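
Both problems can be reproduced on toy data. The sketch below uses made-up frames named `empty` and `filled` (not part of the original code) to show what column assignment does and how `.loc` differs.

```python
import pandas as pd

# Problem 1: a frame created with only column names has no rows,
# so broadcasting a scalar over the column stores nothing.
empty = pd.DataFrame(columns=["read_id", "score"])
empty["read_id"] = "read_1"
n_rows_after_assignment = len(empty)  # still 0

# Problem 2: assigning a scalar to a column overwrites every row.
filled = pd.DataFrame({"read_id": ["a", "b", "c"], "score": [1.0, 2.0, 3.0]})
filled["score"] = 9.5
all_overwritten = filled["score"].tolist()  # every entry is now 9.5

# .loc with a row label and a column name targets exactly one cell.
filled.loc[1, "score"] = 4.0
one_cell_changed = filled["score"].tolist()
```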
