Pandas DataFrame export to Excel results in TypeE

Posted 2024-09-30 08:25:25

I am trying to export a DataFrame from pandas to Excel like this:

writer = pd.io.excel.ExcelWriter(args.out_file, engine='xlsxwriter', options={'constant_memory': True})
summary_data.to_excel(writer, sheet_name='summary', na_rep='NA', index=False)

But I get this message:

^{pr2}$

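As far as I can tell there is nothing wrong with my DataFrame, so this error message confuses me. The export works when the DataFrame has fewer than 1000 rows, but the error appears as soon as the DataFrame gets larger.

Any ideas?

Thanks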

Update: output of summary_data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2176 entries, 0 to 2175
Data columns (total 27 columns):
chrom                                   2176 non-null object
coord                                   2176 non-null int64
ref_base                                2176 non-null object
var_base                                2176 non-null object
normal_ref_counts                       2176 non-null int64
normal_var_counts                       2176 non-null int64
VOA867-A1_S43_merged_ref_counts         2176 non-null object
VOA867-A1_S43_merged_var_counts         2176 non-null object
VOA867-A1_S43_merged_somatic_status     2176 non-null object
VOA867-E02_S73_merged_ref_counts        2176 non-null object
VOA867-E02_S73_merged_var_counts        2176 non-null object
VOA867-E02_S73_merged_somatic_status    2176 non-null object
VOA867-F03_S76_merged_ref_counts        2176 non-null object
VOA867-F03_S76_merged_var_counts        2176 non-null object
VOA867-F03_S76_merged_somatic_status    2176 non-null object
VOA867-F04_S75_merged_ref_counts        2176 non-null object
VOA867-F04_S75_merged_var_counts        2176 non-null object
VOA867-F04_S75_merged_somatic_status    2176 non-null object
VOA867-F09_S74_merged_ref_counts        2176 non-null object
VOA867-F09_S74_merged_var_counts        2176 non-null object
VOA867-F09_S74_merged_somatic_status    2176 non-null object
VOA867-T_S41_merged_ref_counts          2176 non-null object
VOA867-T_S41_merged_var_counts          2176 non-null object
VOA867-T_S41_merged_somatic_status      2176 non-null object
VOA867xeno_S18_merged_ref_counts        2176 non-null object
VOA867xeno_S18_merged_var_counts        2176 non-null object
VOA867xeno_S18_merged_somatic_status    2176 non-null object
dtypes: int64(3), object(24)None

Here is the function that generates it:

def get_summary_data(data, normal_sample):
    summary_data = []
    for index, normal_row in data[normal_sample].iterrows():
        out_row = {'chrom': index[0],
                   'coord': index[1],
                   'ref_base': normal_row['ref_base'],
                   'var_base': normal_row['var_base'],
                   'normal_ref_counts': normal_row['ref_counts'],
                   'normal_var_counts': normal_row['var_counts'],
                   }

        normal_variant_status = normal_row['variant_status']

        normal_depth = out_row['normal_ref_counts'] + out_row['normal_var_counts']

        if normal_depth > 0:
            normal_var_freq = out_row['normal_var_counts'] / normal_depth
        else:
            normal_var_freq = 0

        for sample in data:
            if sample == normal_sample:
                 continue

            sample_row = data[sample].ix[[index]]

            out_row['{0}_ref_counts'.format(sample)] = sample_row['ref_counts']

            out_row['{0}_var_counts'.format(sample)] = sample_row['var_counts']

            sample_variant_status = str(sample_row['variant_status'].iget(0))

            sample_somatic_status = call_somatic_status(normal_variant_status,
                                                        sample_variant_status,
                                                        normal_var_freq,
                                                        args.min_normal_germline_var_freq)

            out_row['{0}_somatic_status'.format(sample)] = sample_somatic_status

        summary_data.append(out_row)

    columns = ['chrom', 'coord', 'ref_base', 'var_base', 'normal_ref_counts', 'normal_var_counts']

    for sample in data:
        if sample == normal_sample:
            continue

        columns.append('{0}_ref_counts'.format(sample))

        columns.append('{0}_var_counts'.format(sample))

        columns.append('{0}_somatic_status'.format(sample))

    summary_data = pd.DataFrame(summary_data, columns=columns)

    return summary_data

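The counts should be ints, but I can see that they are treated as strings (object dtype) here, perhaps because they are extracted from another DataFrame?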
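One possible explanation, offered as an assumption rather than a confirmed diagnosis: a lookup with a list of labels returns a one-row DataFrame, so `sample_row['ref_counts']` is a Series, not an integer, and storing it in a cell forces the column to object dtype. The sketch below uses a hypothetical two-row table and `.loc` in place of `.ix`, which newer pandas versions no longer provide.

```python
import pandas as pd

# Hypothetical per-sample table indexed by (chrom, coord), mirroring data[sample].
idx = pd.MultiIndex.from_tuples([("1", 100), ("1", 200)], names=["chrom", "coord"])
sample = pd.DataFrame({"ref_counts": [5, 7], "var_counts": [1, 0]}, index=idx)

key = ("1", 100)

# A list-of-labels lookup yields a one-row DataFrame, so the column is a Series.
as_series = sample.loc[[key]]["ref_counts"]

# Taking the first element gives a plain integer instead.
as_scalar = sample.loc[[key]]["ref_counts"].iloc[0]

# Build rows both ways, the same way get_summary_data builds out_row dicts.
rows_bad = [{"ref_counts": sample.loc[[k]]["ref_counts"]} for k in idx]
rows_ok = [{"ref_counts": sample.loc[[k]]["ref_counts"].iloc[0]} for k in idx]

bad = pd.DataFrame(rows_bad)
ok = pd.DataFrame(rows_ok)

print(bad["ref_counts"].dtype)  # cells hold Series objects
print(ok["ref_counts"].dtype)   # cells hold integers
```

If this is what happens in get_summary_data, extracting the scalar (as the function already does for variant_status) would keep the count columns numeric.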


1 answer
User
#1 · Posted 2024-09-30 08:25:25

In this case, .to_excel seems to accept only columns of object dtype. A quick way around the problem is to cast all columns to object before writing:

summary_data = summary_data.astype(object)

After that you can write it without crashing:

^{pr2}$

There is some fiddling to do here, because in some cases I had to copy columns as object type. Weird. Another option is to drop the offending columns.
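The answer does not say how to identify the offending columns. A minimal sketch, assuming they are the columns whose cells hold something other than a scalar value (for example a Series); the helper name `offending_columns` and the demo frame are hypothetical:

```python
import pandas as pd

def offending_columns(df):
    """Return the names of columns with at least one non-scalar cell."""
    return [
        col for col in df.columns
        if not df[col].map(pd.api.types.is_scalar).all()
    ]

# Hypothetical frame: the ref_counts cells hold Series objects instead of ints.
demo = pd.DataFrame([
    {"coord": 100, "ref_counts": pd.Series([5])},
    {"coord": 200, "ref_counts": pd.Series([7])},
])

bad_cols = offending_columns(demo)
trimmed = demo.drop(columns=bad_cols)  # drop them before calling to_excel

print(bad_cols)
print(list(trimmed.columns))
```

Dropping loses the data in those columns, so this only makes sense when they are not needed in the spreadsheet.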
