Adding a MultiIndex to columns

Posted 2024-10-05 19:23:33


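I am trying to concatenate several files and write the result to an Excel file. My plan is to read the data into a DataFrame, do some calculations, and then write the data to an Excel sheet. I want to add a second label to my DataFrame that indicates which file each column came from. I believe a MultiIndex is the way to go, but I am not sure how to add one.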

Example of the current DataFrame:

    readout    readout
0    1.098      4.514
1    3.185      2.124 
2    0.938      0.369
3    5.283      7.840

Example of the expected DataFrame:

    file_1     file_2
    readout    readout
0    1.098      4.514
1    3.185      2.124 
2    0.938      0.369
3    5.283      7.840

This is the code I am currently using.

import pandas as pd
import scipy.stats

# import excel sheet into dataframe
well_reads = pd.read_excel('File.xls', header=0)

# pull positive control and negative control samples into new dataframe
positive_control = well_reads[well_reads['Well'].str.contains('01')]
negative_control = well_reads[well_reads['Well'].str.contains('12')]

# drop positive control and negative control rows from initial dataframe
positive_control_wells = well_reads[well_reads['Well'].str.contains('01')]
index = positive_control_wells.index
well_reads = well_reads.drop(well_reads.index[index])
well_reads = well_reads.reset_index(drop=True)

negative_control_wells = well_reads[well_reads['Well'].str.contains('12')]
index = negative_control_wells.index
well_reads = well_reads.drop(well_reads.index[index])
well_reads = well_reads.reset_index(drop=True)

# Create data frame just containing reads and well id
neutralization_data = well_reads[['CPS (CPS)', 'Well']]

# set index to well id
neutralization_data = neutralization_data.set_index(['Well'])

# identify the geometric mean of the plate
geomean = scipy.stats.gmean(well_reads['CPS (CPS)'])

# identify the IC50 of the plate
IC_50 = geomean/2

# identify the IC80 of the plate
IC_80 = geomean * 0.2


# create a pandas excel writer using xlsxwriter as the engine
writer = pd.ExcelWriter('neutralization data.xlsx', engine='xlsxwriter')

# convert the dataframe to an xlsxwriter excel object
neutralization_data.to_excel(writer, sheet_name='Neutralization Data', startrow=1)

# close the pandas excel writer and output the file
# (writer.save() was removed in pandas 2.0; close() writes the file)
writer.close()

1 answer
User
Reply #1 · Posted 2024-10-05 19:23:33

As you said, adding MultiIndex columns before you write the output will solve your problem:

import pandas as pd

df = pd.DataFrame({0: [1.098, 3.185, 0.938, 5.283], 1: [4.514, 2.124, 0.369, 7.840]})
df.columns = pd.MultiIndex.from_tuples([('file1', 'readout'), ('file2', 'readout')])

which gives:

    file1   file2
  readout readout
0   1.098   4.514
1   3.185   2.124
2   0.938   0.369
3   5.283   7.840
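
If each file is read into its own DataFrame, the file-level label can probably also be attached while concatenating, rather than by renaming the columns afterwards. The following is only a sketch under assumptions: the two small frames stand in for data that would really come from `pd.read_excel`, and the labels `file_1` and `file_2` are placeholders taken from the expected output in the question.

```python
import pandas as pd

# Hypothetical stand-ins for the per-file data; in practice each frame
# would come from something like pd.read_excel(...) on one input file.
file_1 = pd.DataFrame({"readout": [1.098, 3.185, 0.938, 5.283]})
file_2 = pd.DataFrame({"readout": [4.514, 2.124, 0.369, 7.840]})

# Concatenating side by side (axis=1) with `keys` adds an outer column
# level, so each original column is labeled with its source file.
combined = pd.concat([file_1, file_2], axis=1, keys=["file_1", "file_2"])

print(combined)
```

A single file's columns can then be selected with `combined["file_1"]`, and one specific column with `combined[("file_2", "readout")]`.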
