Computing the mean of consecutive columns

name   x  y  gh_00hr_bio_rep1  gh_00hr_bio_rep2  gh_00hr_bio_rep3  gh_06hr_bio_rep1
gene1  x  y                 2                 3                 2                 1
gene2  x  y                 5                 7                 6                 2

2 Answers

User

Answer 1 · edited 2024-06-28 20:06:58

You're close:

df['avg'] = df.iloc[:, 2:].mean(axis=1)

You'll get this:

       x  y  gh_00hr_bio_rep1  gh_00hr_bio_rep2  gh_00hr_bio_rep3  gh_06hr_bio_rep1  avg
gene1  x  y                 2                 3                 2                 1  2.0
gene2  x  y                 5                 7                 6                 2  5.0
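For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes `name` is the index (as the output above suggests), so that `x` and `y` sit at positions 0 and 1 and the numeric columns start at position 2. The DataFrame is rebuilt by hand from the question's sample data.

```python
import pandas as pd

# Sample data from the question; `name` is used as the index (an assumption
# based on the printed output above).
df = pd.DataFrame(
    {
        "x": ["x", "x"],
        "y": ["y", "y"],
        "gh_00hr_bio_rep1": [2, 5],
        "gh_00hr_bio_rep2": [3, 7],
        "gh_00hr_bio_rep3": [2, 6],
        "gh_06hr_bio_rep1": [1, 2],
    },
    index=["gene1", "gene2"],
)

# Row-wise mean over every column from position 2 onward.
df["avg"] = df.iloc[:, 2:].mean(axis=1)
print(df)
```

If `name` is a regular column instead of the index, the numeric columns start at position 3, so the slice would be `df.iloc[:, 3:]`.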

If you want averages over several different sets of columns, you can do something like this:

for col in range(10):
    df['avg%i' % col] = df.iloc[:, 2+col*5:7+col*5].mean(axis=1)

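This works as long as every average spans the same number of columns (5 in this loop). Otherwise, you may need to select the rep columns by name, depending on what your data looks like.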
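A rough sketch of the name-based variant, assuming every replicate column ends in `_rep<number>` and that the part before it identifies the group. The `_avg` suffix and the helper names `rep_cols` and `prefixes` are made up for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame(
    {
        "x": ["x", "x"],
        "y": ["y", "y"],
        "gh_00hr_bio_rep1": [2, 5],
        "gh_00hr_bio_rep2": [3, 7],
        "gh_00hr_bio_rep3": [2, 6],
        "gh_06hr_bio_rep1": [1, 2],
    },
    index=["gene1", "gene2"],
)

# Collect the replicate columns once, before any average column is added.
rep_cols = [c for c in df.columns if re.search(r"_rep\d+$", c)]

# Group prefixes in order of first appearance, e.g. "gh_00hr_bio".
prefixes = list(dict.fromkeys(c.rsplit("_", 1)[0] for c in rep_cols))

for prefix in prefixes:
    cols = [c for c in rep_cols if c.rsplit("_", 1)[0] == prefix]
    # Groups may have different numbers of replicates.
    df[prefix + "_avg"] = df[cols].mean(axis=1)

print(df[[p + "_avg" for p in prefixes]])
```

Because the columns are picked by name, it does not matter how many replicates each group has or where the columns sit in the frame.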

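User

Answer 2 · edited 2024-06-28 20:06:58

I would first build a Series that maps each original column to its final name, indexed by the original columns: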

names = pd.Series(['_'.join(i.split('_')[:-1]) for i in df.columns[3:]],
                  index = df.columns[3:])

Then I would use it to take the mean with a groupby along axis 1:

tmp = df.iloc[:, 3:].groupby(names, axis=1).agg('mean')

This gives a new DataFrame with the same index as the original and one column per average:

   gh_00hr_bio  gh_06hr_bio
0     2.333333          1.0
1     6.000000          2.0

You can then concatenate it horizontally with the original DataFrame, or with just its first 3 columns:

result = pd.concat([df.iloc[:, :3], tmp], axis=1)

To get:

    name  x  y  gh_00hr_bio  gh_06hr_bio
0  gene1  x  y     2.333333          1.0
1  gene2  x  y     6.000000          2.0
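One caveat, as far as I know: `groupby(..., axis=1)` is deprecated in recent pandas releases, and the suggested replacement is to transpose, group, and transpose back. Below is a sketch of the same steps written that way. It assumes `name` is a regular column (so the numeric columns start at position 3) and rebuilds the sample data by hand.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["gene1", "gene2"],
        "x": ["x", "x"],
        "y": ["y", "y"],
        "gh_00hr_bio_rep1": [2, 5],
        "gh_00hr_bio_rep2": [3, 7],
        "gh_00hr_bio_rep3": [2, 6],
        "gh_06hr_bio_rep1": [1, 2],
    }
)

# Map each replicate column to its group name by dropping the "_repN" part.
names = pd.Series(
    ["_".join(c.split("_")[:-1]) for c in df.columns[3:]],
    index=df.columns[3:],
)

# Transpose so the replicate columns become rows, group those rows by the
# mapped name, average, then transpose back.
tmp = df.iloc[:, 3:].T.groupby(names).mean().T

# Put the identifying columns back in front of the averages.
result = pd.concat([df.iloc[:, :3], tmp], axis=1)
print(result)
```

The transpose is cheap here because all replicate columns are numeric; with mixed dtypes it would upcast everything to `object`, so it is worth slicing out the numeric block first, as done above.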
