Find the mean of columns with matching column names

x  y  ghb_00hr_rep1  ghb_00hr_rep2  ghb_00hr_rep3  ghl_06hr_rep1  ghl_06hr_rep2
x  y              2              3              2              1              3
x  y              5              7              6              2              1

name = pd.Series(['_'.join(i.split('_')[:-1]) for i in df.columns[3:]],
                 index=df.columns[3:])
temp = df.groupby(name, axis=1).agg('mean')
avg = pd.concat([df.iloc[:, :3], temp], axis=1)

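2 Answers

User

Answer 1 · edited 2024-06-28 20:16:45

One option is to group the columns by level=0, which averages columns that share exactly the same name: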

(df.set_index(['name','x','y'])
   .groupby(level=0, axis=1)
   .mean().reset_index()
)

Output:

    name  x  y  ghb_00hr  ghl_06hr
0  gene1  x  y  2.333333       2.0
1  gene2  x  y  6.000000       1.5

Update, for the modified question:

d = df.filter(like='gh')
# or d = df.iloc[:, 2:]
# depending on your columns of interest

names = d.columns.str.rsplit('_', n=1).str[0]

d.groupby(names, axis=1).mean()

Output:

   ghb_00hr  ghl_06hr
0  2.333333       2.0
1  6.000000       1.5
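Note that `groupby(..., axis=1)` is deprecated in recent pandas releases (2.1 onward, as far as I know). A minimal sketch of the same computation without `axis=1` is to transpose, group the rows, and transpose back. The sample frame below is rebuilt from the question's table, and the names `means` and `avg` are only illustrative:

```python
import pandas as pd

# Sample data reconstructed from the question (two label columns, five replicates)
df = pd.DataFrame({
    'x': ['x', 'x'],
    'y': ['y', 'y'],
    'ghb_00hr_rep1': [2, 5],
    'ghb_00hr_rep2': [3, 7],
    'ghb_00hr_rep3': [2, 6],
    'ghl_06hr_rep1': [1, 2],
    'ghl_06hr_rep2': [3, 1],
})

d = df.filter(like='gh')
# Strip the trailing "_repN" so replicates share one group key
names = d.columns.str.rsplit('_', n=1).str[0]

# Transpose so replicate columns become rows, group those rows by prefix,
# then transpose back to the original orientation
means = d.T.groupby(names).mean().T

# Re-attach the leading label columns
avg = pd.concat([df[['x', 'y']], means], axis=1)
print(avg)
```

The transpose copies the data, so for very wide or very large frames this may be slower than a column-wise approach, but for a handful of replicate columns it should not matter.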

User

Answer 2 · edited 2024-06-28 20:16:45

You can convert df.columns to a set and then iterate over it:

import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=['a', 'a', 'a', 'b', 'b', 'b'])

# df[column] selects every column that carries this name
for column in set(df.columns):
    print(column, df[column].mean(axis=1))

This will output:

a 0    2.0
dtype: float64
b 0    5.0
dtype: float64

If the order matters, use sorted:

for column in sorted(set(df.columns)):
    print(column, df[column].mean(axis=1))

From here you can build the output in whatever format you need.
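For instance, the per-name means can be collected into a single DataFrame with a dict comprehension. This is only a sketch: `result` is an illustrative name, and the boolean mask `df.columns == name` is used instead of `df[name]` so that a name appearing only once still yields a DataFrame rather than a Series (on which `mean(axis=1)` would fail).

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=['a', 'a', 'a', 'b', 'b', 'b'])

# For each distinct name, select all columns carrying it and average them row-wise
result = pd.DataFrame({
    name: df.loc[:, df.columns == name].mean(axis=1)
    for name in sorted(set(df.columns))
})
print(result)
```

For the sample frame this gives one row with a = 2.0 and b = 5.0.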
