Aggregating rows in pandas based on column values

Posted 2024-09-27 19:10:22


I have data in the following format:

Company     Region  Category    Metric    Year  Month   Value
Industry    Total   NARTD   Sales Value   2017  Jan     1.448129e+09
Industry    Total   NARTD   Sales Volume  2017  Jan     3.573664e+08
Industry    Total   NARTD   Sales Value   2018  Jan     1.422279e+09
Industry    Total   NARTD   Sales Volume  2018  Jan     3.492432e+08

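I want to add another column at the end that holds Sales Value / Sales Volume for every row whose other columns match, except the year. Sales Value and Sales Volume need to be combined within the same year.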

Output:

[output table missing from the source]

Case:

Region  Category    Company     Metric      Year    Month   Value
Convenience     NARTD   TCC     Sales Value 2018    Dec     NaN
Traditional     NARTD   TCC     Sales Value 2018    Dec     NaN
Total           NARTD   TCC     Sales Value 2018    Dec     NaN
Hyper/Super     NARTD   TCC     Sales Value 2018    Dec     NaN

1 answer
User
#1 · Posted 2024-09-27 19:10:22

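IIUC, the simplest approach is to pivot first so that both metrics for a given company, region, category, year and month sit on one row, then divide, then melt to get back to the shape of the original frame: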

# Assumes df is the DataFrame shown in the question.
keys = ['Company', 'Region', 'Category', 'Year', 'Month']

# Put Sales Value and Sales Volume side by side, one row per key combination.
piv = df.pivot_table(index=keys, columns='Metric', values='Value').reset_index()
piv['AVG'] = piv['Sales Value'] / piv['Sales Volume']

# Melt back to long format; AVG is kept as an id column so it repeats on every row.
piv.melt(id_vars=keys + ['AVG'])

#     Company Region Category  ...       AVG        Metric         value
# 0  Industry  Total    NARTD  ...  4.052225   Sales Value  1.448129e+09
# 1  Industry  Total    NARTD  ...  4.072460   Sales Value  1.422279e+09
# 2  Industry  Total    NARTD  ...  4.052225  Sales Volume  3.573664e+08
# 3  Industry  Total    NARTD  ...  4.072460  Sales Volume  3.492432e+08
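
If you would rather leave the original rows and their order untouched, a merge-based variant may be easier to reason about. The following is only a sketch, not part of the approach above: the helper name `metric_total` is made up for illustration, the frame is rebuilt by hand from the sample rows in the question, and one of the NaN rows from the second table is included on the assumption that it lives in the same frame.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, plus one NaN row from the second table
# (assumed to belong to the same frame).
df = pd.DataFrame({
    'Company':  ['Industry', 'Industry', 'Industry', 'Industry', 'TCC'],
    'Region':   ['Total', 'Total', 'Total', 'Total', 'Convenience'],
    'Category': ['NARTD'] * 5,
    'Metric':   ['Sales Value', 'Sales Volume',
                 'Sales Value', 'Sales Volume', 'Sales Value'],
    'Year':     [2017, 2017, 2018, 2018, 2018],
    'Month':    ['Jan', 'Jan', 'Jan', 'Jan', 'Dec'],
    'Value':    [1.448129e+09, 3.573664e+08,
                 1.422279e+09, 3.492432e+08, np.nan],
})

keys = ['Company', 'Region', 'Category', 'Year', 'Month']


def metric_total(frame, metric):
    """Sum Value per key group for one metric (hypothetical helper).

    min_count=1 keeps a group whose values are all NaN as NaN instead of 0.
    """
    rows = frame[frame['Metric'] == metric]
    return rows.groupby(keys)['Value'].sum(min_count=1)


# Series division aligns on the group keys; a group missing either metric
# ends up as NaN.
ratio = metric_total(df, 'Sales Value') / metric_total(df, 'Sales Volume')
ratio = ratio.rename('AVG').reset_index()

# A left merge attaches the ratio to every original row without reshaping.
result = df.merge(ratio, on=keys, how='left')
print(result[['Company', 'Year', 'Metric', 'AVG']])
```

Because nothing is reshaped, the NaN row simply gets NaN in the new AVG column, and both metric rows of a given year carry the same ratio.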
