Computing ratios per group on a time-series dataset

Posted 2024-06-14 08:41:57


A few days ago I asked this question here: Groupby id to calculate ratios

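It turned out that the suggested answer did not compute the required ratios for each unique combination of id and datadate. Any help with achieving this would be great! The suggested answer was: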

df = df.groupby(by=[['id', 'datadate']], as_index=False).sum()

Goal

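I have this df, and some ratios below. I want to compute these ratios for each id and datadate. I believe groupby is the way to go, but I'm not quite sure. Any help would be great.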

df

        id    datadate  dltt    ceq  ...      pstk  icapt   dlc      sale
 1  001004  1975-02-28   3.0  193.0  ...  1.012793      1  0.20  7.367237
 2  001004  1975-05-31   4.0  197.0  ...  1.249831      1  0.21  8.982741
 3  001004  1975-08-31   5.0  174.0  ...  1.142086      2  0.24  8.115609
 4  001004  1975-11-30   8.0  974.0  ...  1.400673      3  0.26  9.944990
 5  001005  1975-02-28   3.0  191.0  ...  1.012793      4  0.25  7.367237
 6  001005  1975-05-31   3.0  971.0  ...  1.249831      5  0.26  8.982741
 7  001005  1975-08-31   2.0  975.0  ...  1.142086      6  0.27  8.115609
 8  001005  1975-11-30   1.0  197.0  ...  1.400673      3  0.27  9.944990
 9  001006  1975-02-28   3.0  974.0  ...  1.012793      2  0.28  7.367237
10  001006  1975-05-31   4.0   74.0  ...  1.249831      1  0.21  8.982741
11  001006  1975-08-31   5.0   75.0  ...  1.142086      3  0.23  8.115609
12  001006  1975-11-30   5.0  197.0  ...  1.400673      4  0.24  9.944990

Example ratios

df['capital_ratioa'] = df['dltt']/(df['dltt']+df['ceq']+df['pstk'])
df['equity_invcapa'] = df['ceq']/df['icapt']
df['debt_invcapa'] = df['dltt']/df['icapt']
df['sale_invcapa']=df['sale']/df['icapt']
df['totdebt_invcapa']=(df['dltt']+df['dlc'])/df['icapt'] 
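
Note that the formulas above are element-wise, so each one already yields one value per row. Whether a groupby is needed at all depends on whether any (id, datadate) pair appears more than once. A minimal sketch of that check, assuming a small hypothetical frame `demo` built from the first rows of the table (only two of the numeric columns are included):

```python
import pandas as pd

# Hypothetical sample: a few rows copied from the table above.
demo = pd.DataFrame({
    "id": ["001004", "001004", "001005"],
    "datadate": pd.to_datetime(["1975-02-28", "1975-05-31", "1975-02-28"]),
    "dltt": [3.0, 4.0, 3.0],
    "icapt": [1, 1, 4],
})

# True if any (id, datadate) pair occurs more than once.
has_duplicates = demo.duplicated(subset=["id", "datadate"]).any()

# Element-wise division: one ratio per row.
demo["debt_invcapa"] = demo["dltt"] / demo["icapt"]
```

If `has_duplicates` is False, the row-wise ratios are already per id and datadate. If it is True, the rows would need to be aggregated before dividing.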

1 answer
User
#1 · Posted 2024-06-14 08:41:57

You can group by the id and datadate columns:

In [1956]: res = df.groupby(['id', 'datadate'], as_index=False).sum()

In [1952]: res['capital_ratioa'] = res['dltt']/(res['dltt'] + res['ceq'] + res['pstk'])

In [1954]: res['equity_invcapa'] = res['ceq']/res['icapt']

The same goes for the other columns.

In [1955]: res
Out[1955]: 
      id   datadate  dltt    ceq      pstk  icapt   dlc      sale  capital_ratioa  equity_invcapa
0   1004 1975-02-28   3.0  193.0  1.012793      1  0.20  7.367237        0.015227      193.000000
1   1004 1975-05-31   4.0  197.0  1.249831      1  0.21  8.982741        0.019778      197.000000
2   1004 1975-08-31   5.0  174.0  1.142086      2  0.24  8.115609        0.027756       87.000000
3   1004 1975-11-30   8.0  974.0  1.400673      3  0.26  9.944990        0.008135      324.666667
4   1005 1975-02-28   3.0  191.0  1.012793      4  0.25  7.367237        0.015384       47.750000
5   1005 1975-05-31   3.0  971.0  1.249831      5  0.26  8.982741        0.003076      194.200000
6   1005 1975-08-31   2.0  975.0  1.142086      6  0.27  8.115609        0.002045      162.500000
7   1005 1975-11-30   1.0  197.0  1.400673      3  0.27  9.944990        0.005015       65.666667
8   1006 1975-02-28   3.0  974.0  1.012793      2  0.28  7.367237        0.003067      487.000000
9   1006 1975-05-31   4.0   74.0  1.249831      1  0.21  8.982741        0.050473       74.000000
10  1006 1975-08-31   5.0   75.0  1.142086      3  0.23  8.115609        0.061620       25.000000
11  1006 1975-11-30   5.0  197.0  1.400673      4  0.24  9.944990        0.024582       49.250000
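
For completeness, here is a self-contained sketch that applies the same pattern to all five ratios from the question. The frame `sample` is an assumption: it holds only four rows copied from the question's table, and the columns hidden behind `...` there are left out.

```python
import pandas as pd

# Hypothetical sample: four rows copied from the question's table.
sample = pd.DataFrame({
    "id": ["001004", "001004", "001005", "001005"],
    "datadate": pd.to_datetime(
        ["1975-02-28", "1975-05-31", "1975-02-28", "1975-05-31"]
    ),
    "dltt": [3.0, 4.0, 3.0, 3.0],
    "ceq": [193.0, 197.0, 191.0, 971.0],
    "pstk": [1.012793, 1.249831, 1.012793, 1.249831],
    "icapt": [1, 1, 4, 5],
    "dlc": [0.20, 0.21, 0.25, 0.26],
    "sale": [7.367237, 8.982741, 7.367237, 8.982741],
})

# Sum the numeric columns within each (id, datadate) group.
res = sample.groupby(["id", "datadate"], as_index=False).sum()

# Compute every ratio from the aggregated columns.
res["capital_ratioa"] = res["dltt"] / (res["dltt"] + res["ceq"] + res["pstk"])
res["equity_invcapa"] = res["ceq"] / res["icapt"]
res["debt_invcapa"] = res["dltt"] / res["icapt"]
res["sale_invcapa"] = res["sale"] / res["icapt"]
res["totdebt_invcapa"] = (res["dltt"] + res["dlc"]) / res["icapt"]
```

The ratios are computed after the aggregation on purpose: if a group ever holds more than one row, the ratio of the sums is generally not the same as the sum of the row-wise ratios.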
