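<p>I would create a new column, <code>New</code>, that copies <code>revenue</code> but is set to <code>NaN</code> wherever <code>cohort_period</code> is 0, and then take the cumulative sum per account:</p>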
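<pre><code>import numpy as np

df['New'] = df['revenue']
# Blank out the period-0 rows so they are left out of the running total
df.loc[df['cohort_period'] == 0, 'New'] = np.nan
df['cumulative_revenue'] = df.groupby('account_id')['New'].cumsum()
df
Out[63]: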
   account_id  cohort_period               company  revenue     New  cumulative_revenue
0         111              0               initech     3.67     NaN                 NaN
1         111              1               initech     9.95    9.95                9.95
2         111              2               initech     9.95    9.95               19.90
3         222              0  jackson steinem & co   193.29     NaN                 NaN
4         222              1  jackson steinem & co   299.95  299.95              299.95
5         333              0                 ingen    83.03     NaN                 NaN
6         333              1                 ingen   499.95  499.95              499.95
7         333              2                 ingen    99.95   99.95              599.90
8         666              0                 enron     1.52     NaN                 NaN
9         666              1                 enron    19.95   19.95               19.95
</code></pre>
<p>Or use <code>mask</code>, which avoids the helper column:</p>
<pre><code>df.groupby('account_id').apply(lambda x: x['revenue'].mask(x['cohort_period'].eq(0)).cumsum())
</code></pre>
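<p>For a self-contained run, here is a minimal sketch of a variant of the <code>mask</code> idea that skips <code>apply</code>: it masks the whole <code>revenue</code> column first and then groups the masked Series by <code>account_id</code>. This is a swapped-in formulation, not the exact call above. The sample frame is rebuilt from the output shown earlier, and the intermediate name <code>masked</code> is only illustrative.</p>

```python
import pandas as pd

# Sample data reconstructed from the output shown above
df = pd.DataFrame({
    'account_id': [111, 111, 111, 222, 222, 333, 333, 333, 666, 666],
    'cohort_period': [0, 1, 2, 0, 1, 0, 1, 2, 0, 1],
    'company': (['initech'] * 3 + ['jackson steinem & co'] * 2
                + ['ingen'] * 3 + ['enron'] * 2),
    'revenue': [3.67, 9.95, 9.95, 193.29, 299.95,
                83.03, 499.95, 99.95, 1.52, 19.95],
})

# Replace period-0 revenue with NaN (illustrative intermediate name)
masked = df['revenue'].mask(df['cohort_period'].eq(0))

# Group the masked Series by the account_id column and accumulate;
# the result keeps the original row index, so it can be assigned directly
df['cumulative_revenue'] = masked.groupby(df['account_id']).cumsum()

print(df)
```

<p>Because the grouped <code>cumsum</code> here returns a Series on the same index as <code>df</code>, it can be stored as a column without any re-indexing step.</p>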