<p>You can do this by combining <code>groupby</code> and <code>resample</code>. To use <code>resample</code>, the dates need to be the index:</p>
<pre><code>import pandas as pd
# resample needs a DatetimeIndex, so move the Date column into the index
df.index = pd.to_datetime(df.Date)
df.drop('Date', axis=1, inplace=True)
</code></pre>
<p>Then:</p>
^{pr2}$
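<p>In this example I used a 6-month period. Note that each resampled date falls on the last day of its month; I hope that is not a problem.
You will then get:</p>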
<pre><code>    Employee       Date   Salary
0    PersonA 2016-01-31   $50000
1    PersonB 2014-03-31   $65000
2    PersonB 2014-09-30   $65000
3    PersonB 2015-03-31   $75000
4    PersonB 2015-09-30   $75000
5    PersonB 2016-03-31  $100000
6    PersonC 2010-05-31   $75000
7    PersonC 2010-11-30   $75000
8    PersonC 2011-05-31   $75000
9    PersonC 2011-11-30  $100000
10   PersonC 2012-05-31  $110000
11   PersonC 2012-11-30  $130000
12   PersonC 2013-05-31  $150000
13   PersonC 2013-11-30  $150000
14   PersonC 2014-05-31  $200000
</code></pre>
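<p>For reference, here is a minimal sketch of how a <code>groupby</code> plus <code>resample</code> step could produce output of this shape. The sample rows, the numeric salaries, the use of <code>pd.offsets.MonthEnd(6)</code> as the 6-month rule, and the choice of <code>last()</code> followed by a forward fill are all assumptions made for illustration, not necessarily what the answer used:</p>

```python
import pandas as pd

# Hypothetical input: one row per salary change (illustrative data only).
df = pd.DataFrame({
    "Employee": ["PersonA", "PersonB", "PersonB", "PersonB"],
    "Date": ["2016-01-10", "2014-03-15", "2015-03-10", "2016-03-20"],
    "Salary": [50000, 65000, 75000, 100000],
})

# resample needs a DatetimeIndex.
df.index = pd.to_datetime(df["Date"])
df = df.drop("Date", axis=1)

# Assumed rule: 6-month bins labelled by month end.
six_months = pd.offsets.MonthEnd(6)

out = (
    df.groupby("Employee")["Salary"]
      .resample(six_months)
      .last()          # last known salary inside each bin (NaN if the bin is empty)
      .reset_index()
)

# Carry the previous salary forward into empty bins, per employee.
out["Salary"] = out.groupby("Employee")["Salary"].ffill()
print(out)
```

<p>Because empty bins are filled from the previous bin, an employee whose salary changes once a year still gets one row every six months.</p>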
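<p>Now you can create the "Months since started" column (the <code>cumcount</code> function numbers each row by its position within its group, starting at 0). Remember to multiply it by the number of months in each period (6 in this example):</p>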
<pre><code>df['Months since started'] = df.groupby('Employee').cumcount() * 6

    Employee       Date   Salary  Months since started
0    PersonA 2016-01-31   $50000                     0
1    PersonB 2014-03-31   $65000                     0
2    PersonB 2014-09-30   $65000                     6
3    PersonB 2015-03-31   $75000                    12
4    PersonB 2015-09-30   $75000                    18
5    PersonB 2016-03-31  $100000                    24
6    PersonC 2010-05-31   $75000                     0
7    PersonC 2010-11-30   $75000                     6
8    PersonC 2011-05-31   $75000                    12
9    PersonC 2011-11-30  $100000                    18
10   PersonC 2012-05-31  $110000                    24
11   PersonC 2012-11-30  $130000                    30
12   PersonC 2013-05-31  $150000                    36
13   PersonC 2013-11-30  $150000                    42
14   PersonC 2014-05-31  $200000                    48
</code></pre>
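<p>Note that <code>cumcount</code> relies on the rows being sorted by date within each employee and on there being no gaps between periods. As a sanity check, the same column can be derived from the dates themselves. The sketch below uses a small hypothetical frame (the names <code>resampled</code>, <code>PERIOD_MONTHS</code> and <code>from_dates</code> are illustrative) and compares both approaches:</p>

```python
import pandas as pd

# Hypothetical frame: already one row per employee per 6-month period.
resampled = pd.DataFrame({
    "Employee": ["PersonA", "PersonB", "PersonB", "PersonB"],
    "Date": pd.to_datetime(["2016-01-31", "2014-03-31", "2014-09-30", "2015-03-31"]),
    "Salary": [50000, 65000, 65000, 75000],
})

PERIOD_MONTHS = 6  # months per resampling period

# Approach from the answer: position within the group times the period length.
resampled["Months since started"] = (
    resampled.groupby("Employee").cumcount() * PERIOD_MONTHS
)

# Cross-check: whole months between each row's date and the employee's first date.
first = resampled.groupby("Employee")["Date"].transform("min")
from_dates = (
    (resampled["Date"].dt.year - first.dt.year) * 12
    + (resampled["Date"].dt.month - first.dt.month)
)
print(resampled.assign(from_dates=from_dates))
```

<p>If the two columns ever disagree, the frame is probably unsorted or missing a period for some employee.</p>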
<p>Hope that helps!</p>