<p>I think Rafael's answer needs a small modification here, forward filling within each group:</p>
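<pre><code># Number each user's distinct products in order of first appearance,
# then forward-fill per user so repeated rows inherit the latest count.
df['uniq_ebe'] = (df.drop_duplicates(['user', 'prod'])
                    .groupby('user')['prod']
                    .cumcount()
                    .add(1)
                    .reindex(df.index)
                    .groupby(df['user'])
                    .ffill()
                    .astype(int))
print(df)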
  user  time prod  uniq_ebe
0    a   1.0    k         1
1    a   1.1    k         1
2    b   1.2    t         1
3    a   1.2    t         2
4    b   1.3    y         2
5    a   1.3    k         2
6    a   1.3    z         3
7    b   1.3    x         3
</code></pre>
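<p>To see why the fill has to happen per group, here is a minimal, self-contained sketch. The first frame is rebuilt from the printed output above. The helper name <code>running_unique_count</code> and the second, four-row frame are hypothetical and exist only for illustration; they are not part of the original answer.</p>

```python
import pandas as pd


def running_unique_count(df):
    """Running count of distinct products per user (hypothetical helper)."""
    # Rank each user's products by first appearance, using first occurrences only.
    first_seen = (df.drop_duplicates(['user', 'prod'])
                    .groupby('user')['prod']
                    .cumcount()
                    .add(1))
    # Realign to the full index: repeated (user, prod) rows become NaN,
    # then each NaN takes the previous value from the same user.
    return (first_seen.reindex(df.index)
                      .groupby(df['user'])
                      .ffill()
                      .astype(int))


# Sample data reconstructed from the printed output above.
df = pd.DataFrame({
    'user': ['a', 'a', 'b', 'a', 'b', 'a', 'a', 'b'],
    'time': [1.0, 1.1, 1.2, 1.2, 1.3, 1.3, 1.3, 1.3],
    'prod': ['k', 'k', 't', 't', 'y', 'k', 'z', 'x'],
})
df['uniq_ebe'] = running_unique_count(df)
print(df['uniq_ebe'].tolist())  # [1, 1, 1, 2, 2, 2, 3, 3]

# Hypothetical frame where a plain (ungrouped) ffill gives a wrong answer:
# row 3 repeats (a, k) right after a row that belongs to user b.
small = pd.DataFrame({
    'user': ['a', 'a', 'b', 'a'],
    'prod': ['k', 't', 'x', 'k'],
})
sparse = (small.drop_duplicates(['user', 'prod'])
               .groupby('user')['prod']
               .cumcount()
               .add(1)
               .reindex(small.index))

plain = sparse.ffill().astype(int)            # row 3 copies b's count
per_user = running_unique_count(small)        # row 3 copies a's own count
print(plain.tolist())     # [1, 2, 1, 1]
print(per_user.tolist())  # [1, 2, 1, 2]
```

<p>In the second frame, user <code>a</code> has already seen two products by row 3, so the grouped fill gives 2, while the plain fill leaks the value 1 from user <code>b</code>. On the original sample data the two approaches happen to agree, which is why the difference is easy to miss.</p>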