<p>You can write a function to do this:</p>
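<pre><code>import numpy as np

def uniqueID(x):
    # Run-length encode x: i holds the last index of each run.
    y = x[1:] != x[:-1]
    i = np.r_[np.where(y)[0], x.size - 1]
    run_len, vals = np.diff(np.r_[-1, i]), x[i]
    # Number the runs of each distinct value 1, 2, 3, ...
    _, counts = np.unique(vals, return_counts=True)
    seq = np.concatenate([np.arange(1, c + 1) for c in counts])
    # A stable sort keeps equal values in order of appearance.
    rank = vals.argsort(kind='stable').argsort()
    # Expand the per-run numbers back to one value per row.
    return np.repeat(seq[rank], run_len)

df.assign(new=uniqueID(df.user.values)).sort_values('user')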
Out:
  user  time prod  new
0    a   1.0    k    1
1    a   1.1    k    1
3    a   1.2    t    2
5    a   1.4    z    3
2    b   1.2    t    1
4    b   1.3    y    2
6    b   1.4    x    3
</code></pre>
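<p>To make the function easier to follow, here is a step-by-step sketch of the same computation with each intermediate array shown. The sample array <code>x</code> is an assumption, chosen to mirror the <code>user</code> column implied by the output above; the variable names are illustrative only.</p>

```python
import numpy as np

# Assumed sample input, mirroring the `user` column in the output above.
x = np.array(['a', 'a', 'b', 'a', 'b', 'a', 'b'])

# Step 1: run-length encode. `ends` holds the last index of each run.
change = x[1:] != x[:-1]
ends = np.r_[np.where(change)[0], x.size - 1]
run_len = np.diff(np.r_[-1, ends])   # [2 1 1 1 1 1]
vals = x[ends]                       # ['a' 'b' 'a' 'b' 'a' 'b']

# Step 2: number the runs of each value in order of appearance.
_, counts = np.unique(vals, return_counts=True)
seq = np.concatenate([np.arange(1, c + 1) for c in counts])  # [1 2 3 1 2 3]
rank = vals.argsort(kind='stable').argsort()                 # [0 3 1 4 2 5]
per_run = seq[rank]                                          # [1 1 2 2 3 3]

# Step 3: expand back to one label per row.
labels = np.repeat(per_run, run_len)
print(labels)  # [1 1 1 2 2 3 3]
```

<p>The double <code>argsort</code> turns each run into its rank among all runs sorted by value, which is why the first sort has to be stable: ties between runs of the same value must keep their original order.</p>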
<p>This function operates on the <code>user</code> column alone:</p>
<pre><code> a.assign(new=uniqueID(a.user.values))
Out[460]:
user new
0 2 1
1 2 1
2 2 1
3 1 1
4 1 1
5 1 1
6 3 1
7 3 1
8 1 2
9 2 2
10 2 2
11 1 3
12 4 1
13 3 2
14 3 2
15 1 4
16 1 4
17 3 3
18 2 3
</code></pre>
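<p>For comparison, the same numbering can probably be obtained with pandas alone. This is a different technique from the NumPy function above, not a rewrite of it: it flags the start of every run and takes a cumulative sum of those flags within each value. The helper name <code>run_occurrence</code> is hypothetical, and the sketch assumes the column has no missing values.</p>

```python
import pandas as pd

def run_occurrence(s):
    """Label each row with the ordinal of its run among runs of the same value.

    Hypothetical helper; assumes `s` contains no missing values.
    """
    # 1 where a new run begins (value differs from the previous row), else 0.
    starts = s.ne(s.shift()).astype(int)
    # Cumulative count of run starts, computed separately for each value.
    return starts.groupby(s).cumsum()

# Sample data copied from the `user` column of the output above.
a = pd.DataFrame({'user': [2, 2, 2, 1, 1, 1, 3, 3, 1, 2,
                           2, 1, 4, 3, 3, 1, 1, 3, 2]})
result = a.assign(new=run_occurrence(a['user']))
print(result['new'].tolist())
```

<p>On this sample the result agrees with the <code>new</code> column shown above. The NumPy version may still be faster on large arrays, since it avoids the groupby.</p>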