<p>Use <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.factorize.html" rel="nofollow noreferrer"><code>factorize</code></a> and take the length of the second array it returns (the unique values):</p>
<pre><code>a = df.apply(lambda x: len(pd.factorize(x)[1]))
print(a)
0    5
1    4
2    5
dtype: int64
</code></pre>
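<p>For readers without the original <code>df</code>, here is a minimal self-contained sketch. The sample data is an assumption (it is not from the question), chosen only so that the per-column counts match the output above:</p>

```python
import pandas as pd

# Hypothetical sample data (an assumption): 5 rows, 3 columns labeled 0, 1, 2.
df = pd.DataFrame([
    ["a", "x", 10],
    ["b", "x", 20],
    ["c", "y", 30],
    ["d", "z", 40],
    ["e", "w", 50],
])

# pd.factorize returns a pair:
#   codes   - one integer per row, numbered by order of first appearance
#   uniques - the distinct values, in that same order
codes, uniques = pd.factorize(df[1])
print(codes)          # [0 0 1 2 3]
print(list(uniques))  # ['x', 'y', 'z', 'w']

# Length of the uniques array = number of distinct values in each column.
counts = df.apply(lambda col: len(pd.factorize(col)[1]))
print(counts)
```

<p>Element <code>[1]</code> of the returned pair is the array of unique values, so its length is the distinct-value count for that column.</p>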
<p>For the integer codes:</p>
<pre><code>b = df.apply(lambda x: pd.factorize(x)[0])
print(b)
   0  1  2
0  0  0  0
1  1  0  1
2  2  1  2
3  3  2  3
4  4  3  4
</code></pre>
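<p>One caveat worth keeping in mind, sketched here on a made-up Series (the data is an assumption): with default arguments, <code>factorize</code> gives missing values the code <code>-1</code> and leaves them out of the unique values, so they do not add to the count.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical Series containing one missing value.
s = pd.Series(["x", np.nan, "y", "x"])

codes, uniques = pd.factorize(s)
print(codes)          # [ 0 -1  1  0]  (NaN is coded as -1)
print(list(uniques))  # ['x', 'y']     (NaN is not among the uniques)
print(len(uniques))   # 2
```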
<p>To avoid calling <code>factorize</code> twice per column, record the counts while computing the codes:</p>
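<pre><code>out = {}
def f(x):
    # factorize once: a holds the integer codes, b the unique values
    a, b = pd.factorize(x)
    # store the number of unique values under the column label
    out[x.name] = len(b)
    return a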
b = df.apply(f)
print(b)
   0  1  2
0  0  0  0
1  1  0  1
2  2  1  2
3  3  2  3
4  4  3  4
a = pd.Series(out)
print(a)
0    5
1    4
2    5
dtype: int64
</code></pre>
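<p>A possible variant of the same idea, shown only as a sketch: factorize each column once in a dict comprehension and build both results from it, which avoids mutating a module-level dict from inside <code>apply</code>. The sample <code>df</code> and the names <code>results</code>, <code>codes</code>, and <code>counts</code> are assumptions for illustration, not part of the original answer.</p>

```python
import pandas as pd

# Hypothetical sample data (an assumption): 5 rows, 3 columns labeled 0, 1, 2.
df = pd.DataFrame([
    ["a", "x", 10],
    ["b", "x", 20],
    ["c", "y", 30],
    ["d", "z", 40],
    ["e", "w", 50],
])

# Factorize every column exactly once and keep both parts of each result.
results = {name: pd.factorize(col) for name, col in df.items()}

# Integer codes, one column per original column.
codes = pd.DataFrame(
    {name: col_codes for name, (col_codes, _) in results.items()},
    index=df.index,
)

# Number of unique values per column.
counts = pd.Series({name: len(uniques) for name, (_, uniques) in results.items()})

print(codes)
print(counts)
```

<p>Both approaches do one <code>factorize</code> call per column; which one reads better is mostly a matter of taste.</p>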