<p><strong>Loops are very slow.</strong> You should use <a href="https://numpy.org/doc/stable/reference/generated/numpy.select.html" rel="nofollow noreferrer"><code>numpy.select</code></a> here instead:</p>
<pre><code>In [1577]: import numpy as np
In [1578]: conditions = [df.activityType == 'joinClub', df.activityType == 'post', df.activityType == 'followuser']
In [1579]: choices = [1, 2, 3]
In [1580]: df['activity_preferance'] = np.select(conditions, choices)
In [1581]: df
Out[1581]:
           activityType  activity_preferance
userID
agashi1996     joinClub                    1
agashi1998         post                    2
agashi1998         post                    2
agashi1998         post                    2
agashi1994   followuser                    3
</code></pre>
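<p>For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. The sample frame is an assumption that mirrors the output above, with one extra hypothetical row (<code>'comment'</code>) added to show what happens when no condition matches: <code>np.select</code> then falls back to its <code>default</code> argument, which is <code>0</code> if you leave it out.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the frame above, plus one
# unmatched value ('comment') to illustrate the `default` argument.
df = pd.DataFrame(
    {"activityType": ["joinClub", "post", "post", "followuser", "comment"]},
    index=pd.Index(
        ["agashi1996", "agashi1998", "agashi1998", "agashi1994", "agashi1995"],
        name="userID",
    ),
)

# One boolean mask per category, in the same order as `choices`.
conditions = [
    df["activityType"] == "joinClub",
    df["activityType"] == "post",
    df["activityType"] == "followuser",
]
choices = [1, 2, 3]

# Rows that match none of the conditions receive `default`.
df["activity_preferance"] = np.select(conditions, choices, default=-1)
print(df)
```

<p>Picking an explicit <code>default</code> such as <code>-1</code> makes unmapped categories easy to spot, which may or may not be what you want for your data.</p>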
<h3>Performance comparison with the other solutions:</h3>
<p>My solution:</p>
<pre><code>In [1582]: %timeit np.select(conditions, choices)
45.5 µs ± 1.84 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
</code></pre>
<p>@Djib2011's solution:</p>
<pre><code>In [1584]: %timeit df['activityType'].map(mapping)
401 µs ± 5.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
</code></pre>
<p>@JenilDave's solution:</p>
<pre><code>In [1590]: %timeit df.activityType.replace({'joinClub':1,'post':2,'followuser':3})
490 µs ± 20.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
</code></pre>
<p>@yashjain's solution:</p>
<pre><code>In [1585]: %timeit df['activityType'].apply(lambda x: 1 if x=='joinClub' else None)
114 µs ± 1.56 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
</code></pre>
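<p>The timings above come from a five-row frame, and the first one measures only the <code>np.select</code> call with the conditions already built. If you want to check how the approaches behave on your own data, a rough sketch like the following may help. The frame size, the random data, and the helper names <code>with_select</code> and <code>with_map</code> are all assumptions made for illustration, and <code>mapping</code> is assumed to be the same dictionary used in the <code>replace</code> example. Results will vary by machine and library version.</p>

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame: 100,000 rows drawn from the three categories.
rng = np.random.default_rng(0)
labels = np.array(["joinClub", "post", "followuser"])
big = pd.DataFrame({"activityType": rng.choice(labels, size=100_000)})

# Assumed to match the dictionary used in the replace() example.
mapping = {"joinClub": 1, "post": 2, "followuser": 3}


def with_select(frame):
    """Build the boolean masks and call np.select (mask building included)."""
    col = frame["activityType"]
    conditions = [col == key for key in mapping]
    return np.select(conditions, list(mapping.values()))


def with_map(frame):
    """Look each value up in the dictionary with Series.map."""
    return frame["activityType"].map(mapping).to_numpy()


# Both approaches should produce identical codes before we compare speed.
assert (with_select(big) == with_map(big)).all()

for fn in (with_select, with_map):
    seconds = timeit.timeit(lambda: fn(big), number=5) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```

<p>Including the mask construction inside <code>with_select</code> is a deliberate choice here, since that work has to happen somewhere in real code.</p>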