<p>Use <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html" rel="nofollow noreferrer"><code>groupby</code></a> and <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.DataFrameGroupBy.filter.html" rel="nofollow noreferrer"><code>filter</code></a>. The combination is concise and is designed for exactly this purpose.</p>
<pre><code># Keep only the rows of users that appear more than 100 times
df1 = df.groupby('user_id').filter(lambda x: len(x) > 100)
</code></pre>
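<p>As a minimal sketch of how this behaves, the example below runs the same call on a small made-up frame. The data, the <code>value</code> column, and the <code>threshold</code> variable are assumptions for illustration only; the threshold is lowered from 100 to 2 so the effect is visible on six rows.</p>

```python
import pandas as pd

# Hypothetical toy data (not from the original question).
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "value": [10, 20, 30, 40, 50, 60],
})

# Illustrative threshold; the answer above uses 100.
threshold = 2

# filter() calls the lambda once per group and keeps every row of the
# groups for which it returns True, preserving the original row order.
df1 = df.groupby("user_id").filter(lambda x: len(x) > threshold)
print(df1)
```

<p>Only user 1 has more than two rows here, so only that user's rows remain.</p>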
<hr/>
<p>For better performance, use <a href="https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.unique.html" rel="nofollow noreferrer"><code>np.unique</code></a> and <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html" rel="nofollow noreferrer"><code>map</code></a>:</p>
<pre><code>import numpy as np
# Map each user_id to its row count, then keep rows whose count exceeds 100
m = dict(zip(*np.unique(df.user_id, return_counts=True)))
df[df['user_id'].map(m) > 100]
</code></pre>
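<p>The sketch below applies the <code>np.unique</code> and <code>map</code> approach to a small made-up frame and checks that it selects the same rows as <code>groupby</code> plus <code>filter</code>. The data, the <code>value</code> column, and the names <code>threshold</code>, <code>mask</code>, <code>df2</code>, and <code>expected</code> are assumptions for illustration; it does not measure performance.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the original question).
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "value": [10, 20, 30, 40, 50, 60],
})

# Illustrative threshold; the answer above uses 100.
threshold = 2

# np.unique(..., return_counts=True) returns (unique values, counts);
# zipping the two arrays builds a user_id -> row count dictionary.
m = dict(zip(*np.unique(df["user_id"], return_counts=True)))

# map() replaces each user_id with its count, giving a boolean row mask.
mask = df["user_id"].map(m) > threshold
df2 = df[mask]

# Cross-check against the groupby + filter version.
expected = df.groupby("user_id").filter(lambda x: len(x) > threshold)
print(df2.equals(expected))
```

<p>The mask-based version avoids calling a Python function once per group, which is the usual reason it runs faster on frames with many distinct users.</p>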