<p>Use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.ngroup.html" rel="nofollow noreferrer"><code>ngroup</code></a> (available in pandas <code>0.20.2</code> and later) together with <a href="http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing" rel="nofollow noreferrer"><code>boolean indexing</code></a>:</p>
<pre><code>df = df.sort_values(5)
print(df.groupby(5).ngroup())
0    0
1    0
4    1
2    2
3    2
dtype: int64
df = df[df.groupby(5).ngroup() < 2]
print(df)
       0      1       2          3   4           5                      6
0  35000  26009  OPTIDX  BANKNIFTY  XX  1499351400  BANKNIFTY1770621000CE
1  35001  26009  OPTIDX  BANKNIFTY  XX  1499351400  BANKNIFTY1770621000PE
4  35004  26009  OPTIDX  BANKNIFTY  XX  1499956200  BANKNIFTY1771321100CE
</code></pre>
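<p>For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The column names <code>id</code> and <code>expiry</code>, the values, and the helper <code>keep_first_n_groups</code> are all made up for illustration; only <code>groupby</code>, <code>ngroup</code> and the boolean mask come from the answer above.</p>

```python
import pandas as pd


def keep_first_n_groups(frame, col, n):
    """Keep the rows whose value in `col` is among the n smallest distinct values.

    Hypothetical helper, not part of pandas.
    """
    # With the default groupby(sort=True), ngroup() numbers the groups
    # 0, 1, 2, ... in sorted key order and returns one number per row.
    group_no = frame.groupby(col).ngroup()
    # Boolean indexing: keep rows whose group number is below n.
    return frame[group_no < n]


# Invented sample data: three distinct "expiry" values, not sorted.
df = pd.DataFrame({
    "id": [10, 11, 12, 13, 14],
    "expiry": [100, 100, 300, 300, 200],
})

result = keep_first_n_groups(df, "expiry", 2)
print(result)
```

<p>In this sketch the rows with the largest <code>expiry</code> (ids 12 and 13) are dropped, and the remaining rows keep their original order and index labels.</p>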
<p>For older versions of pandas, a small hack works: the group numbers are stored in the <code>grouper.group_info</code> object, so select the first array with <code>[0]</code>:</p>
^{pr2}$
<p>Alternative solution with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.factorize.html" rel="nofollow noreferrer"><code>factorize</code></a>:</p>
<pre><code>df = df.sort_values(5)
df = df[pd.factorize(df[5])[0] < 2]
print(df)
       0      1       2          3   4           5                      6
0  35000  26009  OPTIDX  BANKNIFTY  XX  1499351400  BANKNIFTY1770621000CE
1  35001  26009  OPTIDX  BANKNIFTY  XX  1499351400  BANKNIFTY1770621000PE
4  35004  26009  OPTIDX  BANKNIFTY  XX  1499956200  BANKNIFTY1771321100CE
</code></pre>
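<p>One detail worth knowing about the <code>factorize</code> variant: by default it numbers values in order of first appearance, which is why the frame is sorted first. As a sketch of a possible variation (not something the answer above uses), passing <code>sort=True</code> should give codes in sorted order without sorting the frame. The sample data and column names below are invented for illustration.</p>

```python
import pandas as pd

# Invented sample data where the largest value appears first.
df = pd.DataFrame({
    "id": [10, 11, 12, 13, 14],
    "expiry": [300, 300, 100, 100, 200],
})

# Default behaviour: codes follow order of first appearance,
# so 300 receives code 0 even though it is the largest value.
codes_by_appearance, _ = pd.factorize(df["expiry"])

# sort=True: codes follow the sorted unique values,
# so the smallest value (100) receives code 0.
codes_sorted, uniques = pd.factorize(df["expiry"], sort=True)

# Keep rows belonging to the two smallest distinct values.
earliest_two = df[codes_sorted < 2]
print(earliest_two)
```

<p>With <code>sort=True</code> the rows stay in their original order; only the numbering of the codes changes.</p>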