<p>First filter the rows with <a href="http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing" rel="nofollow noreferrer"><code>boolean indexing</code></a>, then aggregate <code>sum</code> for the <code>supply</code> column only. Because the filter may remove every row for some <code>id</code> values, add <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.reindex.html" rel="nofollow noreferrer"><code>reindex</code></a> using the <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.unique.html" rel="nofollow noreferrer"><code>unique</code></a> values of the original column. Finally, call <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.reset_index.html" rel="nofollow noreferrer"><code>reset_index</code></a> to convert the <code>Series</code> into a <code>DataFrame</code>, and add the new <code>y</code> column, using <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pop.html" rel="nofollow noreferrer"><code>pop</code></a> to extract <code>supply</code>:</p>
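<pre><code>import numpy as np

# keep only rows with 0 < days <= 60
df1 = df[(df.days > 0) & (df.days <= 60)]
# sum supply per id, then restore ids that were filtered out completely
df2 = df1.groupby('id')['supply'].sum().reindex(df['id'].unique(), fill_value=-1).reset_index()
# replace supply with a 0/1 flag
df2['y'] = np.where(df2.pop('supply') > 100, 1, 0)
print(df2)
   id  y
0   1  1
1   2  0
2   3  0
</code></pre>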
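<p>To see why the <code>reindex</code> step is needed, here is a minimal self-contained sketch. The sample <code>df</code> is hypothetical (it is not the data from the question), chosen so that every row for <code>id</code> 3 falls outside the <code>days</code> window:</p>

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not taken from the original question.
df = pd.DataFrame({
    "id":     [1, 1, 1, 2, 2, 3],
    "days":   [10, 30, 90, 5, 70, 120],
    "supply": [60, 50, 500, 40, 300, 200],
})

# Keep only rows with 0 < days <= 60; every row for id 3 is dropped here.
df1 = df[(df["days"] > 0) & (df["days"] <= 60)]

# Per-id totals of the surviving rows; id 3 has no entry at this point.
totals = df1.groupby("id")["supply"].sum()

# Restore the missing ids with a placeholder total of -1,
# which can never exceed the threshold of 100.
totals = totals.reindex(df["id"].unique(), fill_value=-1)

# Turn the Series into a DataFrame and swap supply for a 0/1 flag.
df2 = totals.reset_index()
df2["y"] = np.where(df2.pop("supply") > 100, 1, 0)
print(df2)
```

<p>Without the <code>reindex</code> call, <code>id</code> 3 would simply be absent from the result rather than appearing with <code>y</code> equal to 0.</p>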
<p>EDIT: If you need to drop the rows for <code>id</code> values that were filtered out:</p>
<pre><code>df1 = df[(df.days > 0) & (df.days <= 60)]
df2 = df1.groupby('id', as_index=False)['supply'].sum()
df2['y'] = np.where(df2.pop('supply') > 100, 1, 0)
print(df2)
   id  y
0   1  1
1   2  0
</code></pre>
<p>Alternative solutions:</p>
<pre><code>df2 = (df.query("0 < days <= 60")
         .groupby('id')['supply'].sum()
         .reindex(df['id'].unique(), fill_value=-1)
         .rename('y')
         .gt(100)
         .astype(int)
         .reset_index()
       )
print(df2)
   id  y
0   1  1
1   2  0
2   3  0
</code></pre>
<hr/>
<pre><code>df2 = (df.query("0 < days <= 60")
         .groupby('id')['supply'].sum()
         .rename('y')
         .gt(100)
         .astype(int)
         .reset_index()
       )
print(df2)
   id  y
0   1  1
1   2  0
</code></pre>
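<p>The two chained variants differ only in the <code>reindex</code> step, so they could be folded into one helper. The sketch below is an illustration, not part of the original answer: the function name <code>flag_supply</code>, its parameters and the sample data are all hypothetical. It also reindexes the 0/1 flags rather than the sums, so the fill value can be 0 instead of the -1 sentinel:</p>

```python
import pandas as pd

def flag_supply(df, threshold=100, keep_all_ids=True):
    """Hypothetical helper: one row per id, y = 1 if supply summed over
    0 < days <= 60 exceeds threshold, otherwise 0."""
    flags = (df.query("0 < days <= 60")
               .groupby("id")["supply"].sum()
               .gt(threshold)
               .astype(int)
               .rename("y"))
    if keep_all_ids:
        # ids with no rows inside the window get y = 0 directly
        flags = flags.reindex(df["id"].unique(), fill_value=0)
    return flags.reset_index()

# Hypothetical sample data; not taken from the original question.
df = pd.DataFrame({
    "id":     [1, 1, 1, 2, 2, 3],
    "days":   [10, 30, 90, 5, 70, 120],
    "supply": [60, 50, 500, 40, 300, 200],
})

all_ids = flag_supply(df)                        # keeps id 3 with y = 0
only_seen = flag_supply(df, keep_all_ids=False)  # drops id 3
print(all_ids)
print(only_seen)
```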