Flag the first non-zero value in a column as 1 and all others as 0, grouped by multiple columns



1 answer

Anonymous, 1 day ago

Start with a simple flag that records whether the value is set (non-zero):
<pre><code>df = df.assign(FLAG=df.Value.where(df.Value == 0, 1))
df
#     Grp Org1 Org2  Value  FLAG
# 0     1    x    a      0     0
# 1     1    x    a      0     0
# 2     1    y    b      3     1
# 3     1    y    b      1     1
# 4     2    z    c      0     0
# 5     2    y    b      1     1
# 6     2    z    c      0     0
# 7     2    z    c      5     1
# 8     3    x    a      0     0
# 9     3    y    b      0     0
# 10    3    y    b      0     0
# 11    4    z    c      1     1
# 12    4    x    a      1     1
# 13    4    x    a      1     1
</code></pre>
Then use <code>groupby</code> to work on each group independently. Within a group, the first flag that is set can be found by applying <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.cummax.html" rel="nofollow noreferrer">pd.Series.cummax</a> followed by <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.diff.html" rel="nofollow noreferrer">pd.Series.diff</a>:
<pre><code>flag = df.groupby(['Grp', 'Org1', 'Org2'])['FLAG'].transform(lambda x: x.cummax().diff())
df['FLAG'] = flag.where(flag.notnull(), df['FLAG']).astype(int)
df
#     Grp Org1 Org2  Value  FLAG
# 0     1    x    a      0     0
# 1     1    x    a      0     0
# 2     1    y    b      3     1
# 3     1    y    b      1     0
# 4     2    z    c      0     0
# 5     2    y    b      1     1
# 6     2    z    c      0     0
# 7     2    z    c      5     1
# 8     3    x    a      0     0
# 9     3    y    b      0     0
# 10    3    y    b      0     0
# 11    4    z    c      1     1
# 12    4    x    a      1     1
# 13    4    x    a      1     0
</code></pre>
<code>cummax</code> turns every entry after the first <code>1</code> into <code>1</code> as well, so <code>diff</code> is <code>0</code> everywhere except at the first step from <code>0</code> to <code>1</code>. The first row of each group has no predecessor, so <code>diff</code> returns <code>NaN</code> there; the <code>where</code> call fills those positions with the original flag, which is already correct for a first row.
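For anyone who wants to run the two steps above end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the printed output in the answer. The helper name `flag_first_nonzero` and its parameters are hypothetical and only there for illustration, and `ne(0)` and `fillna` are used as equivalent stand-ins for the answer's two `where` calls.

```python
import pandas as pd

# Sample data reconstructed from the table printed in the answer.
df = pd.DataFrame({
    "Grp":   [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "Org1":  list("xxyyzyzzxyyzxx"),
    "Org2":  list("aabbcbccabbcaa"),
    "Value": [0, 0, 3, 1, 0, 1, 0, 5, 0, 0, 0, 1, 1, 1],
})


def flag_first_nonzero(frame, keys, value_col="Value"):
    """Hypothetical helper: 1 on the first non-zero row of each group, else 0."""
    out = frame.copy()
    # Step 1: plain 0/1 flag, 1 wherever the value is non-zero.
    out["FLAG"] = out[value_col].ne(0).astype(int)
    # Step 2: within each group, cummax keeps the flag "on" once it has been
    # set, so diff is 1 only at the 0 -> 1 step and 0 everywhere else.
    step = out.groupby(keys)["FLAG"].transform(lambda x: x.cummax().diff())
    # diff leaves NaN on the first row of each group; fall back to the plain
    # flag there, since a first row is trivially the first occurrence.
    out["FLAG"] = step.fillna(out["FLAG"]).astype(int)
    return out


result = flag_first_nonzero(df, ["Grp", "Org1", "Org2"])
print(result)
```

The grouping keys are passed in as a list, so the same sketch should work for any number of grouping columns, provided the rows are already in the order in which "first" is meant.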
