Answers to: How to change DataFrame values when you need to select different columns for each DataFrame



1 answer

Anonymous, 1 day ago

Use:

<pre><code>df = df.mask(df.cumsum(axis=1).ge(1).cumsum(axis=1).isin([2,3,4]), 0)
print (df)
   W1  W2  W3  W4  W5  W6  W7  W8
0   0   0   1   0   0   0   1   1
1   0   0   1   0   0   0   1   1
2   0   1   0   0   0   1   0   0
3   1   0   0   0   1   1   0   1
</code></pre>

Explanation:

Take the cumulative sum along each row with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.cumsum.html" rel="nofollow noreferrer"><code>DataFrame.cumsum</code></a>:

<pre><code>print (df.cumsum(axis=1))
   W1  W2  W3  W4  W5  W6  W7  W8
0   0   0   1   1   2   3   4   5
1   0   0   1   1   1   2   3   4
2   0   1   1   1   2   3   3   3
3   1   1   1   1   2   3   3   4
</code></pre>

Compare with <code>>=1</code> using <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.ge.html" rel="nofollow noreferrer"><code>DataFrame.ge</code></a>, so that every column from the first 1 onward is <code>True</code>:

<pre><code>print (df.cumsum(axis=1).ge(1))
      W1     W2    W3    W4    W5    W6    W7    W8
0  False  False  True  True  True  True  True  True
1  False  False  True  True  True  True  True  True
2  False   True  True  True  True  True  True  True
3   True   True  True  True  True  True  True  True
</code></pre>

Apply <code>cumsum</code> again to the boolean mask, which numbers the columns starting from the first 1:

<pre><code>print (df.cumsum(axis=1).ge(1).cumsum(axis=1))
   W1  W2  W3  W4  W5  W6  W7  W8
0   0   0   1   2   3   4   5   6
1   0   0   1   2   3   4   5   6
2   0   1   2   3   4   5   6   7
3   1   2   3   4   5   6   7   8
</code></pre>

Test the counter against <code>2,3,4</code> to select the next 3 values; the first 1 itself (counter value 1) is skipped:

<pre><code>print (df.cumsum(axis=1).ge(1).cumsum(axis=1).isin([2,3,4]))
      W1     W2     W3     W4     W5     W6     W7     W8
0  False  False  False   True   True   True  False  False
1  False  False  False   True   True   True  False  False
2  False  False   True   True   True  False  False  False
3  False   True   True   True  False  False  False  False
</code></pre>

If you want to define the <code>n</code> and <code>DIFF</code> values yourself, use a more dynamic solution:

<pre><code>import pandas as pd

df = pd.DataFrame({'W1': [0, 0, 0, 0],
                   'W2': [0, 0, 1, 0],
                   'W3': [1, 1, 0, 0],
                   'W4': [0, 0, 0, 0],
                   'W5': [1, 0, 1, 0],
                   'W6': [1, 1, 1, 0],
                   'W7': [1, 1, 0, 0],
                   'W8': [1, 1, 0, 1]})
print (df)
   W1  W2  W3  W4  W5  W6  W7  W8
0   0   0   1   0   1   1   1   1
1   0   0   1   0   0   1   1   1
2   0   1   0   0   1   1   0   0
3   0   0   0   0   0   0   0   1
</code></pre>

<hr/>

<pre><code>DIFF = 4
n = 3

# select the first n columns by position
subset = df.iloc[:, :n]
# replace 0 with NaN, back-fill along each row, reverse the column order,
# then cumsum: the last 1 in the subset ends up with the value 1
last_1 = subset.mask(subset == 0).bfill(axis=1).iloc[:, ::-1].cumsum(axis=1)
print (last_1)
    W3   W2   W1
0  1.0  2.0  3.0
1  1.0  2.0  3.0
2  NaN  1.0  2.0
3  NaN  NaN  NaN

# add the missing columns back, then forward-fill so that
# every column after that last 1 also holds 1
df1 = last_1.reindex(index=df.index, columns=df.columns).ffill(axis=1)
print (df1)
    W1   W2   W3   W4   W5   W6   W7   W8
0  3.0  2.0  1.0  1.0  1.0  1.0  1.0  1.0
1  3.0  2.0  1.0  1.0  1.0  1.0  1.0  1.0
2  2.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN

# compare with 1 and take the cumulative sum
print (df1.eq(1).cumsum(axis=1))
   W1  W2  W3  W4  W5  W6  W7  W8
0   0   0   1   2   3   4   5   6
1   0   0   1   2   3   4   5   6
2   0   1   2   3   4   5   6   7
3   0   0   0   0   0   0   0   0
</code></pre>

<hr/>

<pre><code># finally, zero the cells whose counter falls in 2..DIFF+1
df = df.mask(df1.eq(1).cumsum(axis=1).isin(range(2, DIFF + 2)), 0)
print (df)
   W1  W2  W3  W4  W5  W6  W7  W8
0   0   0   1   0   0   0   0   1
1   0   0   1   0   0   0   0   1
2   0   1   0   0   0   0   0   0
3   0   0   0   0   0   0   0   1
</code></pre>
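The one-line chain from the first snippet can also be wrapped in a small helper that makes the window size explicit. This is only a minimal sketch and not part of the original answer: the function name `zero_after_first_one` and the parameter `width` are hypothetical, the input frame is reconstructed from the `cumsum` output shown in the explanation, and the frame is assumed to hold only 0/1 values.

```python
import pandas as pd


def zero_after_first_one(df: pd.DataFrame, width: int = 3) -> pd.DataFrame:
    """Zero the `width` cells that follow the first 1 in each row.

    Hypothetical helper; assumes df contains only 0/1 values.
    """
    # True from the first 1 onward in each row.
    seen_one = df.cumsum(axis=1).ge(1)
    # Counter: 1 at the first 1, then 2, 3, ... for the columns after it.
    position = seen_one.cumsum(axis=1)
    # Counter values 2..width+1 are the `width` columns after the first 1.
    to_zero = position.isin(range(2, width + 2))
    return df.mask(to_zero, 0)


# Input reconstructed from the cumsum output in the answer (assumption).
df = pd.DataFrame(
    [[0, 0, 1, 0, 1, 1, 1, 1],
     [0, 0, 1, 0, 0, 1, 1, 1],
     [0, 1, 0, 0, 1, 1, 0, 0],
     [1, 0, 0, 0, 1, 1, 0, 1]],
    columns=[f"W{i}" for i in range(1, 9)],
)

result = zero_after_first_one(df, width=3)
print(result)
```

With `width=3` the helper should reproduce the result of the `isin([2,3,4])` version; rows that contain no 1 at all keep a counter of 0 everywhere and are returned unchanged.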
