查找每行中特定列的最大值，并将相应列的值与另一列的值相匹配问题的回答

查找每行中特定列的最大值，并将相应列的值与另一列的值相匹配

回答此问题可获得 20 贡献值，回答如果被采纳可获得 50 分。

0 条评论
分类：Python问答

默认排序时间排序

1 个回答

匿名 1天前

　擅长：python、mysql、java

希望我得到了正确的条件；下面的计算依赖于赋值前的索引；因此，我创建了一个长数据帧，创建了长数据帧的条件，将大小减少到唯一索引，并将结果分配给原始数据帧： 创建一个长数据框来计算条件（为了方便起见，我在这里使用<a href="https://pyjanitor-devs.github.io/pyjanitor/" rel="nofollow noreferrer">pyjanitor</a><a href="https://pyjanitor-devs.github.io/pyjanitor/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer" rel="nofollow noreferrer">pivot_longer</a>对其进行重塑；您可以在不使用PyGatitor的情况下通过几个步骤来完成这项工作）： <pre class="lang-py prettyprint-override"><code># pip install git+https://github.com/pyjanitor-devs/pyjanitor.git import janitor as jn import pandas as pd res = (df.pivot_longer(['ID', 'Hour'], names_pattern= [r'.+\d$', r'.+Score$'], names_to = ['cus', 'score'], ignore_index = False) ) print(res) ID Hour cus score 0 1 11 8 0.62 1 2 13 15 0.57 2 3 16 9 0.67 3 4 8 11 0.58 0 1 11 10 0.59 1 2 13 16 0.57 2 3 16 14 0.54 3 4 8 8 0.55 0 1 11 11 0.59 1 2 13 18 0.57 2 3 16 16 0.48 3 4 8 12 0.43 0 1 11 14 0.54 1 2 13 13 0.57 2 3 16 12 0.34 3 4 8 17 0.25 </code></pre> 为<code>match1</code>创建条件： <pre class="lang-py prettyprint-override"><code># get booleans for max and no duplicates res = res.sort_index() max1 = res.groupby(level=0).score.transform('max') # max without duplicates cond1 = res.score.eq(max1).groupby(level=0).sum().eq(1) cond1 = cond1 & df.Hour.eq(df.Cus1) # max with duplicates cond2 = res.score.eq(max1).groupby(level=0).transform('sum').gt(1) cond2 &= res.Hour.eq(res.cus) cond2 = cond2.groupby(level=0).any() df['match1'] = np.where(cond1 | cond2, 'YES', 'NO') </code></pre> 匹配2： <pre class="lang-py prettyprint-override"><code> second_largest = (res.sort_values('score', ascending=False) .groupby(level=0, sort = False) .score .transform('nth', 1) ) second_largest = second_largest.sort_index() # second largest without duplicates cond_1 = res.score.eq(second_largest).groupby(level=0).sum().eq(1) cond_1 &= df.Hour.eq(df.Cus2) # second largest with duplicates cond_2 = res.score.eq(second_largest).groupby(level=0).transform('sum').gt(1) cond_2 &= res.Hour.eq(res.cus) cond_2 = cond_2.groupby(level=0).any() df['match2'] = np.where(cond_1 | cond_2, 'YES', 'NO') </code></pre> <pre class="lang-py prettyprint-override"><code>df ID Hour Cus1 Cus2 Cus3 Cus4 Cus1_Score Cus2_Score Cus3_Score Cus4_Score match1 match2 0 1 11 8 10 11 14 0.62 0.59 0.59 0.54 NO YES 1 2 13 15 16 18 13 0.57 0.57 0.57 0.57 YES YES 2 3 16 9 14 16 12 0.67 0.54 0.48 0.34 NO NO 3 4 8 11 8 12 17 0.58 0.55 0.43 0.25 NO YES </code></pre>

查找每行中特定列的最大值，并将相应列的值与另一列的值相匹配

1 个回答

相关Python问题