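<p>It looks like you can select rows by the per-group <code>idxmin</code>, regardless of sort order, and update <code>Label</code> based on that. You can do this with <code>loc</code>, <code>np.where</code>, <code>mask</code>, or <code>where</code>, as follows:</p>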
<pre><code>df.loc[df.groupby(['PersonID', 'Name', 'RuleID'])['RuleNumber'].idxmin(), 'Label'] = 'MAIN'
</code></pre>
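<p>As a minimal, self-contained sketch of the <code>loc</code> approach: the sample frame below is reconstructed from the output shown in this answer, and filling every row with <code>'REL'</code> first is an assumption about the desired default, since the <code>loc</code> line only writes the <code>'MAIN'</code> rows.</p>

```python
import pandas as pd

# Sample data reconstructed from the output in this answer (an assumption
# about what the original input looked like).
df = pd.DataFrame({
    "PersonID": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Name": ["Jan"] * 3 + ["Don"] * 3 + ["Joe"] * 3,
    "RuleID": [55, 55, 55, 3, 3, 3, 10, 10, 10],
    "RuleNumber": [3, 4, 5, 1, 2, 3, 234, 567, 999],
})

# Assumed default: every row starts as 'REL'.
df["Label"] = "REL"

# idxmin() returns one index label per group: the row holding the smallest
# RuleNumber in that group.
main_idx = df.groupby(["PersonID", "Name", "RuleID"])["RuleNumber"].idxmin()

# Overwrite only those rows.
df.loc[main_idx, "Label"] = "MAIN"
```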
<p>Or, as you attempted, use <code>np.where</code>:</p>
<pre><code>import numpy as np

# transform('idxmin') broadcasts each group's minimum-row index to every row
df['Label'] = np.where(
    df.index == df.groupby(['PersonID', 'Name', 'RuleID'])['RuleNumber'].transform('idxmin'),
    'MAIN', 'REL')
df
Out[1]:
   PersonID Name Label  RuleID  RuleNumber
0         1  Jan  MAIN      55           3
1         1  Jan   REL      55           4
2         1  Jan   REL      55           5
3         2  Don  MAIN       3           1
4         2  Don   REL       3           2
5         2  Don   REL       3           3
6         3  Joe  MAIN      10         234
7         3  Joe   REL      10         567
8         3  Joe   REL      10         999
</code></pre>
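<p>Using <code>mask</code>, or its inverse <code>where</code>, also works. Both start from an existing <code>Label</code> column and only overwrite the matching rows:</p>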
<pre><code>df['Label'] = df['Label'].mask(
    df.index == df.groupby(['PersonID', 'Name', 'RuleID'])['RuleNumber'].transform('idxmin'),
    'MAIN')
</code></pre>
<p>Or:</p>
<pre><code>df['Label'] = df['Label'].where(
    df.index != df.groupby(['PersonID', 'Name', 'RuleID'])['RuleNumber'].transform('idxmin'),
    'MAIN')
</code></pre>
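<p>The three <code>transform('idxmin')</code> variants share the same boolean condition, so one possible refinement is to compute it once and reuse it. The sketch below does that on deliberately unsorted rows to illustrate that sort order does not matter. The sample data and the names <code>min_idx</code> and <code>is_main</code> are illustrative assumptions, not part of the original question.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, with rows deliberately out of order.
df = pd.DataFrame({
    "PersonID": [1, 2, 1, 3, 2, 1, 3, 2, 3],
    "Name": ["Jan", "Don", "Jan", "Joe", "Don", "Jan", "Joe", "Don", "Joe"],
    "RuleID": [55, 3, 55, 10, 3, 55, 10, 3, 10],
    "RuleNumber": [5, 2, 3, 999, 1, 4, 234, 3, 567],
})

# For every row, the index label of the minimum RuleNumber in its group.
min_idx = df.groupby(["PersonID", "Name", "RuleID"])["RuleNumber"].transform("idxmin")

# True exactly on the rows that are their own group's minimum.
is_main = min_idx.to_numpy() == df.index.to_numpy()

# np.where builds the whole column in one step.
df["Label"] = np.where(is_main, "MAIN", "REL")

# Series.mask gives the same result when starting from an all-'REL' column.
via_mask = pd.Series("REL", index=df.index).mask(is_main, "MAIN")
```

<p>Comparing the underlying arrays rather than the index and the Series directly is only a stylistic choice here; it avoids relying on how pandas aligns an <code>Index</code> with a <code>Series</code> in a comparison.</p>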