<p>Use <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.DataFrameGroupBy.transform.html#pandas-core-groupby-dataframegroupby-transform" rel="nofollow noreferrer"><code>groupby.transform</code></a> + <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html#pandas-dataframe-drop-duplicates" rel="nofollow noreferrer"><code>drop_duplicates</code></a> + <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.reset_index.html#pandas-dataframe-reset-index" rel="nofollow noreferrer"><code>reset_index</code></a>:</p>
<pre><code>cols = ["COD", "TEC", "SET"]
df_cut = (
    df[df['AZIM'].eq(
        df.groupby(cols)['AZIM'].transform(lambda x: x.mode().max())
    )].drop_duplicates(cols).reset_index(drop=True)
)
</code></pre>
<p><code>df_cut</code>:</p>
<pre><code>          COD  STATE  CITY  AZIM  SET  TEC
0  ALAAD_0001     AL   MAC     0    1    4
1  ALAAD_0001     AL   ARA   120    2    4
2  ALAAD_0001     AL   MAC   240    3    4
3  BAPID_0001     BA   SAL    20    1    2
4  BAPID_0001     BA   VIT   100    2    2
5  BAPID_0001     BA   SAL   250    3    2
6  CEMBC_0003     CE   FOR    90    1    4
7  CEMBC_0003     CE   CAU   160    2    4
8  CEMBC_0003     CE   FOR   280    3    4
</code></pre>
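<p>To try the pipeline end to end, here is a minimal sketch on a small made-up frame. The values below are hypothetical and are not the data from the question; only the column names <code>COD</code>, <code>CITY</code>, <code>AZIM</code>, <code>SET</code> and <code>TEC</code> are taken from it.</p>

```python
import pandas as pd

# Hypothetical sample data (NOT the frame from the question).
df = pd.DataFrame({
    "COD":  ["A", "A", "A", "A", "B", "B"],
    "CITY": ["X", "Y", "Y", "X", "Z", "Z"],
    "AZIM": [10, 20, 20, 30, 5, 7],
    "SET":  [1, 1, 1, 2, 1, 1],
    "TEC":  [4, 4, 4, 4, 2, 2],
})
cols = ["COD", "TEC", "SET"]

# Largest mode of AZIM per group, repeated on every row of the group.
mode_max = df.groupby(cols)["AZIM"].transform(lambda x: x.mode().max())

# Keep rows matching their group's largest mode, then one row per group.
df_cut = (
    df[df["AZIM"].eq(mode_max)]
    .drop_duplicates(cols)
    .reset_index(drop=True)
)
print(df_cut)
```

<p>In this sketch the group <code>("B", 2, 1)</code> has a tie between 5 and 7, so the larger value is the one that survives.</p>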
<hr/>
<p>Explanation:</p>
<p><a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.DataFrameGroupBy.transform.html#pandas-core-groupby-dataframegroupby-transform" rel="nofollow noreferrer"><code>groupby.transform</code></a> broadcasts the largest mode of each group to every row of that group:</p>
<pre><code>df.groupby(["COD", "TEC", "SET"])['AZIM'].transform(lambda x: x.mode().max())
</code></pre>
<pre><code>0 0
1 120
2 120
3 240
4 240
5 20
6 20
7 100
8 100
9 250
10 250
11 250
12 90
13 90
14 160
15 160
16 160
17 280
Name: AZIM, dtype: int64
</code></pre>
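<p>The reason for <code>x.mode().max()</code> rather than <code>x.mode()</code> alone is that <code>Series.mode</code> returns a Series that can hold several values when there is a tie. A small illustration on a made-up Series (the values are hypothetical):</p>

```python
import pandas as pd

# Hypothetical values: 0 and 120 each appear twice, 240 once.
s = pd.Series([0, 120, 120, 0, 240])

modes = s.mode()         # all most-frequent values, sorted ascending
tie_break = modes.max()  # pick the largest of the tied modes

print(modes.tolist(), tie_break)
```

<p>Using <code>.min()</code> or <code>.iloc[0]</code> instead would resolve ties toward the smallest value; which rule is appropriate depends on the data.</p>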
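<p>Comparing this result with the <code>AZIM</code> column creates a boolean mask that marks the rows where <code>AZIM</code> equals the largest mode of its group:</p>
<pre><code>df['AZIM'].eq(
    df.groupby(["COD", "TEC", "SET"])['AZIM']
      .transform(lambda x: x.mode().max())
)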
</code></pre>
<pre><code>0 True
1 False
2 True
3 False
4 True
5 True
6 True
7 True
8 True
9 False
10 True
11 True
12 True
13 False
14 True
15 True
16 False
17 True
Name: AZIM, dtype: bool
</code></pre>
<p>This mask is then used to filter <code>df</code>:</p>
<pre><code>df[df['AZIM'].eq(
    df.groupby(["COD", "TEC", "SET"])['AZIM']
      .transform(lambda x: x.mode().max())
)]
</code></pre>
<pre><code>           COD  STATE  CITY  AZIM  SET  TEC
0   ALAAD_0001     AL   MAC     0    1    4
2   ALAAD_0001     AL   ARA   120    2    4
4   ALAAD_0001     AL   MAC   240    3    4
5   BAPID_0001     BA   SAL    20    1    2
6   BAPID_0001     BA   SAL    20    1    2
7   BAPID_0001     BA   VIT   100    2    2
8   BAPID_0001     BA   SAL   100    2    2
10  BAPID_0001     BA   SAL   250    3    2
11  BAPID_0001     BA   SAL   250    3    2
12  CEMBC_0003     CE   FOR    90    1    4
14  CEMBC_0003     CE   CAU   160    2    4
15  CEMBC_0003     CE   FOR   160    2    4
17  CEMBC_0003     CE   FOR   280    3    4
</code></pre>
<p>Finally, <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html#pandas-dataframe-drop-duplicates" rel="nofollow noreferrer"><code>drop_duplicates</code></a> + <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.reset_index.html#pandas-dataframe-reset-index" rel="nofollow noreferrer"><code>reset_index</code></a> keep only the first row of each group and clean up the index:</p>
<pre><code>df[df['AZIM'].eq(
    df.groupby(["COD", "TEC", "SET"])['AZIM']
      .transform(lambda x: x.mode().max())
)].drop_duplicates(["COD", "TEC", "SET"]).reset_index(drop=True)
</code></pre>
<pre><code>          COD  STATE  CITY  AZIM  SET  TEC
0  ALAAD_0001     AL   MAC     0    1    4
1  ALAAD_0001     AL   ARA   120    2    4
2  ALAAD_0001     AL   MAC   240    3    4
3  BAPID_0001     BA   SAL    20    1    2
4  BAPID_0001     BA   VIT   100    2    2
5  BAPID_0001     BA   SAL   250    3    2
6  CEMBC_0003     CE   FOR    90    1    4
7  CEMBC_0003     CE   CAU   160    2    4
8  CEMBC_0003     CE   FOR   280    3    4
</code></pre>
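<p>When several rows in a group share the largest mode, which one survives depends on the <code>keep</code> argument of <code>drop_duplicates</code> (default <code>"first"</code>). A small sketch on a made-up, already filtered frame (the name <code>filtered</code> and its values are hypothetical):</p>

```python
import pandas as pd

# Hypothetical filtered rows: group "A" still has two candidates.
filtered = pd.DataFrame(
    {"COD": ["A", "A", "B"], "CITY": ["X", "Y", "Z"], "AZIM": [20, 20, 7]},
    index=[1, 2, 5],
)

# Default keep="first": the earliest row of each group is kept.
first = filtered.drop_duplicates(["COD"]).reset_index(drop=True)

# keep="last": the latest row of each group is kept instead.
last = filtered.drop_duplicates(["COD"], keep="last").reset_index(drop=True)

print(first["CITY"].tolist(), last["CITY"].tolist())
```

<p>If the other columns (such as <code>CITY</code>) matter, sorting the frame before <code>drop_duplicates</code> is one way to control which row is retained.</p>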