<p>Consider using the <code>transform</code> function within a groupby. If your data looks something like this:</p>
<pre><code>import pandas as pd

# First index level: the sweep each sample belongs to
sweep = ["sweep1", "sweep1", "sweep1", "sweep1",
         "sweep2", "sweep2", "sweep2", "sweep2",
         "sweep3", "sweep3", "sweep3", "sweep3",
         "sweep4", "sweep4", "sweep4", "sweep4"]
Time = [0.009845, 0.002186, 0.006001, 0.00265,
        0.003832, 0.005627, 0.002625, 0.004159,
        0.00388, 0.008107, 0.00813, 0.004813,
        0.003205, 0.003225, 0.00413, 0.001202]
Primary = [-2832.013203, -2478.839133, -2100.671551, -2057.188346,
           -2605.402055, -2030.195497, -2300.209967, -2504.817095,
           -2865.320903, -2456.0049, -2542.132906, -2405.657053,
           -2780.140743, -2351.743053, -2232.340363, -2820.27356]
# Second index level: position of the sample within its sweep
s_count = [0, 1, 2, 3,
           0, 1, 2, 3,
           0, 1, 2, 3,
           0, 1, 2, 3]
df = pd.DataFrame({'Time': Time,
                   'Primary': Primary}, index=[sweep, s_count])
</code></pre>
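<p>Then you can write a very simple transform function that, for each group of data (grouped by the sweep index), returns the row where the minimum value of "Primary" is located. This can be done with simple boolean slicing. It would look like this:</p>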
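<pre><code>def trans_function(df):
    # Boolean slicing: keep the row(s) where 'Primary' equals the group's minimum
    return df[df['Primary'] == df['Primary'].min()]
</code></pre>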
<p>Then, to use this function, just call it inside the <code>transform</code> method:</p>
<pre><code>df.groupby(level = 0).transform(trans_function)
</code></pre>
<p>This gave me the following output:</p>
<pre><code>              Primary      Time
sweep1 0 -2832.013203  0.009845
sweep2 0 -2605.402055  0.003832
sweep3 0 -2865.320903  0.003880
sweep4 3 -2820.273560  0.001202
</code></pre>
<p>Obviously, if you need to, you could incorporate this into a function that operates on some subset of the data.</p>
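<p>A hedged aside: recent pandas releases document <code>transform</code> as returning a result indexed like its input, so a function that filters rows may not behave as shown above on a current install. Below is a minimal sketch of the same boolean-slicing idea that works with that behaviour. The small frame <code>small</code> and the names <code>group_min</code> and <code>min_rows</code> are illustrative assumptions, not part of the original answer.</p>

```python
import pandas as pd

# Small illustrative frame (a subset of the values used above).
index = pd.MultiIndex.from_tuples(
    [("sweep1", 0), ("sweep1", 1), ("sweep4", 2), ("sweep4", 3)],
    names=["sweep", "s_count"],
)
small = pd.DataFrame(
    {
        "Time": [0.009845, 0.002186, 0.00413, 0.001202],
        "Primary": [-2832.013203, -2478.839133, -2232.340363, -2820.27356],
    },
    index=index,
)

# transform("min") broadcasts each group's minimum back onto every row
# of that group, so the result has the same length and index as `small`.
group_min = small.groupby(level=0)["Primary"].transform("min")

# Boolean slicing: keep only the rows whose 'Primary' equals their group's minimum.
min_rows = small[small["Primary"] == group_min]
print(min_rows)
```

<p>Because the mask is built from a same-sized Series, the filtering happens outside the groupby. If a group had tied minima, every tied row would be kept.</p>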
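<p>Another approach is to index into each group using <code>idxmin()</code>, which returns the index label of the minimum (in current pandas, <code>argmin()</code> returns an integer position rather than a label, so it no longer works with <code>.loc</code> here). I tried to do this with <code>transform</code>, but it just returned the whole DataFrame. I don't know why that is, but it does work with <code>apply</code>:</p>
<pre><code>def trans_function2(df):
    # idxmin() gives the index label of the smallest 'Primary' value in the group
    return df.loc[df['Primary'].idxmin()]

df.groupby(level = 0).apply(trans_function2)
</code></pre>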
<p>This again gave me:</p>
<pre><code>            Primary      Time
sweep1 -2832.013203  0.009845
sweep2 -2605.402055  0.003832
sweep3 -2865.320903  0.003880
sweep4 -2820.273560  0.001202
</code></pre>
<p>I'm not quite sure why this function doesn't work with <code>transform</code>; perhaps someone will enlighten us.</p>
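<p>For comparison, here is a sketch that skips the custom function. It asks the groupby for the label of each group's minimum with <code>idxmin()</code> and passes those labels to <code>.loc</code>. The frame <code>small</code> and the names <code>labels</code> and <code>min_rows</code> are illustrative assumptions, not part of the original answer.</p>

```python
import pandas as pd

# Small illustrative frame (a subset of the values used above).
index = pd.MultiIndex.from_tuples(
    [("sweep1", 0), ("sweep1", 1), ("sweep4", 2), ("sweep4", 3)],
    names=["sweep", "s_count"],
)
small = pd.DataFrame(
    {
        "Time": [0.009845, 0.002186, 0.00413, 0.001202],
        "Primary": [-2832.013203, -2478.839133, -2232.340363, -2820.27356],
    },
    index=index,
)

# For each sweep, idxmin() returns the full index label, here a
# (sweep, s_count) tuple, of the row holding the smallest 'Primary'.
labels = small.groupby(level=0)["Primary"].idxmin()

# A list of tuples selects those complete MultiIndex keys.
min_rows = small.loc[labels.tolist()]
print(min_rows)
```

<p>Unlike the <code>apply</code> version, this keeps the second index level, so you can still see which sample within each sweep held the minimum.</p>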