<p>I think my approach is similar to Stephen Rauch's; the only difference is that I standardize/normalize the <code>Prices</code> within each group.</p>
<pre><code># Standardize or normalize the `Prices` per `ProductFamily` (absolute value)
df_std = df.groupby('ProductFamily').transform(lambda x: np.abs((x - x.mean()) / x.std()))
# We assume that any Price beyond one standard deviation is an outlier
outlier_mask = df_std['Prices'] > 1.0
# Split clean and outlier dataframes
df_clean = df[~outlier_mask]
df_outlier = df[outlier_mask]
</code></pre>
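<p>To make the idea concrete, here is a minimal, self-contained sketch on made-up data. The toy prices and the helper name <code>abs_zscore</code> are my own assumptions for illustration and are not from the question; only the column names <code>ProductFamily</code> and <code>Prices</code> follow the snippet above.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each family has four typical prices and one odd one.
df = pd.DataFrame({
    "ProductFamily": ["A"] * 5 + ["B"] * 5,
    "Prices": [10.0, 11.0, 9.0, 10.0, 30.0,
               100.0, 102.0, 98.0, 100.0, 40.0],
})


def abs_zscore(x):
    """Absolute z-score of a Series (pandas' std() is the sample std, ddof=1)."""
    return np.abs((x - x.mean()) / x.std())


# Standardize Prices within each ProductFamily, aligned to df's index.
price_z = df.groupby("ProductFamily")["Prices"].transform(abs_zscore)

# Flag anything more than one standard deviation from its group mean.
outlier_mask = price_z > 1.0

df_clean = df[~outlier_mask]
df_outlier = df[outlier_mask]

print(df_outlier)
```

<p>On this toy data only the prices 30 (family A) and 40 (family B) are flagged. Keep in mind that a one-standard-deviation cutoff is quite aggressive: for roughly normal data it would flag about a third of the rows, so you may want a larger threshold such as 2 or 3 depending on your data.</p>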