Python Pandas: How do I average only the values greater than 1000 across multiple columns?



1 answer

Anonymous, 1 day ago

Skills: python, mysql, java

Try <code>filter()</code>, <code>gt()</code> and <code>mean()</code>:

<pre><code>out = df.filter(like='Revenue')
df['Average'] = out[out.gt(1000)].mean(axis=1)
</code></pre>

Output of <code>df</code>:

<pre><code>    Name  Revenue1  Revenue2  Revenue3  Average
0  Peter      1000      2000      3000   2500.0
1   Jane      9000     10000      5000   8000.0
</code></pre>

Step by step:

First, <code>filter(like='Revenue')</code> selects every column whose name contains "Revenue". The result is a DataFrame holding only those columns, which we store in <code>out</code>:

<pre><code>out = df.filter(like='Revenue')
# output of the code above:
#    Revenue1  Revenue2  Revenue3
# 0      1000      2000      3000
# 1      9000     10000      5000
</code></pre>

Next, we test every value against the condition. This does not drop any rows; it compares element by element and returns a boolean DataFrame of the same shape:

<pre><code>out.gt(1000)  # your condition
# output of the code above:
#    Revenue1  Revenue2  Revenue3
# 0     False      True      True
# 1      True      True      True
</code></pre>

Now we pass that boolean DataFrame back to <code>out</code>. Wherever the mask is <code>True</code> the original value is kept, and wherever it is <code>False</code> the value becomes <code>NaN</code>. This is called boolean masking:

<pre><code>out[out.gt(1000)]
# output of the code above:
#    Revenue1  Revenue2  Revenue3
# 0       NaN      2000      3000
# 1    9000.0     10000      5000
</code></pre>

Finally, we call <code>mean()</code> with <code>axis=1</code> to average across each row. The <code>NaN</code> values are ignored, because <code>mean()</code> skips them by default (<code>skipna=True</code> in current pandas; older versions show the default as <code>skipna=None</code>, which behaves the same way):

<pre><code>out[out.gt(1000)].mean(axis=1)
# output of the code above:
# 0    2500.0
# 1    8000.0
# dtype: float64
</code></pre>

The last step assigns the result back to <code>df</code>:

<pre><code>df['Average'] = out[out.gt(1000)].mean(axis=1)
</code></pre>

Update:

If <code>df</code> has other numeric columns, such as "Income", and you want to include them in the calculation together with the "Revenue" columns, use a regular expression instead:

<pre><code>out = df.filter(regex='Revenue|Income')
df['Average'] = out[out.gt(1000)].mean(axis=1)
</code></pre>
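The answer never shows how <code>df</code> is built, so here is a minimal end-to-end sketch. The sample frame is an assumption, reconstructed from the output printed in the answer, and <code>out.where(mask)</code> is used as a more explicit spelling of <code>out[mask]</code>; treat it as an illustration rather than the asker's actual data.

```python
import pandas as pd

# Sample data reconstructed from the output shown in the answer (an assumption;
# the original question's DataFrame is not included on this page).
df = pd.DataFrame({
    "Name": ["Peter", "Jane"],
    "Revenue1": [1000, 9000],
    "Revenue2": [2000, 10000],
    "Revenue3": [3000, 5000],
})

# Select every column whose name contains "Revenue".
out = df.filter(like="Revenue")

# Keep values strictly greater than 1000 and replace the rest with NaN,
# then average each row. skipna=True is written out only to make the
# NaN handling explicit; it is already the default behaviour.
df["Average"] = out.where(out.gt(1000)).mean(axis=1, skipna=True)

print(df)
```

Note that <code>gt(1000)</code> is a strict comparison, so Peter's value of exactly 1000 is excluded; use <code>ge(1000)</code> if it should count. A row in which no value passes the condition ends up with <code>NaN</code> as its average.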
