Python DataFrame conditional sum

import pandas as pd

raw_data = {'Country': ['A', 'B', 'C', 'D', 'E'],
            'Region': ['X', 'X', 'X', 'Y', 'Y'],
            'Income': [100, 200, 300, 100, 200]}
incomeData = pd.DataFrame(raw_data, columns=['Country', 'Region', 'Income'])
regionGroup = incomeData.groupby(['Region'], as_index=False)
groupCount = lambda x: x.count()
#CountHighIncome = ?
aggregations = {
    'Country': {groupCount},
    'Income': {'min', 'max', 'mean', 'median'  #, CountHighIncome
    }
}
incomeSummary = regionGroup.agg(aggregations)
incomeSummary

1 answer

User

#1 · Posted 2024-05-19 22:47:41

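You can use a custom lambda that sums a boolean condition: each True counts as 1, so the sum is the number of rows that satisfy the condition. For Country, the groupCount lambda is dropped and only the built-in count is used: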

CountHighIncome = lambda x: (x > 100).sum()
aggregations = {
    'Country': {'count'},
    'Income': {'min', 'max', 'mean', 'median', CountHighIncome}
}
incomeSummary = regionGroup.agg(aggregations)
print(incomeSummary)
  Region Income                           Country
            max  min <lambda> mean median   count
0      X    300  100        2  200    200       3
1      Y    200  100        1  150    150       2
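
If you want a predictable column order and a readable name in place of `<lambda>`, here is a minimal sketch of the same idea using pandas named aggregation. This is not part of the answer above: it assumes a pandas version that supports named aggregation (0.25 or later), and the output column names (`CountryCount`, `IncomeMin`, and so on) and the helper `count_high_income` are made up for illustration.

```python
import pandas as pd

raw_data = {
    'Country': ['A', 'B', 'C', 'D', 'E'],
    'Region': ['X', 'X', 'X', 'Y', 'Y'],
    'Income': [100, 200, 300, 100, 200],
}
incomeData = pd.DataFrame(raw_data, columns=['Country', 'Region', 'Income'])


def count_high_income(x, threshold=100):
    # Hypothetical helper: (x > threshold) is a boolean Series, and summing it
    # treats each True as 1, so the result is the number of rows above threshold.
    return int((x > threshold).sum())


# Named aggregation: each keyword is an output column (names are illustrative),
# each value is a (source column, aggregation) pair.
incomeSummary = incomeData.groupby('Region', as_index=False).agg(
    CountryCount=('Country', 'count'),
    IncomeMin=('Income', 'min'),
    IncomeMax=('Income', 'max'),
    IncomeMean=('Income', 'mean'),
    IncomeMedian=('Income', 'median'),
    CountHighIncome=('Income', count_high_income),
)
print(incomeSummary)
```

With the sample data, `CountHighIncome` comes out as 2 for region X and 1 for region Y, matching the `<lambda>` column in the output above. The columns stay flat (no two-level header) and appear in the order the keywords are written.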
