Using Pandas to find the percentage of values in each bin per group

Posted 2024-10-01 00:15:35


Here is a small excerpt of my DataFrame:

city    trips_in_first_30_days  bins
0   King's Landing  4   (3, 125]
1   Astapor 0   NaN
2   Astapor 3   (2, 3]
3   King's Landing  9   (3, 125]
4   Winterfell  14  (3, 125]
5   Winterfell  2   (1, 2]
6   Astapor 1   (0, 1]
7   Winterfell  2   (1, 2]
8   Winterfell  2   (1, 2]
9   Winterfell  1   (0, 1]
10  Winterfell  1   (0, 1]
11  Winterfell  3   (2, 3]
12  Winterfell  1   (0, 1]
13  King's Landing  0   NaN
14  Astapor 1   (0, 1]
15  Winterfell  1   (0, 1]
16  King's Landing  1   (0, 1]
17  King's Landing  0   NaN
18  King's Landing  6   (3, 125]
19  King's Landing  0   NaN
20  Winterfell  1   (0, 1]
21  Astapor 1   (0, 1]
22  Winterfell  0   NaN
23  King's Landing  0   NaN
24  Astapor 4   (3, 125]
25  Winterfell  1   (0, 1]
26  Astapor 1   (0, 1]
27  Winterfell  3   (2, 3]
28  Winterfell  0   NaN
29  Astapor 1   (0, 1]
... ... ... ...
49970   Winterfell  2   (1, 2]
49971   King's Landing  0   NaN
49972   Winterfell  1   (0, 1]
49973   Astapor 2   (1, 2]
49974   Winterfell  1   (0, 1]
49975   Winterfell  11  (3, 125]
49976   King's Landing  0   NaN
49977   Astapor 4   (3, 125]
49978   Winterfell  1   (0, 1]
49979   Winterfell  0   NaN
49980   Astapor 1   (0, 1]
49981   Astapor 0   NaN
49982   King's Landing  0   NaN
49983   Winterfell  1   (0, 1]
49984   Winterfell  1   (0, 1]
49985   Astapor 1   (0, 1]
49986   Winterfell  0   NaN
49987   Winterfell  3   (2, 3]
49988   King's Landing  1   (0, 1]
49989   Winterfell  1   (0, 1]
49990   Astapor 1   (0, 1]
49991   Winterfell  0   NaN
49992   King's Landing  1   (0, 1]
49993   Astapor 3   (2, 3]
49994   Astapor 1   (0, 1]
49995   King's Landing  0   NaN
49996   Astapor 1   (0, 1]
49997   Winterfell  0   NaN
49998   Astapor 2   (1, 2]
49999   Astapor 0   NaN

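df['bins'] is a categorical column I introduced by using pd.cut to place trips_in_first_30_days into different bins.

Now I would like to know, when the data is grouped by city, what percentage of trips_in_first_30_days falls into each bin.

For example, for the city Astapor, what percentage of trips_in_first_30_days falls in (0, 1], what percentage in (1, 2], and so on?

Is this possible, given that bins has dtype category and I cannot perform arithmetic on it? How can I do this?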
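For readers who want to reproduce the bins column, here is a minimal sketch. The edges [0, 1, 2, 3, 125] are an assumption inferred from the interval labels in the excerpt, and the sample values are copied from its first seven rows. It also shows why rows with 0 trips end up as NaN: pd.cut builds right-closed intervals by default, so 0 lies outside (0, 1].

```python
import pandas as pd

# Hypothetical sample: the first seven trip counts from the excerpt above
trips = pd.Series([4, 0, 3, 9, 14, 2, 1], name="trips_in_first_30_days")

# Assumed edges, inferred from the labels (0, 1], (1, 2], (2, 3], (3, 125]
edges = [0, 1, 2, 3, 125]

# Right-closed intervals by default, so a value of 0 gets no bin (NaN)
bins = pd.cut(trips, bins=edges)

print(bins)
```

If the zero-trip rows should be counted in the first bin instead, passing include_lowest=True to pd.cut is one option; the first interval is then widened slightly to the left so that it contains 0.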

Edit

When I try the suggested solution:

[code snippet missing from the source]

The output is as follows:

bins            (0, 1]     (1, 2]     (2, 3]    (3, 125]
city
Astapor         31.105601  14.787710  6.973509  14.878432
King's Landing  22.408687  14.471866  7.541955  20.710760
Winterfell      28.689578  14.959719  8.017655  20.371957

The percentages for each city should sum to 100, but they do not.


1 answer
User
#1 · Posted 2024-10-01 00:15:35

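To do this, remember that the function passed to groupby(...).apply(...) can return a pd.Series object (the Pandas documentation calls this "flexible apply").

Try the following code: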
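As a small illustration of that behaviour, the sketch below uses a made-up three-row frame; the column names key and value and the helper summarize are hypothetical. When every group returns a Series with the same index, apply stacks them into a table with one row per group.

```python
import pandas as pd

# Hypothetical toy data, only to show the shape of the result
toy = pd.DataFrame({"key": ["a", "a", "b"], "value": [1, 3, 10]})

def summarize(group):
    # Return one Series per group; its index becomes the result's columns
    return pd.Series({"min": group["value"].min(), "max": group["value"].max()})

# Select the value column explicitly so the grouping column is not passed in
out = toy.groupby("key")[["value"]].apply(summarize)

print(out)
```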

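import pandas as pd

def calc_bin_percentage(group_df):
    trips = group_df["trips_in_first_30_days"]
    # Total trips per bin; observed=False keeps every bin, even empty ones
    bins_sum = trips.groupby(group_df["bins"], observed=False).sum()
    # Divide by this city's trip total (group_df.sum() would sum every column)
    return 100 * bins_sum / trips.sum()

# Every group returns the same bins index, so the result is already city x bin
df.groupby("city")[["trips_in_first_30_days", "bins"]].apply(calc_bin_percentage).fillna(0)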

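It does the job in two steps: first it splits the data by city, then it computes the percentage for each bin within each city.

The result should be a table with cities as rows and bins as columns.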
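The two steps can be tried end to end on a tiny frame. This is only a sketch: the seven rows are copied from the start of the excerpt in the question, and the bin edges [0, 1, 2, 3, 125] are an assumption inferred from the interval labels. It computes each bin's share of a city's total trips, which is one reading of the question.

```python
import pandas as pd

# Hypothetical mini-dataset modelled on the first seven rows of the excerpt
df = pd.DataFrame({
    "city": ["King's Landing", "Astapor", "Astapor", "King's Landing",
             "Winterfell", "Winterfell", "Astapor"],
    "trips_in_first_30_days": [4, 0, 3, 9, 14, 2, 1],
})
# Assumed edges; a value of 0 falls outside (0, 1] and becomes NaN
df["bins"] = pd.cut(df["trips_in_first_30_days"], bins=[0, 1, 2, 3, 125])

def calc_bin_percentage(group_df):
    trips = group_df["trips_in_first_30_days"]
    # Step 2: total trips per bin within one city, keeping empty bins
    per_bin = trips.groupby(group_df["bins"], observed=False).sum()
    return 100 * per_bin / trips.sum()

# Step 1: split by city, then apply the per-city calculation
result = (
    df.groupby("city")[["trips_in_first_30_days", "bins"]]
    .apply(calc_bin_percentage)
    .fillna(0)
)

print(result)
print(result.sum(axis=1))  # each city's shares add up to 100
```

Because the shares are weighted by trips, rows with 0 trips (NaN bin) add nothing to the numerator or the denominator, so each city's row sums to 100 as long as the city has at least one trip. A percentage based on row counts is a different quantity: if unbinned rows stay in the denominator, the row totals fall below 100.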
