Here is a small slice of my DataFrame:
city trips_in_first_30_days bins
0 King's Landing 4 (3, 125]
1 Astapor 0 NaN
2 Astapor 3 (2, 3]
3 King's Landing 9 (3, 125]
4 Winterfell 14 (3, 125]
5 Winterfell 2 (1, 2]
6 Astapor 1 (0, 1]
7 Winterfell 2 (1, 2]
8 Winterfell 2 (1, 2]
9 Winterfell 1 (0, 1]
10 Winterfell 1 (0, 1]
11 Winterfell 3 (2, 3]
12 Winterfell 1 (0, 1]
13 King's Landing 0 NaN
14 Astapor 1 (0, 1]
15 Winterfell 1 (0, 1]
16 King's Landing 1 (0, 1]
17 King's Landing 0 NaN
18 King's Landing 6 (3, 125]
19 King's Landing 0 NaN
20 Winterfell 1 (0, 1]
21 Astapor 1 (0, 1]
22 Winterfell 0 NaN
23 King's Landing 0 NaN
24 Astapor 4 (3, 125]
25 Winterfell 1 (0, 1]
26 Astapor 1 (0, 1]
27 Winterfell 3 (2, 3]
28 Winterfell 0 NaN
29 Astapor 1 (0, 1]
... ... ... ...
49970 Winterfell 2 (1, 2]
49971 King's Landing 0 NaN
49972 Winterfell 1 (0, 1]
49973 Astapor 2 (1, 2]
49974 Winterfell 1 (0, 1]
49975 Winterfell 11 (3, 125]
49976 King's Landing 0 NaN
49977 Astapor 4 (3, 125]
49978 Winterfell 1 (0, 1]
49979 Winterfell 0 NaN
49980 Astapor 1 (0, 1]
49981 Astapor 0 NaN
49982 King's Landing 0 NaN
49983 Winterfell 1 (0, 1]
49984 Winterfell 1 (0, 1]
49985 Astapor 1 (0, 1]
49986 Winterfell 0 NaN
49987 Winterfell 3 (2, 3]
49988 King's Landing 1 (0, 1]
49989 Winterfell 1 (0, 1]
49990 Astapor 1 (0, 1]
49991 Winterfell 0 NaN
49992 King's Landing 1 (0, 1]
49993 Astapor 3 (2, 3]
49994 Astapor 1 (0, 1]
49995 King's Landing 0 NaN
49996 Astapor 1 (0, 1]
49997 Winterfell 0 NaN
49998 Astapor 2 (1, 2]
49999 Astapor 0 NaN
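df['bins'] is a categorical column I created by using pd.cut to place trips_in_first_30_days into different bins.
I would now like to know, when grouping by city, what percentage of trips_in_first_30_days falls into each bin.
For example, for the city Astapor, what percentage of trips_in_first_30_days falls in (0, 1], what percentage in (1, 2], and so on?
Is this possible, given that bins has dtype category and arithmetic cannot be performed on it directly? If so, how?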
Edit:
When I tried the suggested solution, the output was as follows:
bins (0, 1] (1, 2] (2, 3] (3, 125]
city
Astapor 31.105601 14.787710 6.973509 14.878432
King's Landing 22.408687 14.471866 7.541955 20.710760
Winterfell 28.689578 14.959719 8.017655 20.371957
The percentages for each city should sum to 100, but they do not.
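A possible explanation, offered as an assumption because the code that produced this table is not shown: rows with 0 trips fall outside every bin and get NaN. If the counts are then divided by the total number of rows per city, the NaN rows stay in the denominator and the row sums fall short of 100. A minimal sketch with made-up data (the bin edges are inferred from the labels above):

```python
import pandas as pd

# Made-up sample: two of the six values are 0 and fall outside every bin.
trips = pd.Series([0, 0, 1, 2, 3, 4])

# Bin edges inferred from the labels (0, 1], (1, 2], (2, 3], (3, 125].
bins = pd.cut(trips, bins=[0, 1, 2, 3, 125])

# value_counts() ignores NaN by default, so only 4 rows are counted.
counts = bins.value_counts()

# Dividing by ALL rows keeps the NaN rows in the denominator.
pct_all_rows = counts / len(bins) * 100

# Dividing by the binned rows only makes the percentages sum to 100.
pct_binned = counts / bins.notna().sum() * 100

print(pct_all_rows.sum())
print(pct_binned.sum())
```

Which denominator is right depends on whether the zero-trip rows should count toward the total.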
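To do this, remember that the function passed to apply on a groupby can return a pd.Series object (the pandas documentation calls this flexible apply).
The approach does the job in two steps: first it splits the data by city, then it computes the percentage of each bin within each city.
The result should be a table with cities as rows and bins as columns.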
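A minimal sketch of this idea, not the original answer's code. The small DataFrame is made up, the bin edges are inferred from the labels in the question, and bin_percentages is a hypothetical helper name:

```python
import pandas as pd

# Made-up sample with the same columns as in the question.
df = pd.DataFrame({
    "city": ["King's Landing", "Astapor", "Astapor", "King's Landing",
             "Winterfell", "Winterfell", "Astapor", "Winterfell"],
    "trips_in_first_30_days": [4, 0, 3, 9, 14, 2, 1, 2],
})

# Bin edges inferred from the labels (0, 1], (1, 2], (2, 3], (3, 125].
df["bins"] = pd.cut(df["trips_in_first_30_days"], bins=[0, 1, 2, 3, 125])


def bin_percentages(group):
    """Return the share of each bin within one city, as percentages."""
    # normalize=True divides by the number of non-NaN rows in the group.
    return group.value_counts(normalize=True) * 100


# Step 1: split by city. Step 2: compute per-bin percentages per city.
# apply() returns a Series indexed by (city, bin); unstack() pivots the
# bins into columns.
result = df.groupby("city")["bins"].apply(bin_percentages).unstack()

print(result)
```

Because value_counts drops NaN by default, each row of this table sums to 100. To keep the zero-trip rows in the denominator instead, divide group.value_counts() by len(group).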