How do I make a pandas crosstab with percentages?

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 6,
                   'B': ['A', 'B', 'C'] * 8,
                   'C': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
                   'D': np.random.randn(24),
                   'E': np.random.randn(24)})

pd.crosstab(df.A, df.B)

B      A  B  C
A
one    4  4  4
three  2  2  2
two    2  2  2

3 answers

User

#1 · edited 2024-10-05 15:18:21

We can multiply by 100 to show the values as percentages:

pd.crosstab(df.A,df.B, normalize='index')\
    .round(4)*100

B          A      B      C
A                         
one    33.33  33.33  33.33
three  33.33  33.33  33.33
two    33.33  33.33  33.33

I rounded the values for convenience.
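As a variation on the snippet above, the scaling can be done before the rounding, so the number of decimals is stated directly in percent units. This is only a sketch: the frame below is a cut-down, deterministic copy of columns A and B from the question, and the name `pct` is made up for illustration.

```python
import pandas as pd

# Deterministic copy of columns A and B from the question (D and E are not needed here).
df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 6,
                   'B': ['A', 'B', 'C'] * 8})

# Row-normalize, scale to percent, then round to two decimals.
pct = pd.crosstab(df.A, df.B, normalize='index').mul(100).round(2)
print(pct)
```

Whether to round before or after multiplying is mostly a matter of taste; rounding last just makes the `round` argument match the displayed precision.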

User

#2 · edited 2024-10-05 15:18:21

pd.crosstab(df.A, df.B).apply(lambda r: r/r.sum(), axis=1)

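Basically, you write a function that computes row / row.sum(), then use apply with axis=1 to run it on each row.

(If you are doing this in Python 2, add from __future__ import division so that division always returns a float.)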
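The row-wise apply can also be written without a lambda by dividing the table by its row totals. The sketch below compares the two forms on a deterministic copy of columns A and B from the question; the names `counts`, `by_apply` and `by_div` are illustrative and not part of the original answer.

```python
import pandas as pd

# Deterministic copy of columns A and B from the question.
df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 6,
                   'B': ['A', 'B', 'C'] * 8})

counts = pd.crosstab(df.A, df.B)

# Row-wise apply, as in the answer: each row divided by its own sum.
by_apply = counts.apply(lambda r: r / r.sum(), axis=1)

# Vectorized form: divide every row by the matching row total.
by_div = counts.div(counts.sum(axis=1), axis=0)

print(by_div)
```

Both forms give row proportions that sum to 1; the `div` version avoids calling a Python function once per row, which may matter for large tables.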

User

#3 · edited 2024-10-05 15:18:21

Since pandas 0.18.1, crosstab has a normalize option:

In [1]: pd.crosstab(df.A,df.B, normalize='index')
Out[1]:

B              A           B           C
A           
one     0.333333    0.333333    0.333333
three   0.333333    0.333333    0.333333
two     0.333333    0.333333    0.333333

You can normalize over 'all', 'index' (rows), or 'columns'.

See the documentation for more details.
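To make the difference between the three modes concrete, here is a short sketch that runs each one on a deterministic copy of columns A and B from the question. The variable names `by_row`, `by_col` and `overall` are chosen here for illustration.

```python
import pandas as pd

# Deterministic copy of columns A and B from the question.
df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 6,
                   'B': ['A', 'B', 'C'] * 8})

by_row = pd.crosstab(df.A, df.B, normalize='index')    # each row sums to 1
by_col = pd.crosstab(df.A, df.B, normalize='columns')  # each column sums to 1
overall = pd.crosstab(df.A, df.B, normalize='all')     # the whole table sums to 1

print(by_row)
print(by_col)
print(overall)
```

With this data, 'one' accounts for 4 of the 8 rows in each B column, so its entries are 0.5 under normalize='columns', while under normalize='all' each 'one' cell is 4 out of 24 rows in total.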
