Calculating a percentage for each row in a pandas DataFrame

               country_name  country_code  val_code  \
   United States of America           231         1
   United States of America           231         2
   United States of America           231         3
   United States of America           231         4
   United States of America           231         5

       y191      y192      y193      y194      y195  \
   47052179  43361966  42736682  43196916  41751928
    1187385   1201557   1172941   1176366   1192173
   28211467  27668273  29742374  27543836  28104317
     179000    193000    233338    276639    249688
   12613922  12864425  13240395  14106139  15642337

2 answers

User

Answer 1 · edited 2024-06-04 02:17:38

Get the grand total over all the columns of interest, then add a percentage column:

In [35]:
# .ix was removed in pandas 1.0; .loc performs the same label-based slice
total = np.sum(df.loc[:, 'y191':].values)
df['percent'] = df.loc[:, 'y191':].sum(axis=1) / total * 100
df

Out[35]:
               country_name  country_code  val_code      y191      y192  \
0  United States of America           231         1  47052179  43361966   
1  United States of America           231         1   1187385   1201557   
2  United States of America           231         1  28211467  27668273   
3  United States of America           231         1    179000    193000   
4  United States of America           231         1  12613922  12864425   

       y193      y194      y195    percent  
0  42736682  43196916  41751928  50.149471  
1   1172941   1176366   1192173   1.363631  
2  29742374  27543836  28104317  32.483447  
3    233338    276639    249688   0.260213  
4  13240395  14106139  15642337  15.743237

Here np.sum adds up every value in the selected columns, giving a single grand total:

In [32]:
# .loc replaces the removed .ix indexer
total = np.sum(df.loc[:, 'y191':].values)
total

Out[32]:
434899243

We then call .sum(axis=1) on the columns of interest to sum each row, divide by the grand total, and multiply by 100 to get a percentage.
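For readers who want to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the values posted in the question. The names `year_cols` and `total` are only illustrative, and the slice is bounded at `'y195'` (an assumption about which columns count) so that re-running the cell does not pull the new `percent` column into the total.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country_name": ["United States of America"] * 5,
    "country_code": [231] * 5,
    "val_code": [1, 2, 3, 4, 5],
    "y191": [47052179, 1187385, 28211467, 179000, 12613922],
    "y192": [43361966, 1201557, 27668273, 193000, 12864425],
    "y193": [42736682, 1172941, 29742374, 233338, 13240395],
    "y194": [43196916, 1176366, 27543836, 276639, 14106139],
    "y195": [41751928, 1192173, 28104317, 249688, 15642337],
})

# Label-based slice; with .loc both endpoints are inclusive.
year_cols = df.loc[:, "y191":"y195"]

# Grand total over every year cell (a single scalar).
total = year_cols.to_numpy().sum()

# Row sums divided by the grand total, scaled to percent.
df["percent"] = year_cols.sum(axis=1) / total * 100
print(df[["val_code", "percent"]])
```

Because every row is divided by the same grand total, the `percent` column should add up to 100.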

User

Answer 2 · edited 2024-06-04 02:17:38

A lambda function gives each value's share of its column total (as a fraction; multiply by 100 for a percentage):

>>> df.iloc[:, 3:].apply(lambda x: x / x.sum())
       y191      y192      y193      y194      y195
0  0.527231  0.508411  0.490517  0.500544  0.480236
1  0.013305  0.014088  0.013463  0.013631  0.013713
2  0.316116  0.324405  0.341373  0.319164  0.323259
3  0.002006  0.002263  0.002678  0.003206  0.002872
4  0.141342  0.150833  0.151969  0.163455  0.179920
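The same column-wise shares can probably be obtained without `apply`, since dividing a DataFrame by the Series of its column sums aligns on column labels. The sketch below is one way to check that, using only the year columns from the question; `years`, `via_apply`, and `via_div` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Year columns copied from the question.
years = pd.DataFrame({
    "y191": [47052179, 1187385, 28211467, 179000, 12613922],
    "y192": [43361966, 1201557, 27668273, 193000, 12864425],
    "y193": [42736682, 1172941, 29742374, 233338, 13240395],
    "y194": [43196916, 1176366, 27543836, 276639, 14106139],
    "y195": [41751928, 1192173, 28104317, 249688, 15642337],
})

# Lambda version from the answer: each column divided by its own sum.
via_apply = years.apply(lambda col: col / col.sum())

# Vectorized version: years.sum() is a Series indexed by column name,
# so the division lines up column by column.
via_div = years / years.sum()

# Scale the fractions to percentages for display.
print((via_div * 100).round(2))
```

Each column of the result should sum to 1 (or 100 after scaling).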

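Your example has no duplicate val_code values, so I'm not sure how you want the data displayed (i.e., each value as a percentage of its column total versus as a percentage of the total for its val_code group).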
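To make the two readings concrete, here is a small sketch. The question's data has no repeated `val_code`, so the frame below is hypothetical toy data invented purely for illustration; `toy`, `pct_of_column`, and `pct_of_group` are likewise made-up names.

```python
import pandas as pd

# Hypothetical toy data (NOT from the question): val_code repeats here.
toy = pd.DataFrame({
    "val_code": [1, 1, 2, 2],
    "y191": [10, 30, 20, 20],
    "y192": [5, 15, 40, 40],
})
year_cols = ["y191", "y192"]

# Reading A: each cell as a percent of its whole column.
pct_of_column = toy[year_cols] / toy[year_cols].sum() * 100

# Reading B: each cell as a percent of its val_code group's column total.
# transform("sum") returns the group totals broadcast back to the original rows.
group_totals = toy.groupby("val_code")[year_cols].transform("sum")
pct_of_group = toy[year_cols] / group_totals * 100

print(pct_of_column)
print(pct_of_group)
```

Under reading A every column sums to 100 overall. Under reading B every column sums to 100 within each `val_code` group.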
