Dividing a Series by a scalar results in NaN/0

Posted 2024-05-03 06:08:51


I have a Series grouped by district -> crime category -> number of incidents:

PdDistrict  Category                   
BAYVIEW     ASSAULT                        8976
            BURGLARY                       2891
            DISORDERLY CONDUCT              207
            DRIVING UNDER THE INFLUENCE     188
            DRUG/NARCOTIC                  2061
                                           ... 
TENDERLOIN  STOLEN PROPERTY                 299
            TRESPASS                        665
            VANDALISM                      1710
            VEHICLE THEFT                   661
            WEAPON LAWS                     791
Name: IncidntNum, Length: 140, dtype: int64

My goal is to divide each value by a scalar, namely the total for its district.

I tried looping over the values of PdDistrict and running the following line:

series[district] = series[district] / sum(series[district])

If I run only series[district] / sum(series[district]), the output is as expected:

 Category
ASSAULT                        0.11434063
BURGLARY                       0.09323762
DISORDERLY CONDUCT             0.00427552
DRIVING UNDER THE INFLUENCE    0.00478544
DRUG/NARCOTIC                  0.05691535
DRUNKENNESS                    0.00596219
LARCENY/THEFT                  0.46712952
PROSTITUTION                   0.00027457
ROBBERY                        0.02753589
STOLEN PROPERTY                0.00917863
TRESPASS                       0.01247352
VANDALISM                      0.09335530
VEHICLE THEFT                  0.09884679
WEAPON LAWS                    0.01168902
Name: IncidntNum, dtype: float64

However, when I try to update that part of the existing Series by running series[district] = series[district] / sum(series[district]), I get:

 Category
ASSAULT                        0
BURGLARY                       0
DISORDERLY CONDUCT             0
DRIVING UNDER THE INFLUENCE    0
DRUG/NARCOTIC                  0
DRUNKENNESS                    0
LARCENY/THEFT                  0
PROSTITUTION                   0
ROBBERY                        0
STOLEN PROPERTY                0
TRESPASS                       0
VANDALISM                      0
VEHICLE THEFT                  0
WEAPON LAWS                    0
Name: IncidntNum, dtype: int64

This is not what I want. If I use .loc instead, I get only NaN rather than 0.

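I simply cannot work out what is going wrong, and every solution I have tried has failed. I think the underlying issue is that I do not fully understand how to work with a pandas Series.

I hope you can help me understand the problem.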

/米克尔


1 answer
User
#1 · posted 2024-05-03 06:08:51

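I believe you need the sum of the values for each first level of the MultiIndex. On older pandas this was written s.sum(level=0); that level argument was removed in pandas 2.0, so group by the first level instead:

s1 = s.groupby(level=0).sum()
print(s1)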
PdDistrict
BAYVIEW       14323
TENDERLOIN     4126
Name: IncidntNum, dtype: int64

Then divide with Series.div, matching on the first level, so that each value is divided by the sum for its PdDistrict:

s2 = s.div(s1, level=0)
print (s2)
PdDistrict  Category                   
BAYVIEW     ASSAULT                        0.626684
            BURGLARY                       0.201843
            DISORDERLY CONDUCT             0.014452
            DRIVING UNDER THE INFLUENCE    0.013126
            DRUG/NARCOTIC                  0.143894
TENDERLOIN  STOLEN PROPERTY                0.072467
            TRESPASS                       0.161173
            VANDALISM                      0.414445
            VEHICLE THEFT                  0.160204
            WEAPON LAWS                    0.191711
Name: IncidntNum, dtype: float64
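
For reference, the two steps above can be put together as a self-contained sketch. It rebuilds only the ten rows shown in the answer's output (not the full 140-row Series), and the names totals, shares and shares_alt are illustrative, not from the original post. The transform("sum") line is an alternative spelling added here for comparison; it is assumed to be equivalent for this data rather than taken from the answer.

```python
import pandas as pd

# Ten sample rows copied from the output above (two districts only).
index = pd.MultiIndex.from_tuples(
    [
        ("BAYVIEW", "ASSAULT"),
        ("BAYVIEW", "BURGLARY"),
        ("BAYVIEW", "DISORDERLY CONDUCT"),
        ("BAYVIEW", "DRIVING UNDER THE INFLUENCE"),
        ("BAYVIEW", "DRUG/NARCOTIC"),
        ("TENDERLOIN", "STOLEN PROPERTY"),
        ("TENDERLOIN", "TRESPASS"),
        ("TENDERLOIN", "VANDALISM"),
        ("TENDERLOIN", "VEHICLE THEFT"),
        ("TENDERLOIN", "WEAPON LAWS"),
    ],
    names=["PdDistrict", "Category"],
)
s = pd.Series(
    [8976, 2891, 207, 188, 2061, 299, 665, 1710, 661, 791],
    index=index,
    name="IncidntNum",
)

# Step 1: total per district, indexed by PdDistrict only.
totals = s.groupby(level="PdDistrict").sum()

# Step 2: divide each value by its district total. level= tells pandas
# which MultiIndex level to match the totals against.
shares = s.div(totals, level="PdDistrict")

# Alternative spelling: transform("sum") repeats the district total on
# every row, so both operands already share the same MultiIndex.
shares_alt = s / s.groupby(level="PdDistrict").transform("sum")

print(shares.round(6))
```

Because the result is a new float Series rather than an in-place update of the integer one, no loop over districts and no slice assignment is needed.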
