sum() on the same column gives different results in Python

Posted 2024-09-29 00:16:36


Summary of what I did with the DataFrame

  1. I computed the sum of the 'damageDealt' column and got 581294667.8516002

train.damageDealt.sum()
# 581294667.8516002

train.damageDealt.shape
# (4446966,)
  2. I found a single NaN value, in the 'winPlacePerc' column

train.isnull().sum()
    Id                 0
    groupId            0
    matchId            0
    assists            0
    boosts             0
    damageDealt        0
    DBNOs              0
    headshotKills      0
    heals              0
    killPlace          0
    killPoints         0
    kills              0
    killStreaks        0
    longestKill        0
    matchDuration      0
    matchType          0
    maxPlace           0
    numGroups          0
    rankPoints         0
    revives            0
    rideDistance       0
    roadKills          0
    swimDistance       0
    teamKills          0
    vehicleDestroys    0
    walkDistance       0
    weaponsAcquired    0
    winPoints          0
    winPlacePerc       1
    dtype: int64

  3. In the row that contains the NaN, the value of the 'damageDealt' column is 0.0

train[train.winPlacePerc.isnull()].damageDealt
# 2744604    0.0
# Name: damageDealt, dtype: float64

  4. I removed that row with dropna()

train2 = train.copy()
train2.dropna(inplace=True)
train2[train2.winPlacePerc.isnull()].damageDealt
# Series([], Name: damageDealt, dtype: float64)

  5. The sum of the column changed to 581294667.8516004 ...! It even went up

train2.damageDealt.sum()
# 581294667.8516004

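So I don't understand how this result comes about when the only row removed has 0.0 in the damageDealt column. It would be very helpful if someone could explain this. Thanks in advance.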
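
For what it's worth, here is a minimal sketch of one way a difference in the last digit can appear. It assumes the cause is floating-point rounding: addition of floats is not associative, so if the library adds the values in blocks or pairs (NumPy's documentation describes pairwise summation for its float sums), dropping one row shifts how the remaining values are grouped, even when the dropped value is 0.0. Whether that is what actually happened with `train` is not confirmed. `pairwise_sum` is a hypothetical helper written for illustration, not a pandas or NumPy function.

```python
def pairwise_sum(values):
    """Hypothetical helper: sum a list by recursively splitting it in half."""
    if len(values) == 0:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])


# Same values in the same order; the only difference is one 0.0 entry.
with_zero = pairwise_sum([0.1, 0.2, 0.0, 0.3])  # grouped as (0.1 + 0.2) + (0.0 + 0.3)
without_zero = pairwise_sum([0.1, 0.2, 0.3])    # grouped as 0.1 + (0.2 + 0.3)

print(with_zero, without_zero)
print(with_zero == without_zero)  # the two groupings round differently
```

The two totals differ only around the 16th significant digit, which is the same scale as the gap between 581294667.8516002 and 581294667.8516004. For comparisons that should ignore this kind of noise, a tolerance check such as `math.isclose` is usually safer than `==`.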

