My dataset has the following missing values:
print(train.shape)
(54808, 6)
employee_id 0
name 0
education 2409
age 0
Salary_hike 4124
length_of_service 0
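I want to fill the missing Salary_hike values with 0, but only in rows where length_of_service is at most 1 (the filter I use below is <= 1).
For example: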
import numpy as np
import pandas as pd

train = pd.DataFrame({'employee_id':[103,101,103,104,105,106,107,108,109,110],
                      'Name':['A','B','C','D','E','F','G','H','I','J'],
                      'Age':[20,30,21,24,25,22,27,23,24,21],
                      'length_of_service':[1,2,1,4,5,1,7,1,2,1],
                      'Salary_hike':[np.nan,5,np.nan,6,7,1,9,1,4,np.nan],
                      })
I have already checked how many rows have a length of service of at most one:
(train['length_of_service']<= 1).sum()
5
Next, I filtered the DataFrame with the following two conditions:
train[(train.length_of_service <=1) & (train['Salary_hike'].isnull())]
employee_id Name Age length_of_service Salary_hike
0 103 A 20 1 NaN
2 103 C 21 1 NaN
9 110 J 21 1 NaN
Now, how do I fill the missing Salary_hike values in the filtered rows above with 0? The expected result is:
employee_id Name Age length_of_service Salary_hike
0 103 A 20 1 0
2 103 C 21 1 0
9 110 J 21 1 0
I used the command suggested in the comments:
train.loc[(train.length_of_service==-1) & (train['Salary_hike'].isnull()),'Salary_hike'] = 0
But I still get 3 missing values in Salary_hike:
train.isnull().sum()
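One way to see why that assignment changes nothing is to count how many rows each mask selects. The following is a minimal sketch on the example DataFrame from above; the names mask_eq and mask_le are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Example data from the question
train = pd.DataFrame({'employee_id': [103, 101, 103, 104, 105, 106, 107, 108, 109, 110],
                      'Name': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                      'Age': [20, 30, 21, 24, 25, 22, 27, 23, 24, 21],
                      'length_of_service': [1, 2, 1, 4, 5, 1, 7, 1, 2, 1],
                      'Salary_hike': [np.nan, 5, np.nan, 6, 7, 1, 9, 1, 4, np.nan]})

# Mask from the failing command: no row has length_of_service equal to -1
mask_eq = (train['length_of_service'] == -1) & train['Salary_hike'].isnull()

# Mask with the comparison used in the earlier filter
mask_le = (train['length_of_service'] <= 1) & train['Salary_hike'].isnull()

print(mask_eq.sum())  # 0 rows selected, so the assignment has nothing to write to
print(mask_le.sum())  # 3 rows selected
```

An all-False mask makes the .loc assignment a silent no-op, which matches the 3 missing values that remain.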
Hi all,
Thank you for your valuable comments.
It works now with the following command:
train.loc[(train.length_of_service <=1) & (train['Salary_hike'].isnull()),['Salary_hike']]=0
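For reference, here is a self-contained sketch of that fix applied to the example DataFrame, followed by a check of the result. The variable name cond is an illustrative choice, not part of the original command.

```python
import numpy as np
import pandas as pd

# Example data from the question
train = pd.DataFrame({'employee_id': [103, 101, 103, 104, 105, 106, 107, 108, 109, 110],
                      'Name': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                      'Age': [20, 30, 21, 24, 25, 22, 27, 23, 24, 21],
                      'length_of_service': [1, 2, 1, 4, 5, 1, 7, 1, 2, 1],
                      'Salary_hike': [np.nan, 5, np.nan, 6, 7, 1, 9, 1, 4, np.nan]})

# Rows with short service AND a missing salary hike
cond = (train['length_of_service'] <= 1) & train['Salary_hike'].isnull()

# Assign 0 only to Salary_hike in the selected rows
train.loc[cond, 'Salary_hike'] = 0

# Verify: rows 0, 2 and 9 are now 0, and no NaN is left in this small example
print(train.loc[[0, 2, 9], 'Salary_hike'].tolist())  # [0.0, 0.0, 0.0]
print(train['Salary_hike'].isnull().sum())           # 0
```

On the full dataset, missing Salary_hike values in rows where length_of_service is greater than 1 would stay NaN, since the mask deliberately excludes them.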
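I believe you need a single .loc assignment whose mask combines both conditions.
If the value is -1, you need to set the condition to match it.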