I am building a DataFrame and trying to count how many "?" values it contains. Part of the CSV:
age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,class
25, Private,226802, 11th,7, Never-married, Machine-op-inspct, Own-child, Black, Male,0,0,40, United-States, <=50K
38, Private,89814, HS-grad,9, Married-civ-spouse, Farming-fishing, Husband, White, Male,0,0,50, United-States, <=50K
28, Local-gov,336951, Assoc-acdm,12, Married-civ-spouse, Protective-serv, Husband, White, Male,0,0,40, United-States, >50K
44, Private,160323, Some-college,10, Married-civ-spouse, Machine-op-inspct, Husband, Black, Male,7688,0,40, United-States, >50K
18, ?,103497, Some-college,10, Never-married, ?, Own-child, White, Female,0,0,30, United-States, <=50K
34, Private,198693, 10th,6, Never-married, Other-service, Not-in-family, White, Male,0,0,30, United-States, <=50K
29, ?,227026, HS-grad,9, Never-married, ?, Unmarried, Black, Male,0,0,40, United-States, <=50K
I am using
df.isin(['?']).sum(axis=0)
but it returns 0 for every column, even though the data contains "?".
How can I fix this?
Thanks
There is an extra space before each value, so the cells hold " ?" rather than "?".
One option is to strip the values ;-)
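A minimal sketch of the stripping approach, assuming the text columns of interest are workclass and occupation. The in-memory CSV is a hypothetical three-column excerpt of the sample above, used in place of the real file:

```python
import io
import pandas as pd

# Hypothetical excerpt of the sample data (same spacing after the commas).
csv_text = (
    "age,workclass,occupation\n"
    "25, Private, Machine-op-inspct\n"
    "18, ?, ?\n"
    "29, ?, ?\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Remove leading/trailing whitespace from the chosen text columns.
for col in ["workclass", "occupation"]:
    df[col] = df[col].str.strip()

# The exact match on "?" now finds the placeholder values.
counts = df.isin(["?"]).sum(axis=0)
print(counts)
```

Only the listed columns are stripped here; numeric columns such as age are left alone because the .str accessor applies to text data.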
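The problem is that in this table the "?" is stored with a leading space, so an exact match on "?" finds nothing. Try taking the space into account when matching. In general, I would recommend cleaning the relevant columns beforehand.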
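Two ways this could look in practice, as a sketch rather than the answerer's exact code: match the value with its leading space, or remove the space while loading via the skipinitialspace option of pandas.read_csv. The in-memory CSV is a hypothetical excerpt standing in for the real file:

```python
import io
import pandas as pd

# Hypothetical excerpt of the sample data (same spacing after the commas).
csv_text = (
    "age,workclass,occupation\n"
    "25, Private, Machine-op-inspct\n"
    "18, ?, ?\n"
    "29, ?, ?\n"
)

# Option 1: match the value exactly as stored, leading space included.
raw = pd.read_csv(io.StringIO(csv_text))
counts_raw = raw.isin([" ?"]).sum(axis=0)

# Option 2: skip the space after each delimiter while reading,
# so the cells hold "?" and the original call works unchanged.
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
counts_clean = clean.isin(["?"]).sum(axis=0)

print(counts_raw)
print(counts_clean)
```

Option 2 cleans every column at load time, which avoids repeating the space in each later comparison.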
To check, look at
df.iloc[6]['occupation']
and you will see the extra space.
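A small sketch of that check, assuming a hypothetical three-column excerpt of the seven sample rows. Printing the value through repr keeps the quotes, which makes the leading space easy to spot:

```python
import io
import pandas as pd

# Hypothetical excerpt: the seven sample rows, reduced to three columns.
csv_text = (
    "age,workclass,occupation\n"
    "25, Private, Machine-op-inspct\n"
    "38, Private, Farming-fishing\n"
    "28, Local-gov, Protective-serv\n"
    "44, Private, Machine-op-inspct\n"
    "18, ?, ?\n"
    "34, Private, Other-service\n"
    "29, ?, ?\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Row position 6 is the last sample row; its occupation is a placeholder.
value = df.iloc[6]["occupation"]
print(repr(value))
```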