str.replace() changes the data type

Posted 2024-09-28 16:58:30


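I have imported some data. Empty fields show up as nan. The column data types are a mix of float, string, object, and so on. I want to replace "na" with "N/A", and the replacement should be case-insensitive. To do this I used the following code from "python: better way to handle case sensitivities with df.replace":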

# replace NA w N/A
dfMSR = dfMSR.apply(lambda x: x.astype(str).str.replace(r'\bna\b', 'N/A', regex=True,case=False))

When I run the code above, the data type of every column changes to "object". This causes a number of problems, including the following:

a = dfMSR.copy()
a = a[['AppBaselineType', 'RvwBaselineTypeAction', 'RvwBaselineType']]
a['AppBaselineType'] = np.where(((a['RvwBaselineTypeAction'].isnull()) | 
                                 (a['RvwBaselineTypeAction'] == '-') | 
                                 (a['RvwBaselineTypeAction'] == 'N/A')), 
                                a['RvwBaselineType'], a['RvwBaselineTypeAction'])

The nan values are not replaced with the values from RvwBaselineType, because they have been turned into the literal text "nan".

a.describe() #provides the result:

       AppBaselineType RvwBaselineTypeAction RvwBaselineType
count              292                   292             292
unique               4                     4               4
top                nan                   nan        Existing
freq               251                   251             154

print(dfMSR['RvwBaselineTypeAction'].isnull().sum()) #provides the result:

0

# replacing isnull() with == 'nan' gives the desired output
a['AppBaselineType'] = np.where(((a['RvwBaselineTypeAction'] == 'nan') | 
                                 (a['RvwBaselineTypeAction'] == '-') | 
                                 (a['RvwBaselineTypeAction'] == 'N/A')), 
                                a['RvwBaselineType'], a['RvwBaselineTypeAction'])
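
As an alternative to comparing against the text 'nan', one option might be to turn that text back into real missing values so that isnull() works again. The following is only a minimal sketch: the frame is built by hand from the sample rows in this question (already in their "after astype(str)" state), and the helper name use_baseline is made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, in the state they are in after astype(str):
# missing values have become the literal text "nan".
a = pd.DataFrame({
    "RvwBaselineType": ["Existing", "Existing", "nan", "Existing", "Existing"],
    "RvwBaselineTypeAction": ["nan", "-", "nan", "N/A", "ABC"],
})

# Convert cells that are exactly the text "nan" back into real missing values.
a = a.replace("nan", np.nan)

action = a["RvwBaselineTypeAction"]
# True where the action is missing, "-" or "N/A" (hypothetical helper name).
use_baseline = action.isnull() | action.isin(["-", "N/A"])

a["AppBaselineType"] = np.where(use_baseline, a["RvwBaselineType"], action)
print(a)
```

Note that this restores the missing values but not the original numeric dtypes of other columns, so it treats the symptom rather than the cause.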

Ideally, I would like to run the replace without changing (losing) the original data types. Any suggestions?
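
One direction that might be worth trying is DataFrame.replace with regex=True, which skips the astype(str) step entirely. A rough sketch, assuming a small made-up frame (the column names amount, status and note are hypothetical and not from dfMSR); the inline flag (?i) is used here to make the pattern case-insensitive:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one numeric and two text columns.
df = pd.DataFrame({
    "amount": [1.5, np.nan, 3.0],
    "status": ["na", "NA", np.nan],
    "note": ["banana", "n/a", "Na"],
})

# The regex is only applied to string values; numeric columns and
# real missing values are left alone, so no column is cast to str.
cleaned = df.replace(r"(?i)\bna\b", "N/A", regex=True)

print(cleaned.dtypes)
print(cleaned)
```

In this sketch the amount column should stay float64 and the missing status value should remain a real NaN, so isnull() keeps working. Whether this fits the actual data depends on what the columns of dfMSR contain.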

#raw data:
RvwBaselineType RvwBaselineTypeAction   AppBaselineType
Existing        nan                     nan
Existing        -                       nan
nan             nan                     nan
Existing        N/A                     nan
Existing        ABC                     nan 

#desired output
RvwBaselineType RvwBaselineTypeAction   AppBaselineType
Existing        nan                     Existing        
Existing        -                       Existing        
nan             nan                     nan
ESDffogr        N/A                     ESDffogr        
Existing        ABC                     ABC


Can share a sample file if someone can tell me how to do this on SO. 
Thanks
