Checking for np.nan inside a function applied to a DataFrame

Posted 2024-09-29 23:20:43


This is my code:

import numpy as np
import pandas as pd

df_i2b2 = pd.DataFrame({'id':[1,2,3,4],
                        'DIAGNOSIS_CODES':["338.29; 353.6; 355.9; 722.6; 724.2; E43",
                               "278.00; 300.00; 305.1; 353.6",
                               "E66.9; F32.9; F41.9; J96.10; S06",
                               np.nan]})


def diag_TBI(my_str):
    if my_str.isna():
        return np.nan
    else:
        return 1



df_i2b2['TBI_var'] = df_i2b2['DIAGNOSIS_CODES'].apply(diag_TBI)
df_i2b2['TBI_var'].value_counts(dropna = False)

Output:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
c:\Arch\Work\Register\a_Code\Snippets\i2b2test_NA_SO_Question.py in <module>
     18 
     19 
---> 20 df_i2b2['TBI_var'] = df_i2b2['DIAGNOSIS_CODES'].apply(diag_TBI)
     21 df_i2b2['TBI_var'].value_counts(dropna = False)
     22 

~\Anaconda3\envs\lake_reg2\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
   3846             else:
   3847                 values = self.astype(object).values
-> 3848                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   3849 
   3850         if len(mapped) and isinstance(mapped[0], Series):

pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

c:\Arch\Work\Register\a_Code\Snippets\i2b2test_NA_SO_Question.py in diag_TBI(my_str)
     11 
     12 def diag_TBI(my_str):
---> 13     if my_str.isna():
     14         return np.nan
     15     else:

AttributeError: 'str' object has no attribute 'isna'



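Question: how can I check for np.nan inside a function that is applied to a DataFrame column? How should I change `if my_str.isna():` so that the code works?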


3 answers
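def diag_TBI(my_str):
    # pd.isna() accepts scalars, so it handles both the strings and the
    # float np.nan in this column; comparing against the string "nan"
    # would never match a real np.nan.
    if pd.isna(my_str):
        return np.nan
    return 1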

@Celiussting's answer is better suited to this task, but you can also check for NaN inside the function with the following trick:

def diag_TBI(my_str):
    if my_str==my_str:
        return 1
    return np.nan

df_i2b2['DIAGNOSIS_CODES'].apply(diag_TBI)

Output:

0    1.0
1    1.0
2    1.0
3    NaN
Name: DIAGNOSIS_CODES, dtype: float64

Note: in Python, NaN == NaN evaluates to False, so my_str == my_str is False only when my_str is NaN.
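
The trick relies on floating-point semantics: NaN compares unequal to every value, including itself. Here is a minimal sketch of that behaviour. The helper name `is_nan_by_self_comparison` is made up for illustration and is not part of the answer above.

```python
import numpy as np

def is_nan_by_self_comparison(value):
    """Return True only when value does not equal itself, which holds for NaN."""
    return value != value

# np.nan is an ordinary Python float NaN, so it is never equal to itself.
nan_check = is_nan_by_self_comparison(np.nan)

# A regular string (like the diagnosis codes) always equals itself.
str_check = is_nan_by_self_comparison("353.6; E43")
```

Because ordinary strings always equal themselves, the check only fires for the missing entries in a column that mixes strings and NaN.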

You can solve this easily in a vectorized way with np.where(), without needing apply at all:

df_i2b2['TBI_var'] = np.where(df_i2b2['DIAGNOSIS_CODES'].isna(),np.nan,1)

This returns:

   id                          DIAGNOSIS_CODES  TBI_var
0   1  338.29; 353.6; 355.9; 722.6; 724.2; E43      1.0
1   2             278.00; 300.00; 305.1; 353.6      1.0
2   3         E66.9; F32.9; F41.9; J96.10; S06      1.0
3   4                                      NaN      NaN
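
As a rough cross-check, the vectorized result can be compared with an element-wise version built on `pd.isna`, which accepts scalars as well as Series. This is only a sketch: the shortened `codes` Series and the helper `flag_present` are assumptions made for illustration and do not come from the answers above.

```python
import numpy as np
import pandas as pd

# Hypothetical, shortened stand-in for the DIAGNOSIS_CODES column.
codes = pd.Series(["338.29; 353.6", "278.00; 300.00", np.nan])

def flag_present(value):
    # pd.isna() works on a single scalar, unlike the Series method .isna().
    return np.nan if pd.isna(value) else 1

# Vectorized: NaN where the code is missing, 1 otherwise.
vectorized = pd.Series(np.where(codes.isna(), np.nan, 1), index=codes.index)

# Element-wise: same rule, applied one value at a time.
elementwise = codes.apply(flag_present)

# Series.equals treats NaNs in the same positions as equal.
same = vectorized.equals(elementwise)
```

Both routes should mark the same rows as missing. The vectorized form avoids a Python-level function call per row, which tends to matter on large frames.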
