How to detect NaN values among strings in Python

Posted 2024-09-29 07:35:34


I am using pandas to open a text file, as shown below.

import pandas as pd
# quoting=3 is csv.QUOTE_NONE: quote characters are treated as ordinary text
input_data = pd.read_csv('input.tsv', header=0, delimiter="\t", quoting=3)
L = input_data["title"] + '. ' + input_data["description"]

I found that some of my texts are equal to nan, so I tried the following.

^{pr2}$

However, this raised the following error: TypeError: must be real number, not str

Is there a way to identify nan values among strings in Python?

My tsv looks like this:

id  title   description major   minor
27743058    Partial or total open meniscectomy? : A prospective, randomized study.  In order to compare partial with total meniscectomy a prospective clinical study of 200 patients was carried out. At arthrotomy 100 patients were allocated to each type of operation. The two groups did not differ in duration of symptoms, age distribution, or sex ratio. The operations were performed as conventional arthrotomies. One hundred and ninety two of the patients were seen at follow up 2 and 12 months after operation. There was no difference in the period off work between the two groups. One year after operation, 6 of the 98 patients treated with partial meniscectomy had undergone further operation. In all posterior tears were found at both procedures. Among the 94 patients undergoing total meniscectomy, 4 required further operation. In each, part of the posterior horn had been left at the primary procedure. One year after operation significantly more patients who had undergone partial meniscectomy had been relieved of symptoms. However, the two groups did not show any difference in the degree of radiological changes present.    ### ###
27743057        Synovial oedema is a frequent complication in arthroscopic procedures performed with normal saline as the irrigating fluid. The authors have studied the effect of saline solution, Ringer lactate, 5% Dextran and 10% Dextran in normal saline on 12 specimens of human synovial membrane. They found that 10% Dextran in normal saline decreases the water content of the synovium without causing damage, and recommend this solution for procedures lasting longer than 30 minutes. ### ###

2 Answers

Your DataFrame is hard to reproduce, so here is an example df:

import numpy as np
import pandas as pd

# Column names match the output shown below (a, b, c)
df = pd.DataFrame([["11", "1", np.nan], [np.nan, "1", "2"], ['abc', 'def', 'ijk']],
                  columns=["a", "b", "c"])
>>> df

    a   b   c
0   11  1   NaN
1   NaN 1   2
2   abc def ijk

From the docs: df.dropna()

^{pr2}$

This returns only the rows that have no nan in any column. Output:

    a   b   c
2   abc def ijk

To keep only the columns that contain no nan:

df.dropna(axis=1)

    b
0   1
1   1
2   def

To find the rows that contain nan:

df_nan = df.drop(list(df.dropna().index))

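Also, take a look at the how= parameter of dropna(), which lets you drop rows/columns along the chosen axis when any or all of their values are NA.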
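As a minimal sketch of both ideas, assuming a small example frame like the one above: a boolean mask built with isna() selects the rows that contain nan directly, and how= switches between dropping on "any" and on "all" missing values. The variable names here are only illustrative.

```python
import numpy as np
import pandas as pd

# Small example frame with NaN mixed in among strings
df = pd.DataFrame(
    [["11", "1", np.nan], [np.nan, "1", "2"], ["abc", "def", "ijk"]],
    columns=["a", "b", "c"],
)

# Rows that contain at least one NaN, selected with a boolean mask
rows_with_nan = df[df.isna().any(axis=1)]

# how="any" (the default) drops a row if any of its values is NaN
dropped_any = df.dropna(how="any")

# how="all" drops a row only if every one of its values is NaN
dropped_all = df.dropna(how="all")
```

In this frame no row is entirely nan, so how="all" keeps all three rows, while how="any" keeps only the last one.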

The first problem is that math.isnan() does not accept string values as input. You can see this by trying math.isnan('any string').
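A short sketch of the difference, assuming you need to test individual values that may be either strings or nan. The helper is_nan_value is a hypothetical name introduced only for this illustration; pd.isna is the pandas function that accepts arbitrary objects.

```python
import math

import numpy as np
import pandas as pd

# math.isnan only accepts real numbers, so a string raises TypeError
raised_type_error = False
try:
    math.isnan("any string")
except TypeError:
    raised_type_error = True

# pd.isna accepts any object: strings are simply "not missing"
string_is_missing = pd.isna("any string")
nan_is_missing = pd.isna(np.nan)


def is_nan_value(value):
    """Hypothetical helper: True only for a float NaN, never for a string."""
    return isinstance(value, float) and math.isnan(value)
```

The isinstance check is one way to keep using math.isnan safely without pandas; with pandas available, pd.isna is usually the simpler choice.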

Since you already have a pandas DataFrame, it is best to handle your case with pandas. For example:

df.dropna()           # drops rows that contain nan (axis=0 is the default)
df.dropna(axis=1)     # drops columns that contain nan

Note that dropna() has some very useful parameters, so check them in the docstring or the corresponding entry in the documentation.
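Applied to the question, one possible approach looks like the sketch below. The frame is a hypothetical stand-in for the tsv, with made-up short texts and one missing title (as in the second row of the sample data); subset= is a real dropna() parameter that limits the check to the named columns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the tsv: the second row has no title
input_data = pd.DataFrame({
    "title": ["First title", np.nan],
    "description": ["First description", "Second description"],
})

# String concatenation yields NaN whenever either operand is NaN
combined = input_data["title"] + ". " + input_data["description"]
missing_mask = combined.isna()

# Option 1: drop rows where title or description is missing
complete_rows = input_data.dropna(subset=["title", "description"])

# Option 2: replace missing values with empty strings before concatenating
filled = input_data.fillna("")
combined_filled = filled["title"] + ". " + filled["description"]
```

Which option fits depends on whether a record with only a description is still useful to you; the second keeps every row at the cost of a leading ". " on rows that had no title.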

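As a parting piece of advice: when working with pandas, keep in mind that whatever you want to do is usually easier to accomplish with native pandas functionality. pandas is the gold standard for this kind of work, so in general, whatever you want to do (if it makes sense), the pandas community has probably already thought of it (and implemented it).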
