Faster way to replace -1 and 0 in columns with NaN for large datasets


`azdias` is a DataFrame holding my main dataset. Its metadata (a summary of its features) lives in a second DataFrame, `feat_info`. `feat_info` lists, for each column, the values that should be treated as NaN. For example, column1 has the values [-1,0] marked as NaN values, so my task is to find those -1 and 0 entries in column1 and replace them with NaN.

azdias DataFrame: <a href="https://i.stack.imgur.com/l0RHc.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/l0RHc.png" alt="enter image description here"/></a>

feat_info DataFrame: <a href="https://i.stack.imgur.com/LjryJ.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/LjryJ.png" alt="enter image description here"/></a>

I tried the following in a Jupyter notebook:

<pre><code>import numpy as np

def NAFunc(x, miss_unknown_list):
    # Return NaN if x equals any of the column's missing/unknown codes.
    x_output = x
    for i in miss_unknown_list:
        try:
            miss_unknown_value = float(i)
        except ValueError:
            # Non-numeric codes are compared as-is.
            miss_unknown_value = i
        if x == miss_unknown_value:
            x_output = np.nan
            break
    return x_output

# Apply the check value by value, one column at a time (this is the slow part).
for cols in azdias.columns.tolist():
    NAList = feat_info[feat_info.attribute == cols]['missing_or_unknown'].values[0]
    azdias[cols] = azdias[cols].apply(lambda x: NAFunc(x, NAList))
</code></pre>

Question 1: I am trying to set these values to NaN, but my code is very slow. I would like to speed up execution.

I have attached samples of both DataFrames.

azdias sample:

<pre><code>   AGER_TYP  ALTERSKATEGORIE_GROB  ANREDE_KZ  CJT_GESAMTTYP  FINANZ_MINIMALIST
0        -1                     2          1            2.0                  3
1        -1                     1          2            5.0                  1
2        -1                     3          2            3.0                  1
3         2                     4          2            2.0                  4
4        -1                     3          1            5.0                  4
</code></pre>

feat_info sample:

<pre><code>attribute             information_level  type         missing_or_unknown
AGER_TYP              person             categorical  [-1,0]
ALTERSKATEGORIE_GROB  person             ordinal      [-1,0,9]
ANREDE_KZ             person             categorical  [-1,0]
CJT_GESAMTTYP         person             categorical  [0]
FINANZ_MINIMALIST     person             ordinal      [-1]
</code></pre>
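One possible way to speed this up is to drop the per-value `apply` and let pandas compare a whole column at once with `Series.isin` and `Series.mask`. The sketch below is not a confirmed solution for the full dataset. It assumes that `missing_or_unknown` is stored either as a string such as `"[-1,0]"` or as an actual list, and the helper names `parse_codes` and `mask_missing` are made up for illustration. The two small frames are stand-ins built from the samples in the question.

```python
import numpy as np
import pandas as pd

# Stand-in data copied from the samples in the question.
azdias = pd.DataFrame({
    "AGER_TYP": [-1, -1, -1, 2, -1],
    "ALTERSKATEGORIE_GROB": [2, 1, 3, 4, 3],
    "ANREDE_KZ": [1, 2, 2, 2, 1],
    "CJT_GESAMTTYP": [2.0, 5.0, 3.0, 2.0, 5.0],
    "FINANZ_MINIMALIST": [3, 1, 1, 4, 4],
})
feat_info = pd.DataFrame({
    "attribute": ["AGER_TYP", "ALTERSKATEGORIE_GROB", "ANREDE_KZ",
                  "CJT_GESAMTTYP", "FINANZ_MINIMALIST"],
    # Assumption: the codes are stored as strings like "[-1,0]".
    "missing_or_unknown": ["[-1,0]", "[-1,0,9]", "[-1,0]", "[0]", "[-1]"],
})


def parse_codes(raw):
    """Hypothetical helper: turn "[-1,0]" (or a list) into a list of codes."""
    if isinstance(raw, str):
        tokens = [t.strip() for t in raw.strip("[]").split(",") if t.strip()]
    else:
        tokens = list(raw)
    codes = []
    for tok in tokens:
        try:
            codes.append(float(tok))   # numeric codes, as in the question
        except (TypeError, ValueError):
            codes.append(tok)          # keep non-numeric codes unchanged
    return codes


def mask_missing(df, codes_by_column):
    """Hypothetical helper: return a copy of df with the listed codes set to NaN."""
    out = df.copy()
    for col, codes in codes_by_column.items():
        if col in out.columns and codes:
            # isin() tests the whole column at once; mask() puts NaN where True.
            out[col] = out[col].mask(out[col].isin(codes))
    return out


# Parse the codes once per column instead of once per cell.
missing_codes = {
    attr: parse_codes(raw)
    for attr, raw in zip(feat_info["attribute"], feat_info["missing_or_unknown"])
}
cleaned = mask_missing(azdias, missing_codes)
print(cleaned)
```

The loop still runs once per column, but the work inside each column is vectorized, which is usually where most of the time goes. Integer columns that receive NaN are upcast to float, which is standard pandas behaviour.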

