Applying the same condition to multiple columns

Posted 2024-09-30 10:39:08


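I have a DataFrame with 15 separate ICD columns (ICD1 through ICD15). I want to create a 0/1 variable "Encep" that is 1 when the number "323" appears in any of the 15 ICD columns.

The DataFrame itself contains more than 30 variables and looks like this: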

PT_FIN    DATE     Address...     ICD1    ICD2...      ICD15
1         July      123 lane        523    432         .
2         August    ABC road        523    43.6       12.8

I'm not sure I'm on the right track, but I wrote the following code to try to do this and ran into an error:

Code

ICDA = ["ICD1","ICD2","ICD3","ICD4","ICD5","ICD6","ICD7","ICD8","ICD9","ICD10","ICD11","ICD12","ICD13","ICD14","ICD15"]

ICD1.loc[:,"Encep"]=np.where(ICD1["ICDA"].str.contains("323", case=False),1,0)

Error

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2889             try:
-> 2890                 return self._engine.get_loc(key)
   2891             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'ICDA'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-34-564afcae6cd2> in <module>
      1 ICDA= ["ICD1","ICD2","ICD3","ICD4","ICD5","ICD6","ICD7","ICD8","ICD9","ICD10","ICD11","ICD12","ICD13","ICD14","ICD15"]
----> 2 ICD1.loc[:,"LumbPCode"]=np.where(ICD1["ICDA"].str.contains("323", case=False),1,0)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2973             if self.columns.nlevels > 1:
   2974                 return self._getitem_multilevel(key)
-> 2975             indexer = self.columns.get_loc(key)
   2976             if is_integer(indexer):
   2977                 indexer = [indexer]

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2890                 return self._engine.get_loc(key)
   2891             except KeyError:
-> 2892                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2893         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2894         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'ICDA'

Edit

I found a similar question and answer, but I need to know how to apply it to selected columns rather than to the whole DataFrame:

Finding string over multiple columns in Pandas


2 answers

You are confusing a string literal with a variable:

np.where(ICD1["ICDA"].str

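The table has no column named "ICDA". Column names are the table's keys, so looking up the string "ICDA" (rather than the list stored in the variable ICDA) raises the KeyError.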
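To illustrate the distinction, here is a minimal sketch using a hypothetical two-row frame. The names `df` and `icd_cols` are illustrative and do not come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (only two ICD columns shown).
df = pd.DataFrame({
    "PT_FIN": [1, 2],
    "ICD1": ["523", "323"],
    "ICD2": ["432", "43.6"],
})

icd_cols = ["ICD1", "ICD2"]  # a Python list holding column names

# Indexing with the list variable selects those columns as a DataFrame.
subset = df[icd_cols]

# Indexing with the string "icd_cols" looks for a column literally
# named "icd_cols", which does not exist, so pandas raises KeyError.
try:
    df["icd_cols"]
    raised = False
except KeyError:
    raised = True

print(list(subset.columns), raised)
```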

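Hint: you may want to use the any function to check whether at least one column has the desired property. You may also find it easier or faster to concatenate each whole row into one string and check whether "323" appears in that string.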
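Both ideas from the hint can be sketched roughly as follows. This assumes the ICD codes are stored as text and that a substring match is what is wanted (so a code such as "1323" would also match). The sample frame and the names `df`, `icd_cols`, and `Encep_joined` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; None marks a missing code.
df = pd.DataFrame({
    "PT_FIN": [1, 2, 3],
    "ICD1": ["523", "523", "323.1"],
    "ICD2": ["432", "43.6", None],
    "ICD3": [None, "12.8", "780"],
})
icd_cols = ["ICD1", "ICD2", "ICD3"]

# Normalise the selected columns to text, with missing values as "".
text = df[icd_cols].fillna("").astype(str)

# Approach 1: test each column with .str.contains, then reduce
# across columns with any(axis=1).
hits = text.apply(lambda s: s.str.contains("323", regex=False))
df["Encep"] = hits.any(axis=1).astype(int)

# Approach 2: join each row into one string and search it once.
# The "|" separator keeps digits from neighbouring columns from
# combining into a false "323".
joined = text.agg("|".join, axis=1)
df["Encep_joined"] = joined.str.contains("323", regex=False).astype(int)

print(df[["PT_FIN", "Encep", "Encep_joined"]])
```

Only the third row contains "323" in any ICD column, so both flags are 1 there and 0 elsewhere.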

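The KeyError occurs because the DataFrame has no column (that is, no "key") named ICDA.

Calling .str.contains on that column would not make sense even if it existed, since it would apparently be a column of column names.

Possible solution

Have you tried referring to ICDA without the quotes?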

np.where(ICD1[ICDA].str.contains("323", case=False),1,0)

Updated solution

The following should work:

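ICDA = ["ICD1","ICD2","ICD3","ICD4","ICD5","ICD6","ICD7","ICD8","ICD9","ICD10","ICD11","ICD12","ICD13","ICD14","ICD15"]

# if those cols aren't strings, convert them element-wise (probably best to leave as float and compare, tho)
# note: str(ICD1[col]) would turn the whole column into one string, so use astype(str);
# a float 323.0 becomes '323.0', which does not equal '323'
for col in ICDA:
    ICD1[col] = ICD1[col].astype(str)

# 1 if any ICD column in the row equals '323' exactly, else 0
ICD1['Encep'] = (ICD1[ICDA].values == '323').any(axis=1).astype(int)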
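The numeric comparison mentioned in the comment above could look something like this. It is only a sketch, assuming the codes are numbers with "." as the missing-value placeholder (as in the question's sample table) and that an exact match on 323 is intended. The frame and the names `df`, `icd_cols`, and `codes` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; "." stands for a missing code.
df = pd.DataFrame({
    "PT_FIN": [1, 2, 3],
    "ICD1": [523, 523, 323],
    "ICD2": [432, 43.6, "."],
    "ICD3": [".", 12.8, 780],
})
icd_cols = ["ICD1", "ICD2", "ICD3"]

# Convert each selected column to numbers; anything that cannot be
# parsed (such as ".") becomes NaN instead of raising an error.
codes = df[icd_cols].apply(pd.to_numeric, errors="coerce")

# Flag rows where any ICD column equals 323 exactly.
df["Encep"] = codes.eq(323).any(axis=1).astype(int)

print(df[["PT_FIN", "Encep"]])
```

NaN never compares equal to 323, so missing codes simply count as non-matches.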

For future questions, please make sure to include a minimal reproducible example :)
