Creating an if statement with multiple conditions involving specific df columns and strings in a list

Posted 2024-09-30 16:33:28


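Basically, I have a list of survey questions, list_of_columns, whose entries are also columns in my data frame. At the start of my code I replaced all blank responses with "Empty", but some survey respondents did not answer questions they should have answered (for example, they have a 1 in the "EOPS/CARE" or "CalWORKs" column of my data frame but "Empty" in the questions related to those programs). In those cases I want to recode "Empty" as "Missing" so the data reflects this accurately.

Here is the code I have tried so far: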

list_of_columns = ['E1', 'E2', 'E3', 'E5', 'E11', 'E13', 'E14', 'E17', 'E18', 'E20', 'C2', 'C7', 'C8', 'C9', 'C11', 'C12', 'NU2', 'NU7', 'NU8', 'NU10', 'NU11', 'CAL1', 'CAL2', 'CAL3', 'CAL5', 'CAL10', 'CAL12', 'CAL14', 'CAL15', 'O1'] # list of survey questions that are also columns in my df. Questions with 'E' indicate they are related to EOPS/CARE, questions with 'CAL' indicated they are related to CalWORKs, etc. 

for question in list_of_columns:
    if 'E' in question and data_final['EOPS/CARE'] == 1: # if 'E' is in the question, and the column 'EOPS/CARE' in my df is equal to 1, replace all instances of "Empty" with "Missing"
        data_final[question] = np.where(data_final[question] == "Empty", "Missing", data_final[question])
    elif 'CAL' in question and data_final['CalWORKs'] == 1: # similarly, if 'CAL' is in the question, and the column 'CalWORKs' in my df is equal to 1, replace all instances of "Empty" with "Missing"
        data_final[question] = np.where(data_final[question] == "Empty", "Missing", data_final[question])
    else:
        pass

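When I try to run it, I keep getting this: "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

This is easy to do in Stata, but I decided to do it in Python because the rest of my code is already in Python. I am still learning the language, so it may be a syntax problem. Many thanks.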


1 answer
User
#1 · Posted 2024-09-30 16:33:28

This is just a quick way to get the column to where you want it.

# as indicated in your question list_of_columns is also a column in df

df.loc[(df['list_of_columns'].str.contains('E')) & (df['EOPS/CARE'] == 1) & (df['Column name where Empty would be present'] == 'Empty'), 'Column name where Empty would be present'] = 'Missing'

Do the same thing for the other conditions. I could not follow the rest of the question, but if you clarify it I can help further.
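
Extending the same idea to every question could look roughly like the sketch below. It assumes that list_of_columns is a plain Python list of column names, as in the question's code, rather than a column of the data frame. The toy values and the program_for_prefix mapping are made up for illustration. Matching on the full leading letters of each name, instead of testing 'E' in question, keeps the 'C' questions from being confused with the 'CAL' ones.

```python
import re

import pandas as pd

# Toy data: column names follow the question, the values are invented.
data_final = pd.DataFrame({
    "EOPS/CARE": [1, 0, 1],
    "CalWORKs": [0, 1, 1],
    "E1": ["Empty", "Empty", "Yes"],
    "CAL1": ["Empty", "Empty", "No"],
    "O1": ["Empty", "Maybe", "Empty"],
})
list_of_columns = ["E1", "CAL1", "O1"]

# Hypothetical mapping from a question's letter prefix to its program flag column.
program_for_prefix = {"E": "EOPS/CARE", "CAL": "CalWORKs"}

for question in list_of_columns:
    # Leading letters of the column name, e.g. "CAL" for "CAL1".
    prefix = re.match(r"[A-Za-z]+", question).group()
    program = program_for_prefix.get(prefix)
    if program is None:
        continue  # no program flag for this question, leave it untouched

    # Rows where the respondent is in the program but left the question blank.
    mask = (data_final[program] == 1) & (data_final[question] == "Empty")
    data_final.loc[mask, question] = "Missing"
```

Only rows that satisfy both conditions are recoded, so an "Empty" from a respondent who is not in the program stays "Empty".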

.loc will help you the most here. Check the documentation.
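
As for the ValueError itself, a likely explanation is that data_final['EOPS/CARE'] == 1 produces one True/False per row, while if ... and ... needs a single True/False. The small sketch below, with invented values, reproduces the message and shows the element-wise & that a .loc mask uses instead.

```python
import pandas as pd

flags = pd.Series([1, 0, 1])          # stand-in for data_final['EOPS/CARE']
comparison = flags == 1               # boolean Series, one value per row

message = ""
try:
    # `and` / `if` ask the whole Series for a single truth value, which pandas refuses.
    if "E" in "E1" and comparison:
        pass
except ValueError as err:
    message = str(err)

print(message)

# Element-wise AND keeps one result per row and can be passed to .loc as a mask.
answers = pd.Series(["Empty", "Empty", "Yes"])
row_mask = comparison & (answers == "Empty")
print(row_mask.tolist())
```

Each side of & needs its own parentheses because & binds more tightly than ==.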
