My DataFrame looks like this:
id marital_status age city1 city2
1 Married 32 7 64
2 Married 34 8 39
3 Single 53 0 72
4 Divorce 37 2 83
5 Divorce 42 10 52
6 Single 29 3 82
7 Married 37 8 64
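The DataFrame has 22.4 million records.
My goal is to add a column based on a conditional statement, so that my final DataFrame looks like this: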
id marital_status age city1 city2 present
1 Married 32 12 64 1
2 Married 34 8 39 0
3 Single 53 0 72 0
4 Divorce 37 2 83 0
5 Divorce 42 10 52 0
6 Single 29 3 82 0
7 Married 37 8 64 1
What I have done so far:
import pandas as pd

test_df = pd.read_csv('city.csv')
condition = ((test_df['city1'] >= 5) &
             (test_df['marital_status'] == 'Married') &
             (test_df['age'] >= 32))
test_df.loc[:, 'present'] = test_df.where(condition, 1)
But I get NA values in the 'present' column.
Can someone help me?
The function used in your solution is not np.where but DataFrame.where.
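To make the difference concrete, here is a minimal sketch on a toy Series. The names `age`, `mask`, `kept`, and `flags` are made up for illustration and do not come from the question. The pandas `where` method keeps the original values where the condition holds and substitutes the replacement elsewhere, while `np.where` chooses between two explicit alternatives.

```python
import numpy as np
import pandas as pd

# Toy data, assumed for illustration only.
age = pd.Series([32, 34, 53])
mask = age >= 34

# Series.where / DataFrame.where: keep the original value where the mask
# is True, and use the replacement (here 1) where it is False.
kept = age.where(mask, 1)

# np.where: take the second argument where the mask is True,
# and the third argument where it is False.
flags = np.where(mask, 1, 0)

print(kept.tolist())
print(flags.tolist())
```

So `test_df.where(condition, 1)` returns a whole DataFrame of original values and 1s rather than a single column of flags, which is probably not what was intended for 'present'.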
I think you need to set the values based on the condition:
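A minimal sketch of this, assuming `np.where` is the intended function (the answer's own code did not survive on this page). The DataFrame is rebuilt inline from the input rows in the question so that the snippet runs without 'city.csv'.

```python
import numpy as np
import pandas as pd

# Input rows copied from the question; built inline instead of read_csv.
test_df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "marital_status": ["Married", "Married", "Single", "Divorce",
                       "Divorce", "Single", "Married"],
    "age": [32, 34, 53, 37, 42, 29, 37],
    "city1": [7, 8, 0, 2, 10, 3, 8],
    "city2": [64, 39, 72, 83, 52, 82, 64],
})

# Same condition as in the question.
condition = ((test_df["city1"] >= 5)
             & (test_df["marital_status"] == "Married")
             & (test_df["age"] >= 32))

# 1 where the condition holds, 0 otherwise.
test_df["present"] = np.where(condition, 1, 0)

print(test_df)
```

With the condition exactly as written, ids 1, 2 and 7 all satisfy it, so id 2 also gets 1. If id 2 should be 0, the condition itself would need to change.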
Or cast True/False to 1/0:
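One way to do the cast is `astype(int)` on the boolean Series. This is an assumption, since the method name is missing from the text above. The sketch uses only the first three input rows from the question to stay short.

```python
import pandas as pd

# First three input rows from the question, built inline for illustration.
test_df = pd.DataFrame({
    "id": [1, 2, 3],
    "marital_status": ["Married", "Married", "Single"],
    "age": [32, 34, 53],
    "city1": [7, 8, 0],
    "city2": [64, 39, 72],
})

condition = ((test_df["city1"] >= 5)
             & (test_df["marital_status"] == "Married")
             & (test_df["age"] >= 32))

# The condition is a boolean Series; casting to int maps True -> 1, False -> 0.
test_df["present"] = condition.astype(int)

print(test_df)
```

Because the condition is already a boolean Series aligned with the DataFrame's index, the cast avoids building any intermediate DataFrame, which may matter at 22.4 million rows.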