Suppose I have a DataFrame:
date,ent_id,val,month_end,FLAG1,FLAG2,FLAG3,FLAG4
2021-03-23,101,61,2021-03-31,R,R,R,R
2021-03-12,103,64,2021-03-31,G,R,R,R
2021-03-15,101,32,2021-03-31,F,R,F,R
2021-04-01,103,39,2021-04-30,R,F,G,R
2021-04-02,101,71,2021-04-30,R,F,R,G
2021-04-02,103,79,2021-04-30,F,F,R,F
2021-04-30,101,51,2021-04-30,G,F,F,G
2021-04-30,103,53,2021-04-30,G,G,F,F
2021-05-31,101,28,2021-05-31,R,R,G,F
2021-05-31,103,26,2021-05-31,F,F,R,R
2021-05-31,103,61,2021-05-31,R,F,F,G
2021-05-31,101,47,2021-05-31,F,F,F,R
2021-06-06,101,45,2021-06-30,G,G,F,R
2021-06-06,103,78,2021-06-30,G,G,G,G
2021-06-07,101,23,2021-06-30,R,G,R,R
2021-06-07,103,31,2021-06-30,R,R,F,F
2021-07-31,101,14,2021-07-30,F,G,F,G
2021-07-31,103,25,2021-07-30,F,R,R,R
2021-07-31,101,82,2021-07-30,F,F,G,F
2021-07-31,103,15,2021-07-30,F,G,F,G
I want to create a new column, cmpt, based on the following (SQL) condition:
CASE WHEN (date = month_end) THEN (DECODE(FLAG1,'R',10,'G',20,'F',-5)+DECODE(FLAG2,'R',10,'G',20,'F',-5)+DECODE(FLAG3,'R',10,'G',20,'F',-5))
ELSE (DECODE(FLAG2,'R',10,'G',20,'F',-5)+DECODE(FLAG3,'R',10,'G',20,'F',-5)+DECODE(FLAG4,'R',10,'G',20,'F',-5))
END
So the output would look like this:
date,ent_id,val,month_end,FLAG1,FLAG2,FLAG3,FLAG4,CMPT
2021-03-23,101,61,2021-03-31,R,R,R,R,30
2021-03-12,103,64,2021-03-31,G,R,R,R,30
2021-03-15,101,32,2021-03-31,F,R,F,R,15
2021-04-01,103,39,2021-04-30,R,F,G,R,25
2021-04-02,101,71,2021-04-30,R,F,R,G,25
2021-04-02,103,79,2021-04-30,F,F,R,F,0
2021-04-30,101,51,2021-04-30,G,F,F,G,10
2021-04-30,103,53,2021-04-30,G,G,F,F,35
2021-05-31,101,28,2021-05-31,R,R,G,F,40
2021-05-31,103,26,2021-05-31,F,F,R,R,0
2021-05-31,103,61,2021-05-31,R,F,F,G,0
2021-05-31,101,47,2021-05-31,F,F,F,R,-15
2021-06-06,101,45,2021-06-30,G,G,F,R,25
2021-06-06,103,78,2021-06-30,G,G,G,G,60
2021-06-07,101,23,2021-06-30,R,G,R,R,40
2021-06-07,103,31,2021-06-30,R,R,F,F,0
2021-07-31,101,14,2021-07-30,F,G,F,G,35
2021-07-31,103,25,2021-07-30,F,R,R,R,30
2021-07-31,101,82,2021-07-30,F,F,G,F,10
2021-07-31,103,15,2021-07-30,F,G,F,G,35
I have tried np.where and np.select, but without success.
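I would first map the FLAG column codes to their respective values. Then use np.where to take the sum of the appropriate columns, depending on whether df.date == df.month_end or df.date != df.month_end.

Alternatively, without numpy, just use loc to find the rows where df.date == df.month_end, then the rows where df.date != df.month_end, and sum the correct columns in each case.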
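The two approaches described above can be sketched as follows. This is a minimal illustration, not the answerer's original code: it uses only five of the rows from the question, and the names SCORES, scored, is_month_end and cmpt_loc are introduced here for illustration.

```python
import io

import numpy as np
import pandas as pd

# A few rows copied from the question (not the full data set).
CSV = """date,ent_id,val,month_end,FLAG1,FLAG2,FLAG3,FLAG4
2021-03-23,101,61,2021-03-31,R,R,R,R
2021-03-15,101,32,2021-03-31,F,R,F,R
2021-04-30,101,51,2021-04-30,G,F,F,G
2021-04-30,103,53,2021-04-30,G,G,F,F
2021-07-31,101,14,2021-07-30,F,G,F,G
"""
df = pd.read_csv(io.StringIO(CSV), parse_dates=["date", "month_end"])

# Counterpart of DECODE(flag, 'R', 10, 'G', 20, 'F', -5).
# SCORES is a hypothetical name chosen for this sketch.
SCORES = {"R": 10, "G": 20, "F": -5}
flag_cols = ["FLAG1", "FLAG2", "FLAG3", "FLAG4"]

# Map every flag code to its numeric value, column by column.
scored = df[flag_cols].apply(lambda col: col.map(SCORES))

# Boolean mask: True where the row's date is the month end.
is_month_end = df["date"] == df["month_end"]

# Approach 1: np.where picks between the two row-wise sums.
df["cmpt"] = np.where(
    is_month_end,
    scored[["FLAG1", "FLAG2", "FLAG3"]].sum(axis=1),
    scored[["FLAG2", "FLAG3", "FLAG4"]].sum(axis=1),
)

# Approach 2: no numpy; fill each group of rows separately with loc.
df["cmpt_loc"] = 0  # placeholder, overwritten for every row below
df.loc[is_month_end, "cmpt_loc"] = scored.loc[
    is_month_end, ["FLAG1", "FLAG2", "FLAG3"]
].sum(axis=1)
df.loc[~is_month_end, "cmpt_loc"] = scored.loc[
    ~is_month_end, ["FLAG2", "FLAG3", "FLAG4"]
].sum(axis=1)

print(df[["date", "month_end", "cmpt", "cmpt_loc"]])
```

For these five rows both columns come out as 30, 15, 10, 35, 35. Note that the comparison only works as intended if date and month_end have the same dtype, which is why both are parsed as dates when the data is read.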