Query multiple columns in a database and find their sum

Posted 2024-07-02 13:29:26


I have a school applications table that looks like this:

create table todel (user_id int, SchemesApplicable1 int, SchemesApplicable2 int, 
SchemesApplicable3 int, SchemesApplicable4 int);

insert into todel values (1, 1, 0, 1, 0);

insert into todel values (2, 0, 0, 0, 0);

insert into todel values (3, 1, 0, 1, 0);

insert into todel values (4, 1, 0, 0, 0);

insert into todel values (5, 1, 0, 1, 1);

SELECT Count(User_Id) as No_Off_Application,
       sum(if(SchemesApplicable1 = 1, 1, 0)) as first,
       sum(if(SchemesApplicable2 = 1, 1, 0)) as second,
       sum(if(SchemesApplicable3 = 1, 1, 0)) as third,
       sum(if(SchemesApplicable4 = 1, 1, 0)) as forth
FROM todel

The query above returns the following report:

No_Off_Application  first   second  third   forth
5   4   0   3   1

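I want to add one more column that counts the applicants who applied for more than one scheme. The expected count is 3 (user ids 1, 3 and 5). How do I write a query for this?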


2 Answers
SELECT Count(User_Id) as No_Off_Application , 
       sum(SchemesApplicable1) as first,
       sum(SchemesApplicable2) as second,
       sum(SchemesApplicable3) as third,
       sum(SchemesApplicable4) as forth,
       sum(SchemesApplicable1 + SchemesApplicable2 + SchemesApplicable3 + SchemesApplicable4 > 1) as users_with_multiple_applications
FROM todel
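
To check a query like this against the sample rows without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It counts applicants whose scheme flags add up to more than 1, which is the "more than one scheme" condition from the question. The column aliases (applicants, scheme1 to scheme4, multi_scheme_applicants) are names chosen for this illustration, and CASE WHEN is used in place of MySQL's if() or summing a bare comparison so the statement does not depend on one dialect.

```python
import sqlite3

# In-memory database filled with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table todel (user_id int, SchemesApplicable1 int, "
    "SchemesApplicable2 int, SchemesApplicable3 int, SchemesApplicable4 int)"
)
rows = [
    (1, 1, 0, 1, 0),
    (2, 0, 0, 0, 0),
    (3, 1, 0, 1, 0),
    (4, 1, 0, 0, 0),
    (5, 1, 0, 1, 1),
]
conn.executemany("insert into todel values (?, ?, ?, ?, ?)", rows)

# Aliases below are illustrative, not taken from the original query.
query = """
SELECT count(user_id)          AS applicants,
       sum(SchemesApplicable1) AS scheme1,
       sum(SchemesApplicable2) AS scheme2,
       sum(SchemesApplicable3) AS scheme3,
       sum(SchemesApplicable4) AS scheme4,
       sum(CASE WHEN SchemesApplicable1 + SchemesApplicable2
                     + SchemesApplicable3 + SchemesApplicable4 > 1
                THEN 1 ELSE 0 END) AS multi_scheme_applicants
FROM todel
"""
result = conn.execute(query).fetchone()
print(result)
```

The last value in the returned row is the count of applicants with more than one scheme. Note that this assumes every flag column holds only 0 or 1; if a column could hold NULL, the sum of the four flags would be NULL for that row and it would not be counted.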

Here is the setup in pandas:

import pandas as pd

df = pd.DataFrame([[1, 1, 0, 1, 0], 
                   [2, 0, 0, 0, 0], 
                   [3, 1, 0, 1, 0], 
                   [4, 1, 0, 0, 0], 
                   [5, 1, 0, 1, 1]], 
              columns=['user_id', 'Scheme1', 'Scheme2', 'Scheme3', 'Scheme4'])
print(df)

   user_id  Scheme1  Scheme2  Scheme3  Scheme4
0        1        1        0        1        0
1        2        0        0        0        0
2        3        1        0        1        0
3        4        1        0        0        0
4        5        1        0        1        1

With pandas, to get the total number of schemes for each user, you can sum across the scheme columns with df.sum(axis=1):

print(df.iloc[:, 1:].sum(1))

0    2
1    0
2    2
3    1
4    3
dtype: int64

To get the user_ids, you can use boolean indexing:

user_id_ser = df.user_id[df.iloc[:, 1:].sum(1) > 1]
print(user_id_ser)

0    1
2    3
4    5
Name: user_id, dtype: int64

To add a "Flag/Indicator" column, create a mask with > 1 and convert it to integers with astype:

df['Schemes > 1'] = (df.iloc[:, 1:].sum(1) > 1).astype(int)
print(df)

   user_id  Scheme1  Scheme2  Scheme3  Scheme4  Schemes > 1
0        1        1        0        1        0            1
1        2        0        0        0        0            0
2        3        1        0        1        0            1
3        4        1        0        0        0            0
4        5        1        0        1        1            1

Finally, to get the exact output, you can use df.where:

print(df.where(df > 0).count())

user_id        5
Scheme1        4
Scheme2        0
Scheme3        3
Scheme4        1
Schemes > 1    3
dtype: int64
