Query to fill in missing dates with the value from the previous date

Posted 2024-09-27 07:20:31


I want to build a table of consecutive dates for each customer.

Suppose I have this DataFrame:

import pandas as pd
import pyodbc

con = pyodbc.connect(....)

I use dateadd(day,-1,getdate()) because the table has no data for getdate() (today); it only has data up to yesterday.

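# Everything before yesterday. read_sql_query already returns a DataFrame;
# passing it through pd.DataFrame(..., columns=...) only fixes the column order.
SQL_Until_Today = pd.read_sql_query("SELECT date, customer, value FROM account WHERE date < CONVERT(date, DATEADD(day, -1, GETDATE()))", con)
account = pd.DataFrame(SQL_Until_Today, columns=['date', 'customer', 'value'])

# Yesterday only.
SQL_Today = pd.read_sql_query("SELECT date, customer, value FROM account WHERE date = CONVERT(date, DATEADD(day, -1, GETDATE()))", con)
account_Today = pd.DataFrame(SQL_Today, columns=['date', 'customer', 'value'])

# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
account = pd.concat([account, account_Today], ignore_index=True)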

From these two DataFrames I end up with a single DataFrame named account, which looks like this:

date         customer value
2019-06-27    100       40
2019-06-28    100       30
2019-06-30    100       20
2019-07-01    100       10
2019-07-02    100       18
2019-06-21    200       460
2019-06-23    200       430
2019-06-24    200       410
2019-06-25    200       130
2019-06-26    200       210
2019-06-27    200       410
2019-06-28    200       310
2019-06-30    200       210
2019-07-01    200       110
2019-07-02    200       118

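I need to create a table of consecutive dates for each customer, starting from that customer's earliest date in the table.

For example: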

customer = 100 --> 2019-06-27
customer = 200 --> 2019-06-21

So the output I want for the account DataFrame is:

date         customer value
2019-06-27    100       40
2019-06-28    100       30
2019-06-29    100       30   <- closest earlier value
2019-06-30    100       20
2019-07-01    100       10
2019-07-02    100       18
2019-07-03    100       18   <- closest earlier value
2019-06-21    200       460
2019-06-22    200       460  <- closest earlier value
2019-06-23    200       430
2019-06-24    200       410
2019-06-25    200       130
2019-06-26    200       210
2019-06-27    200       410
2019-06-28    200       310
2019-06-29    200       310  <- closest earlier value
2019-06-30    200       210
2019-07-01    200       110
2019-07-02    200       118
2019-07-03    200       118  <- closest earlier value

If the gap spans two dates, I still want to take the value from the closest earlier date.

Any help? How can I do this efficiently?


1 answer
User
#1 · Posted 2024-09-27 07:20:31

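A common approach is to use a separate "date table" containing one row for every valid date in (or beyond) the range you need to query. For example, in this particular case the following table would be enough: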

date_table

date
----------
2019-06-15
2019-06-16
2019-06-17
2019-06-18
2019-06-19
2019-06-20
2019-06-21
2019-06-22
2019-06-23
2019-06-24
2019-06-25
2019-06-26
2019-06-27
2019-06-28
2019-06-29
2019-06-30
2019-07-01
2019-07-02
2019-07-03
2019-07-04
2019-07-05
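
If the date table is only needed on the pandas side, it can be generated rather than stored. This is a minimal sketch, not part of the original answer; the start and end dates are assumptions chosen to match the sample table above.

```python
import pandas as pd

# Assumed bounds: the same range as the sample date_table shown above.
date_table = pd.DataFrame(
    {"date": pd.date_range(start="2019-06-15", end="2019-07-05", freq="D")}
)

print(date_table.head())
print(len(date_table))  # one row per calendar day in the range
```

The table only has to cover the range being queried; extra rows at either end are filtered out later.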

Given your existing data

account

date        customer  value
----------  --------  -----
2019-06-27       100     40
2019-06-28       100     30
2019-06-30       100     20
2019-07-01       100     10
2019-07-02       100     18
2019-06-21       200    460
2019-06-23       200    430
2019-06-24       200    410
2019-06-25       200    130
2019-06-26       200    210
2019-06-27       200    410
2019-06-28       200    310
2019-06-30       200    210
2019-07-01       200    110
2019-07-02       200    118

You can start with a query that returns every actual_date for every customer:

SELECT date_table.date AS actual_date, cust.customer
FROM 
    date_table,
    (SELECT DISTINCT account.customer FROM account) cust
WHERE 
    date_table.date >= (SELECT MIN(account.date) FROM account)
    AND
    date_table.date <= (SELECT MAX(account.date) FROM account)
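
For readers working in pandas, the same customer-by-date grid can be sketched with a cross merge. This is an illustrative equivalent rather than the answer's method; it uses a shortened copy of the sample data and assumes pandas 1.2 or later for `how="cross"`.

```python
import pandas as pd

# Shortened copy of the sample data, for illustration only.
account = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-06-27", "2019-06-28", "2019-06-30", "2019-06-21", "2019-06-23"]
    ),
    "customer": [100, 100, 100, 200, 200],
    "value": [40, 30, 20, 460, 430],
})
date_table = pd.DataFrame(
    {"date": pd.date_range("2019-06-15", "2019-07-05", freq="D")}
)

# Keep only dates between the overall MIN and MAX of account.date (inclusive).
in_range = date_table[
    date_table["date"].between(account["date"].min(), account["date"].max())
]

# Pair every remaining date with every distinct customer.
customers = account[["customer"]].drop_duplicates()
cust_date = in_range.rename(columns={"date": "actual_date"}).merge(
    customers, how="cross"
)
print(cust_date)
```

As in the SQL, the bounds are global rather than per customer, so the grid still holds dates that precede a customer's first record.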

Next, wrap that query as a subquery (named cust_date) to determine the reference_date for each customer and actual_date:

SELECT cust_date.actual_date AS actual_date, cust_date.customer, MAX(acc.date) AS reference_date
FROM 
    (
        SELECT date_table.date AS actual_date, cust.customer
        FROM 
            date_table,
            (SELECT DISTINCT account.customer FROM account) cust
        WHERE 
            date_table.date >= (SELECT MIN(account.date) FROM account)
            AND
            date_table.date <= (SELECT MAX(account.date) FROM account)
    ) cust_date
    INNER JOIN 
    account acc 
        ON acc.customer = cust_date.customer AND acc.date <= cust_date.actual_date
GROUP BY cust_date.actual_date, cust_date.customer
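
In pandas, the "latest account date on or before actual_date" lookup can be sketched with `pd.merge_asof`. This is a swapped-in technique, not what the answer does, and it runs on a shortened copy of the sample data. It also brings the matching value along in the same step, so it does not need a second join.

```python
import pandas as pd

# Shortened copy of the sample data, for illustration only.
account = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-06-27", "2019-06-28", "2019-06-30", "2019-06-21", "2019-06-23"]
    ),
    "customer": [100, 100, 100, 200, 200],
    "value": [40, 30, 20, 460, 430],
})

# Every day between the overall min and max date, paired with every customer.
days = pd.date_range(account["date"].min(), account["date"].max(), freq="D")
cust_date = pd.DataFrame({"actual_date": days}).merge(
    account[["customer"]].drop_duplicates(), how="cross"
)

# For each (customer, actual_date), take the last account row whose date is
# on or before actual_date. Both inputs must be sorted by their date key.
ref_date = pd.merge_asof(
    cust_date.sort_values("actual_date"),
    account.sort_values("date"),
    left_on="actual_date",
    right_on="date",
    by="customer",
)

# Rows without a match are dates before the customer's first record; dropping
# them mirrors the INNER JOIN in the SQL.
ref_date = (
    ref_date.dropna(subset=["date"])
    .rename(columns={"date": "reference_date"})
    .sort_values(["customer", "actual_date"])
    .reset_index(drop=True)
)
print(ref_date)
```

The value column comes back as float because the unmatched rows briefly hold NaN; cast it with `astype(int)` after the `dropna` if integers are needed.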

Finally, wrap that query as a subquery (named ref_date) to fetch the value that belongs to each reference_date:

SELECT ref_date.actual_date, ref_date.customer, acc.value
FROM
    (
        SELECT cust_date.actual_date AS actual_date, cust_date.customer, MAX(acc.date) AS reference_date
        FROM 
            (
                SELECT date_table.date AS actual_date, cust.customer
                FROM 
                    date_table,
                    (SELECT DISTINCT account.customer FROM account) cust
                WHERE 
                    date_table.date >= (SELECT MIN(account.date) FROM account)
                    AND
                    date_table.date <= (SELECT MAX(account.date) FROM account)
            ) cust_date
            INNER JOIN 
            account acc 
                ON acc.customer = cust_date.customer AND acc.date <= cust_date.actual_date
        GROUP BY cust_date.actual_date, cust_date.customer
    ) ref_date
    INNER JOIN
    account acc
        ON acc.customer = ref_date.customer AND acc.date = ref_date.reference_date
ORDER BY ref_date.customer, ref_date.actual_date

which produces:

actual_date  customer  value
-----------  --------  -----
2019-06-27        100     40
2019-06-28        100     30
2019-06-29        100     30
2019-06-30        100     20
2019-07-01        100     10
2019-07-02        100     18
2019-06-21        200    460
2019-06-22        200    460
2019-06-23        200    430
2019-06-24        200    410
2019-06-25        200    130
2019-06-26        200    210
2019-06-27        200    410
2019-06-28        200    310
2019-06-29        200    310
2019-06-30        200    210
2019-07-01        200    110
2019-07-02        200    118
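
To try the final query without a SQL Server connection, it can be run against an in-memory SQLite database. This is only a sketch under stated assumptions: SQLite stands in for the real database, dates are stored as ISO-format text (which sorts correctly as strings), and only part of the sample data is loaded.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Shortened copy of the sample account data; dates kept as ISO text.
con.execute("CREATE TABLE account (date TEXT, customer INTEGER, value INTEGER)")
con.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [
        ("2019-06-27", 100, 40), ("2019-06-28", 100, 30), ("2019-06-30", 100, 20),
        ("2019-06-21", 200, 460), ("2019-06-23", 200, 430),
    ],
)

# Date table covering 2019-06-15 through 2019-06-30.
con.execute("CREATE TABLE date_table (date TEXT)")
con.executemany(
    "INSERT INTO date_table VALUES (?)",
    [(f"2019-06-{day:02d}",) for day in range(15, 31)],
)

query = """
SELECT ref_date.actual_date, ref_date.customer, acc.value
FROM
    (
        SELECT cust_date.actual_date AS actual_date, cust_date.customer,
               MAX(acc.date) AS reference_date
        FROM
            (
                SELECT date_table.date AS actual_date, cust.customer
                FROM
                    date_table,
                    (SELECT DISTINCT account.customer FROM account) cust
                WHERE
                    date_table.date >= (SELECT MIN(account.date) FROM account)
                    AND
                    date_table.date <= (SELECT MAX(account.date) FROM account)
            ) cust_date
            INNER JOIN
            account acc
                ON acc.customer = cust_date.customer
               AND acc.date <= cust_date.actual_date
        GROUP BY cust_date.actual_date, cust_date.customer
    ) ref_date
    INNER JOIN
    account acc
        ON acc.customer = ref_date.customer AND acc.date = ref_date.reference_date
ORDER BY ref_date.customer, ref_date.actual_date
"""

result = con.execute(query).fetchall()
for row in result:
    print(row)
```

Because the date range stops at MAX(account.date), the query fills gaps between existing dates but does not add rows after the last recorded date.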
