LabelMaker (composeml library): changing the start day of the week

Posted 2024-09-27 17:50:55


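I am trying to create labels per customer over one-week periods. I need the labeling_function to compute the sales amount from Monday (inclusive) to Monday (exclusive), but right now it counts from Sunday to Sunday. How can I change the day on which LabelMaker's week starts?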

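import composeml as cp

def total_spent(df):
    # Sum the 'amount' column over the rows in one window
    total = df['amount'].sum()
    return total

label_maker = cp.LabelMaker(
    target_entity="customer_id",
    time_index="transaction_time",
    labeling_function=total_spent,
    window_size="W",  # weekly windows; currently run Sunday to Sunday
)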

1 answer
User
#1 · Posted 2024-09-27 17:50:55

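Thanks for the question. You can get a weekly frequency anchored on Monday by setting the window size to W-MON. I'll go through a quick example using this data.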

import pandas as pd

# 15 consecutive days starting on Monday 2020-11-16
records = []
for time in pd.date_range(start='2020-11-16', periods=15, freq='D'):
    record = {'transaction_time': time, 'day_name': time.day_name()}
    records.append(record)

df = pd.DataFrame(records).assign(customer_id=0)

transaction_time   day_name  customer_id
      2020-11-16     Monday            0
      2020-11-17    Tuesday            0
      2020-11-18  Wednesday            0
      2020-11-19   Thursday            0
      2020-11-20     Friday            0
      2020-11-21   Saturday            0
      2020-11-22     Sunday            0
      2020-11-23     Monday            0
      2020-11-24    Tuesday            0
      2020-11-25  Wednesday            0
      2020-11-26   Thursday            0
      2020-11-27     Friday            0
      2020-11-28   Saturday            0
      2020-11-29     Sunday            0
      2020-11-30     Monday            0

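In the label maker, I set the window size to W-MON, which is the offset alias for a weekly frequency anchored on Monday. The plain W used in the question is shorthand for W-SUN, which is why those windows ran from Sunday to Sunday. The window size also supports many of the other offset aliases from pandas.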

import composeml as cp

lm = cp.LabelMaker(
    target_entity='customer_id',
    time_index='transaction_time',
    window_size='W-MON',
)
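
To see why the anchor matters, the two offsets can be inspected directly in pandas, without composeml. This is only a small sketch: the variable names are illustrative, and it shows how pandas advances a timestamp by each offset, not how composeml is implemented internally.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Plain "W" is resolved by pandas to a week anchored on Sunday.
weekly_default = to_offset("W")
weekly_monday = to_offset("W-MON")

start = pd.Timestamp("2020-11-16")  # a Monday, the first day in the example data

# Adding an anchored weekly offset moves forward to the next anchor day.
next_default = start + weekly_default  # next Sunday
next_monday = start + weekly_monday    # next Monday, a full week later

print(weekly_default.freqstr, next_default.day_name(), next_default.date())
print(weekly_monday.freqstr, next_monday.day_name(), next_monday.date())
```

With W the first boundary after a Monday start falls on the following Sunday, while with W-MON it falls on the following Monday.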

Let's inspect the data slices generated by the label maker. Each slice should cover one week starting on a Monday.

slices = lm.slice(df, -1)  # -1 asks for every available slice per customer
next(slices)
                   day_name  customer_id
transaction_time                        
2020-11-16           Monday            0
2020-11-17          Tuesday            0
2020-11-18        Wednesday            0
2020-11-19         Thursday            0
2020-11-20           Friday            0
2020-11-21         Saturday            0
2020-11-22           Sunday            0
next(slices)
                   day_name  customer_id
transaction_time                        
2020-11-23           Monday            0
2020-11-24          Tuesday            0
2020-11-25        Wednesday            0
2020-11-26         Thursday            0
2020-11-27           Friday            0
2020-11-28         Saturday            0
2020-11-29           Sunday            0
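
As an independent cross-check of the Monday (inclusive) to Monday (exclusive) totals, the same windows can be built in plain pandas. This is a sketch that does not use composeml: the amount column comes from the question, but its values here (1.0 per day) are made up for illustration.

```python
import pandas as pd

# Hypothetical data: one transaction of 1.0 per day for 15 days,
# starting on Monday 2020-11-16, all for customer 0.
days = pd.date_range(start="2020-11-16", periods=15, freq="D")
df = pd.DataFrame({"transaction_time": days, "customer_id": 0, "amount": 1.0})

# closed="left" makes each bin [Monday, next Monday);
# label="left" names each bin after the Monday it starts on.
monday_weeks = pd.Grouper(
    key="transaction_time", freq="W-MON", closed="left", label="left"
)

weekly_totals = df.groupby(["customer_id", monday_weeks])["amount"].sum()
print(weekly_totals)
```

The first two windows each hold seven days, and the last one holds only the final Monday, so totals from the label maker's Monday windows should line up with these on comparable data.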
