Best way to find all previous event IDs by customer

import pandas as pd
import numpy as np

data = pd.read_csv('test.csv')
data.to_dict()

{'customerid': {0: 233, 1: 250, 2: 233, 3: 250, 4: 233},
 'eventid': {0: 'abc', 1: 'bcd', 2: 'edc', 3: 'abl', 4: 'cdl'},
 'date': {0: '2019-12-10', 1: '2019-12-08', 2: '2008-12-10', 3: '2019-12-01', 4: '2001-12-10'},
 'previouseventid': {0: 'edc', 1: 'abl', 2: 'cdl', 3: np.nan, 4: np.nan}}

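# Assumption: cust_233 holds the rows for customer 233, re-indexed from 0
cust_233 = data[data['customerid'] == 233].reset_index(drop=True)

# Start from the first event, then collect previous event IDs until one is missing
temp = [cust_233['eventid'][0]]
for i in range(len(cust_233['previouseventid']) - 1):
    if not pd.isna(cust_233['previouseventid'][i]):
        # print(cust_233['previouseventid'][i])
        temp.append(cust_233['previouseventid'][i])
    else:
        # print('now exiting')
        break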
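The loop above appends the previouseventid of successive rows, so it only reproduces the chain when the rows happen to be ordered newest to oldest. A minimal sketch of following the links explicitly instead, assuming every eventid is unique; the `prev_of` dictionary and the `event_chain` helper are hypothetical names introduced for illustration:

```python
import numpy as np
import pandas as pd

# Sample rows copied from the data.to_dict() output in the question
data = pd.DataFrame({
    'customerid': [233, 250, 233, 250, 233],
    'eventid': ['abc', 'bcd', 'edc', 'abl', 'cdl'],
    'date': ['2019-12-10', '2019-12-08', '2008-12-10', '2019-12-01', '2001-12-10'],
    'previouseventid': ['edc', 'abl', 'cdl', np.nan, np.nan],
})

# Map each event to the event recorded as its predecessor (NaN ends a chain)
prev_of = data.set_index('eventid')['previouseventid'].to_dict()


def event_chain(start_event):
    """Hypothetical helper: walk the previouseventid links back from start_event."""
    chain = [start_event]
    current = prev_of.get(start_event)
    # Stop at a missing link, or at a repeated event to guard against cycles
    while pd.notna(current) and current not in chain:
        chain.append(current)
        current = prev_of.get(current)
    return chain


chain_abc = event_chain('abc')
chain_bcd = event_chain('bcd')
print(chain_abc)
print(chain_bcd)
```

This does not depend on row order or on the date column, only on the previouseventid links themselves.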

3 answers

User

#1 · Edited 2024-05-19 14:43:08

Try this:

data.sort_values('date', ascending=True).groupby('customerid', sort=False)['eventid'].agg(list)

Output:

customerid
233    [cdl, edc, abc]
250         [abl, bcd]
Name: eventid, dtype: object
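The string dates in this sample sort correctly only because they are zero-padded ISO dates. A sketch of the same idea that parses the dates first and builds the sample frame inline (so it runs without test.csv); the names `events_by_customer` and `newest_first` are illustrative, not from the original post:

```python
import pandas as pd

# Sample rows copied from the question
data = pd.DataFrame({
    'customerid': [233, 250, 233, 250, 233],
    'eventid': ['abc', 'bcd', 'edc', 'abl', 'cdl'],
    'date': ['2019-12-10', '2019-12-08', '2008-12-10', '2019-12-01', '2001-12-10'],
})

# Parse the dates so the sort is chronological regardless of string format
data['date'] = pd.to_datetime(data['date'])

# Oldest-to-newest list of event IDs per customer
events_by_customer = (
    data.sort_values('date', ascending=True)
        .groupby('customerid', sort=False)['eventid']
        .agg(list)
)

# Reverse each list to get newest-to-oldest, the order used in the question's loop
newest_first = events_by_customer.apply(lambda ids: ids[::-1])

print(events_by_customer)
print(newest_first)
```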

User

#2 · Edited 2024-05-19 14:43:08

You can build the lists like this:

df['previouseventid'] = df['customerid'].map(df.groupby('customerid')['eventid'].apply(list))

Output:

   customerid eventid        date  previouseventid
0         233     abc  2019-12-10  [abc, edc, cdl]
1         250     bcd  2019-12-08       [bcd, abl]
2         233     edc  2008-12-10  [abc, edc, cdl]
3         250     abl  2019-12-01       [bcd, abl]
4         233     cdl  2001-12-10  [abc, edc, cdl]

Note that df.groupby('customerid')['eventid'].apply(list) on its own returns just the lists:

df.groupby('customerid')['eventid'].apply(list)

customerid
233    [abc, edc, cdl]
250         [bcd, abl]
Name: eventid, dtype: object
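The mapping above gives every row the customer's full list, including the row's own event and in row order rather than date order. If each row should carry only the events that happened before it, one possible sketch is the following; the `earlier_eventids` column name is an assumption made for illustration:

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'customerid': [233, 250, 233, 250, 233],
    'eventid': ['abc', 'bcd', 'edc', 'abl', 'cdl'],
    'date': ['2019-12-10', '2019-12-08', '2008-12-10', '2019-12-01', '2001-12-10'],
})
df['date'] = pd.to_datetime(df['date'])

# For each row label, collect the same customer's event IDs with an earlier date
earlier = {}
for _, group in df.sort_values('date').groupby('customerid'):
    ids = group['eventid'].tolist()
    for pos, row_label in enumerate(group.index):
        earlier[row_label] = ids[:pos]

# Assigning a Series aligns on the row labels, so the original row order is kept
df['earlier_eventids'] = pd.Series(earlier)
print(df)
```

The earliest event of each customer gets an empty list, and the lists run oldest to newest.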

User

#3 · Edited 2024-05-19 14:43:08

A groupby followed by a shift should work:

# First, make sure your data is sorted from oldest to newest
df['date'] = pd.to_datetime(df['date'])
df.sort_values('date', inplace=True)

# Get previous event through groupby operation
df['prev_id'] = df.groupby('customerid')['eventid'].shift(1)

If you want a list for each customer:

# create a dictionary with stored values – keys are customer id
prev_events_dict = df.groupby('customerid')['eventid'].apply(list).to_dict()
# map dict to dataframe
df['list_of_prev_id'] = df['customerid'].map(prev_events_dict)
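As a quick check of the shift step, here is a self-contained sketch on the sample data from the question. It builds the frame inline instead of reading test.csv and compares the computed prev_id with the previouseventid column that the question already contains; the `matches` variable is only for illustration:

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question, including the existing previouseventid column
df = pd.DataFrame({
    'customerid': [233, 250, 233, 250, 233],
    'eventid': ['abc', 'bcd', 'edc', 'abl', 'cdl'],
    'date': ['2019-12-10', '2019-12-08', '2008-12-10', '2019-12-01', '2001-12-10'],
    'previouseventid': ['edc', 'abl', 'cdl', np.nan, np.nan],
})

# Sort oldest to newest so that shift(1) picks up the preceding event
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values('date')

# Previous event per customer; the first event of each customer gets NaN
df['prev_id'] = df.groupby('customerid')['eventid'].shift(1)

# Compare with the column from the question, treating missing values as equal
matches = (df['prev_id'].fillna('') == df['previouseventid'].fillna('')).all()
print(df.sort_index())
print(matches)
```

Note that list_of_prev_id in the snippet above holds the customer's complete event list for every row, not only the events earlier than that row.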
