How to filter a list by semi-unique values

Posted 2024-09-21 04:37:48

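I have a dataset that I need to filter for "unique" occurrences. Basically, I want to remove rows in which the same user buys the same product more than once on the same day, regardless of which device was used. When there are multiple occurrences, I want to keep only the first row.

Data: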

datetime, device, product, user

  [
  ['2013-07-08 15:00:00', 'pc',       'X',        'A'],
  ['2013-07-09 17:00:00', 'pc',       'X',        'A'],
  ['2013-07-09 10:00:00', 'andr',     'Y',        'B'],
  ['2013-07-10 18:00:00', 'pc',       'Y',        'B'],
  ['2013-07-10 21:00:00', 'ipho',     'Y',        'B'],       # second occurrence of B getting Y that day
  ['2013-07-10 22:00:00', 'andr',     'Y',        'B'],       # third occurrence of B getting Y that day
  ['2013-07-10 02:00:00', 'ipho',     'Z',        'C'],
  ['2013-07-10 11:00:00', 'pc',       'Z',        'C']        # second occurrence of C getting Z that day
  ]

Should be filtered to:

  ['2013-07-08 15:00:00', 'pc',       'X',        'A'],
  ['2013-07-09 17:00:00', 'pc',       'X',        'A'],
  ['2013-07-09 10:00:00', 'andr',     'Y',        'B'],
  ['2013-07-10 18:00:00', 'pc',       'Y',        'B'],
  ['2013-07-10 02:00:00', 'ipho',     'Z',        'C']

How can I do this?


1 answer
User
#1 · posted 2024-09-21 04:37:48

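Strip the time portion from the datetime, then store each row in a dictionary if its key is not already there. Use a tuple of (date, product, user) as the dictionary key.

For example: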

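 d = {}  # maps (date, product, user) -> first row seen for that key
 for timestamp, device, product, user in table:
     date = timestamp[:10]  # 'YYYY-MM-DD' part of the datetime string
     if (date, product, user) not in d:
         d[(date, product, user)] = [timestamp, device, product, user]
 filtered = list(d.values())  # dicts keep insertion order in Python 3.7+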
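
The same idea can be sketched as a small self-contained function run against the sample data from the question. This is only an illustration, not the only way to do it: the function name `first_per_day` is made up here, it tracks seen keys in a set instead of a dictionary, and it assumes the datetime strings always start with a `YYYY-MM-DD` date. "First" means first in list order, so if the rows are not already in the order you care about, sort them before filtering.

```python
def first_per_day(rows):
    """Keep the first row for each (date, product, user) key, in input order."""
    seen = set()   # keys that have already been kept
    kept = []      # filtered rows, in their original order
    for timestamp, device, product, user in rows:
        # Assumption: the first 10 characters are the 'YYYY-MM-DD' date.
        key = (timestamp[:10], product, user)
        if key not in seen:
            seen.add(key)
            kept.append([timestamp, device, product, user])
    return kept


# Sample data from the question: datetime, device, product, user
table = [
    ['2013-07-08 15:00:00', 'pc',   'X', 'A'],
    ['2013-07-09 17:00:00', 'pc',   'X', 'A'],
    ['2013-07-09 10:00:00', 'andr', 'Y', 'B'],
    ['2013-07-10 18:00:00', 'pc',   'Y', 'B'],
    ['2013-07-10 21:00:00', 'ipho', 'Y', 'B'],
    ['2013-07-10 22:00:00', 'andr', 'Y', 'B'],
    ['2013-07-10 02:00:00', 'ipho', 'Z', 'C'],
    ['2013-07-10 11:00:00', 'pc',   'Z', 'C'],
]

for row in first_per_day(table):
    print(row)
```

The device is deliberately left out of the key, so a repeat purchase on a different device still counts as a duplicate. On the sample data this drops the rows the question marks as second or third occurrences and keeps the rest.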
