How do I filter a list by date and time?

Posted 2024-10-04 11:22:36


This is my list:

matched_rows_2 =[
    ['1', '07-09-2020', '8:43:02', '100', 'TTF'],
    ['2', '07-09-2020', '8:43:02', '100', 'GGY'],
    ['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
    ['4', '07-09-2020', '7:54:01', '160', 'JJH'],
    ['5', '07-09-2020', '8:30:00', '160', 'RRT'],
    ['6', '07-09-2020', '10:10:10', '160', 'PPO'],
    ['7', '07-09-2020', '11:12:11', '100', 'KKG'],
    ['8', '07-09-2020', '11:31:55', '160', 'PPO']]

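I'm trying to do the following:

  1. For each vehicle number (index[3]), I want the list of rows whose datetime is closest to chosen_datetime

I've tried many approaches, but nothing seems to work yet. Here is my code: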

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")
dts = [datetime.strptime(sub[1] + ' ' + sub[2], "%d-%m-%Y  %H:%M:%S").replace(tzinfo=timezone.utc) for sub in matched_rows_2]

for x in matched_rows_2:
    closest_to_chosen_datetime = min(dts, key=lambda d: max( d, chosen_datetime) - min(d, chosen_datetime))
    if closest_to_chosen_datetime:
        print(x)

This is the output I want:

['1', '07-09-2020', '8:43:02', '100', 'TTF'],
['2', '07-09-2020', '8:43:02', '100', 'GGY'],
['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
['5', '07-09-2020', '8:30:00', '160', 'RRT'],

This is my current output:

['1', '07-09-2020', '8:43:02', '100', 'TTF'],
['2', '07-09-2020', '8:43:02', '100', 'GGY'],
['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
['4', '07-09-2020', '7:54:01', '160', 'JJH'],
['5', '07-09-2020', '8:30:00', '160', 'RRT'],
['6', '07-09-2020', '10:10:10', '160', 'PPO'],
['7', '07-09-2020', '11:12:11', '100', 'KKG'],
['8', '07-09-2020', '11:31:55', '160', 'PPO']]

I really don't know what is happening or what went wrong.


1 Answer
User
#1 · Posted 2024-10-04 11:22:36

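The first problem is that you have a print statement inside the loop. You can't know which rows have the datetime closest to chosen_datetime until you have iterated over all the items, so printing at that point is premature, and it is a major reason for the incorrect output.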
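To see why every row gets printed, it may help to notice that the expression computed inside the loop never looks at x, and that a datetime object is always truthy. The following is only a minimal sketch on a two-row subset of the data; the names rows and printed are introduced here for illustration, and the dates are assumed to be month-first, as in the code further down.

```python
from datetime import datetime, timezone

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")

# Two rows from the question, enough to show the effect.
rows = [
    ['1', '07-09-2020', '8:43:02', '100', 'TTF'],
    ['7', '07-09-2020', '11:12:11', '100', 'KKG'],
]
# Assumption: the dates are month-day-year.
dts = [
    datetime.strptime(f'{r[1]} {r[2]}', '%m-%d-%Y %H:%M:%S').replace(tzinfo=timezone.utc)
    for r in rows
]

printed = []
for x in rows:
    # This expression does not depend on x, so it yields the same
    # datetime on every pass through the loop.
    closest = min(dts, key=lambda d: abs(d - chosen_datetime))
    # A datetime object is always truthy, so this test filters nothing.
    if closest:
        printed.append(x)

# Every row passed the test, including row 7, which is hours away.
print(len(printed) == len(rows))
```

To actually filter, each row's own timestamp has to be compared with the closest one, and that comparison has to happen per vehicle number.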

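Second, because you are looking for the closest datetime for each vehicle number, you need some logic to group the rows by vehicle number.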

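One option is a solution based on itertools.groupby; the alternative, which I've implemented here, is to store the results in dictionaries keyed by vehicle number.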
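For comparison, the itertools.groupby route could look roughly like the sketch below. This is not the answer's implementation, just one possible reading of that option; the helpers delta_seconds and vehicle and the list closest_rows are hypothetical names, and the dates are again assumed to be month-first.

```python
from datetime import datetime, timezone
from itertools import groupby

matched_rows_2 = [
    ['1', '07-09-2020', '8:43:02', '100', 'TTF'],
    ['2', '07-09-2020', '8:43:02', '100', 'GGY'],
    ['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
    ['4', '07-09-2020', '7:54:01', '160', 'JJH'],
    ['5', '07-09-2020', '8:30:00', '160', 'RRT'],
    ['6', '07-09-2020', '10:10:10', '160', 'PPO'],
    ['7', '07-09-2020', '11:12:11', '100', 'KKG'],
    ['8', '07-09-2020', '11:31:55', '160', 'PPO']]

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")


def delta_seconds(row):
    """Absolute distance in seconds between a row's timestamp and chosen_datetime."""
    # Assumption: the dates are month-day-year.
    ts = datetime.strptime(f'{row[1]} {row[2]}', '%m-%d-%Y %H:%M:%S')
    ts = ts.replace(tzinfo=timezone.utc)
    return abs((ts - chosen_datetime).total_seconds())


def vehicle(row):
    """Grouping key: the vehicle number in column 3."""
    return row[3]


closest_rows = []
# groupby() only merges adjacent items, so sort by vehicle number first.
for vid, group in groupby(sorted(matched_rows_2, key=vehicle), key=vehicle):
    group = list(group)
    best = min(delta_seconds(r) for r in group)
    # Keep every row that ties for the smallest distance.
    closest_rows.extend(r for r in group if delta_seconds(r) == best)

for row in closest_rows:
    print(row)
```

The sort is the main cost of this variant; the dictionary approach avoids it by making a single pass over the rows.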

There are some comments in the code below, but let me know if you need any additional details.

from collections import defaultdict
from datetime import datetime, timezone


matched_rows_2 = [
    ['1', '07-09-2020', '8:43:02', '100', 'TTF'],
    ['2', '07-09-2020', '8:43:02', '100', 'GGY'],
    ['3', '07-09-2020', '7:53:08', '120', 'HHJ'],
    ['4', '07-09-2020', '7:54:01', '160', 'JJH'],
    ['5', '07-09-2020', '8:30:00', '160', 'RRT'],
    ['6', '07-09-2020', '10:10:10', '160', 'PPO'],
    ['7', '07-09-2020', '11:12:11', '100', 'KKG'],
    ['8', '07-09-2020', '11:31:55', '160', 'PPO']]

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")
dts = [
    datetime.strptime(f'{row[1]} {row[2]}', '%m-%d-%Y %H:%M:%S').replace(tzinfo=timezone.utc)
    for row in matched_rows_2
]

mindelta = defaultdict(lambda: None)
minrows = defaultdict(lambda: None)

# use zip() to combine the timestamps in dts with the
# original data
for ts, row in zip(dts, matched_rows_2):
    # get the absolute difference from chosen_datetime
    delta = abs((ts - chosen_datetime).total_seconds())
    vid = row[3]

    # if it's the closest value for this vid (or if we haven't
    # processed the vid yet), update mindelta[vid] with the current
    # delta and set minrows[vid] to the current row.
    if mindelta[vid] is None or delta < mindelta[vid]:
        mindelta[vid] = delta
        minrows[vid] = [row]

    # if the current delta is equal to the existing closest delta,
    # just append the current row.
    elif delta == mindelta[vid]:
        minrows[vid].append(row)

for vid, rows in minrows.items():
    for row in rows:
        print(row)

Running the above program produces the following output:

['1', '07-09-2020', '8:43:02', '100', 'TTF']
['2', '07-09-2020', '8:43:02', '100', 'GGY']
['3', '07-09-2020', '7:53:08', '120', 'HHJ']
['5', '07-09-2020', '8:30:00', '160', 'RRT']
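
One detail the answer does not spell out, offered here as a reading of the code rather than something the author states: the program above parses the dates with '%m-%d-%Y', while the question's code used '%d-%m-%Y'. The small sketch below, with the hypothetical names stamp, month_first and day_first, shows how far apart the two readings of the same string are.

```python
from datetime import datetime, timezone

chosen_datetime = datetime.fromisoformat("2020-07-09 08:43:55+00:00")
stamp = '07-09-2020 8:43:02'  # date and time of row 1, joined with a space

# Month-first reading, as in the answer's code: 9 July 2020.
month_first = datetime.strptime(stamp, '%m-%d-%Y %H:%M:%S').replace(tzinfo=timezone.utc)
# Day-first reading, as in the question's code: 7 September 2020.
day_first = datetime.strptime(stamp, '%d-%m-%Y %H:%M:%S').replace(tzinfo=timezone.utc)

print(month_first.date(), abs(month_first - chosen_datetime))
print(day_first.date(), abs(day_first - chosen_datetime))
```

With the day-first reading every row falls about two months after chosen_datetime, so the earliest row of each vehicle would be the closest one, and vehicle 160 would yield row 4 instead of the desired row 5.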
