In Python, how can I sum the durations of two event types until a condition breaks the run?

Posted 2024-10-03 06:31:41

For example, I want to find the deviations for each vehicle. A deviation is defined as follows:

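Driving for more than 4 hours without a break of at least 1 hour. The 1-hour break may be split into parts of 15 minutes. Any break shorter than 15 minutes does not count as a break. Transit = the vehicle is running; Stop = a break.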
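
The break rule can be sketched in plain Python as follows. This is only one possible reading of the rule: it assumes events arrive as (event type, minutes) pairs in time order, and that stops of at least 15 minutes add up toward the 1-hour break. The function name `driving_stretches` and both constants are hypothetical.

```python
MIN_BREAK_PART = 15   # stops shorter than this are not counted as a break
FULL_BREAK = 60       # total break minutes that end a driving stretch


def driving_stretches(events):
    """Return the minutes driven in each continuous driving stretch.

    `events` is an iterable of (event_type, minutes) pairs in time order.
    A stretch ends once qualifying stops add up to FULL_BREAK minutes;
    driving still in progress at the end is returned as the last stretch.
    """
    stretches = []
    driven = 0
    rested = 0
    for event_type, minutes in events:
        if event_type == "Transit":
            driven += minutes
        elif event_type == "Stop" and minutes >= MIN_BREAK_PART:
            rested += minutes
            if rested >= FULL_BREAK:
                if driven:
                    stretches.append(driven)
                driven = 0
                rested = 0
    if driven:
        stretches.append(driven)
    return stretches


events = [("Transit", 200), ("Stop", 10), ("Transit", 100),
          ("Stop", 20), ("Stop", 45), ("Transit", 30)]
print(driving_stretches(events))
```

In this made-up sequence the 10-minute stop does not interrupt the first stretch, while the 20-minute and 45-minute stops together complete the break.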

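To count deviations: each hour of continuous driving beyond 4 hours counts as 1. For example, if a vehicle drives continuously for 5 hours 2 minutes, the count is 2 (the 5th hour, plus the 2 minutes of the 6th hour).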
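
A minimal sketch of this counting rule, assuming that any started hour beyond the 4-hour limit counts as one deviation (which is what the 5 hours 2 minutes example suggests). The name `deviation_count` and the constant are hypothetical.

```python
import math

LIMIT_MINUTES = 240  # 4 hours of continuous driving


def deviation_count(driven_minutes):
    """Count the started hours of driving beyond the 4-hour limit."""
    excess = driven_minutes - LIMIT_MINUTES
    if excess <= 0:
        return 0
    return math.ceil(excess / 60)


print(deviation_count(5 * 60 + 2))  # 5 h 2 min, the example above -> 2
```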

I have not been able to build this logic.

Here is a link to the toy data:

https://drive.google.com/file/d/1oWl6_k5KxTkztKAaYb6nO2PcI3gBs2RH/view?usp=sharing

This is what I have tried so far, but I don't know how to continue:

out['StartDateTime'] = pd.to_datetime(out['StartDate'].dt.date.astype(str)+ ' '+ out['StartTime'].astype(str))
out['EndDateTime'] = pd.to_datetime(out['EndDate'].dt.date.astype(str)+ ' '+ out['EndTime'].astype(str))
out['Duration'] =  (out['EndDateTime'] - out['StartDateTime']).astype('timedelta64[m]')

itr = 0
run = 0
stop = 0
dfg = out.groupby(['companyid','Vehicle'])
df_newout = pd.DataFrame()
while itr in (enumerate(out)):
    
         
    if run < 240 & stop < 60:
         run = out[out['EventType'] == 'Transit']['Duration'].sum()
         stop = out[out['EventType'] == 'Stop']['Duration'].sum()
         run.append(itr)
         stop.append(itr)
       
        
itr = itr+1

1 answer
Answer 1 · posted 2024-10-03 06:31:41

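This is a somewhat tricky problem. In any case, here is a solution; see whether it does what you want. With the data provided so far, there are no violations.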

import pandas as pd

#load the toy data (assumes it was saved locally as toydata.csv)
out = pd.read_csv("toydata.csv")

#calculate duration in minutes

out['StartDateTime'] = pd.to_datetime(out['StartDate'].astype(str) + ' ' + out['StartTime'].astype(str))
out['EndDateTime'] = pd.to_datetime(out['EndDate'].astype(str) + ' ' + out['EndTime'].astype(str))
#newer pandas versions reject .astype('timedelta64[m]'), so convert via total_seconds()
out['Duration'] = (out['EndDateTime'] - out['StartDateTime']).dt.total_seconds() / 60

#set violation and instances to default values
out['Violation'] = ''
out['Instances'] = 0

#use r for transit and s for stop to cumulate the duration
r = 0
s = 0

#use for checking if prev row matches. r and s will be reset if new pair found
prev_cid = ''
prev_veh = ''

#iterate through the dataframe
#assumes the default 0..n-1 index from read_csv, since i-1 is used as a row label below
for i,j in out.iterrows():
    #check if we encountered new set of companyid and vehicle pair
    if j['companyid'] != prev_cid or j['Vehicle'] != prev_veh: s = r = 0

    #store current value for next iteration
    prev_cid = j['companyid']
    prev_veh = j['Vehicle']

    #strip() tolerates padded labels such as 'Stop ' in the data
    event = str(j['EventType']).strip()

    #add to transit (r) and stop (s) counters
    if event == 'Transit': r += j['Duration']
    if event == 'Stop': s += j['Duration']

    #if stop is less than 60 mins and transit > 240, then flag that row as violation
    #out.at replaces the chained assignment out['Violation'][i], which may not write through
    if s < 60 and r > 240 and event == 'Transit':
        out.at[i, 'Violation'] = True

    #if we reached 15 minutes of stop, then check for instances of violations
    if event == 'Stop' and j['Duration'] >= 15:
        if r > 240: #more than 4 hrs gets 1 violation
            #1 for passing 4 hrs, plus 1 for every further full 60 mins
            count = 1 + int((r - 240) / 60)
            out.at[i-1, 'Instances'] = count #record the violation to the row above. current row is stop
        s = r = 0  #since we reached valid break, reset r and s

You can add these two lines to see which rows broke the rule:

newout = out[out['Instances'] > 0]
print(newout)

Alternatively, you can check for violations like this:

newout = out[out['Violation'] == True]
print(newout)
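
If a total per vehicle is more useful than the individual rows, the recorded instances can be summed per company and vehicle. The sketch below uses a small made-up frame in place of `out`; only the column names are taken from the code above, and the values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the `out` frame built above.
out = pd.DataFrame({
    "companyid": [1, 1, 2],
    "Vehicle": ["A", "A", "B"],
    "Instances": [2, 0, 1],
})

# Total recorded instances per company/vehicle pair.
summary = (
    out.groupby(["companyid", "Vehicle"])["Instances"]
    .sum()
    .reset_index()
)
print(summary)
```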

With the data provided, I did not find any violations. If you count by hand and do find one, let me know so that I can fix the logic in the code.
