Using Python, increment a counter by 1 when a timestamp in one table falls between two timestamps in another table

calendar_timestamp    TrucksWorking
2020-01-01 00:00:28   0
2020-01-01 00:01:28   0
2020-01-01 00:02:28   0
2020-01-01 00:03:28   0
2020-01-01 00:04:28   0
2020-01-01 00:05:28   0
......


# -*- coding: utf-8 -*-
# Python 3.7 Anaconda distribution

import pandas as pd
import datetime

# =============================================================================
# Step 1: Read in the ambulance runs with a starting and ending time values
# call this dataframe "emergency_events"
# =============================================================================

# the following array is a small sample when an ambulance starts a call and when it ends a call
data = [['2020-01-01 00:00:28.000','2020-01-01 00:35:28.987']
    , ['2020-01-01 00:02:34.000','2020-01-01 01:05:13.540']
    , ['2020-01-01 00:03:57.000','2020-01-01 01:14:44.537']
    , ['2020-01-01 00:06:17.000','2020-01-01 01:26:52.087']
    , ['2020-01-01 00:13:20.000','2020-01-01 01:17:31.310']
    , ['2020-01-01 00:14:01.000','2020-01-01 01:57:28.343']
    , ['2020-01-01 00:16:11.000','2020-01-01 00:39:34.967']
    , ['2020-01-01 00:22:03.000','2020-01-01 01:46:40.037']
    , ['2020-01-01 00:23:07.000','2020-01-01 00:49:25.890']
    , ['2020-01-01 00:23:19.000','2020-01-01 01:26:39.920']
    , ['2020-01-01 00:30:17.000','2020-01-01 00:38:05.000']]

#convert the array to a pandas data frame
emergency_events = pd.DataFrame(data, columns = ['StartTime', 'EndTime'])

#convert the string values to date time values
emergency_events['StartTime'] = pd.to_datetime(emergency_events['StartTime'])
emergency_events['EndTime'] = pd.to_datetime(emergency_events['EndTime'])

# =============================================================================
# Step 2 Create a calendar of date time stamps incremented by 1 minute using the ambulance runs min/max values
# call this dataframe "coincident"
# =============================================================================

## establish a starting value based on the first ambulance event
starting_point = emergency_events.StartTime.min()
print(starting_point)

## establish an ending value based on the final ambulance call ending time.
ending_point = emergency_events.EndTime.max()
print(ending_point)

## create a range of time stamps incremented by 1 minute from starting point to ending point
days = pd.date_range(starting_point, ending_point, freq='min')

## create a pandas dataframe with two columns: calendar for time stamps and a place holder of 0 for trucks working
coincident = pd.DataFrame({'calendar_timestamp': days, 'TrucksWorking': 0})

## print it out to verify the data
print(coincident.head(n = 5))

# =============================================================================
# Step 3 --- now for the difficult part
# if a "coincident" time stamp is between a start and end time of an emergency_event
# increment the TrucksWorking column by 1
# loop through every "coincident" observation and test if it is between a start and an end of an "emergency_event"
# =============================================================================

for each in coincident.calendar_timestamp:
    if (coincident[coincident['calendar_timestamp']] >= emergency_events[emergency_events['StartTime']] & coincident[coincident['calendar_timestamp']] <= emergency_events[emergency_events['EndTime']]):
        coincident[coincident['TrucksWorking']] = coincident[coincident['TrucksWorking']] + 1
    else:
        coincident[coincident['TrucksWorking']]

## at the end of this program it should return a calendar of date time stamps with
## the number of ambulances at work during that one minute interval.
## this information can be used for data modeling.

# =============================================================================
# I have attempted the following
# the following code returns an error message
# ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
# =============================================================================
## for each in coincident.calendar_timestamp:
##     if (coincident[coincident['calendar_timestamp'].between(starting_point, ending_point)]):
##         coincident[coincident['TrucksWorking']] = coincident[coincident['TrucksWorking']] + 1
##     else:
##         coincident[coincident['TrucksWorking']]

# =============================================================================
# I have attempted the following
# a dead end code that I cannot make work
# df = coincident[coincident['calendar_timestamp'].between(starting_point, ending_point)]
# print(df.head(n = 5))
# =============================================================================

# =============================================================================
# I have attempted the following but it will not work
# another dead end code
# for timestamp in coincident_events.calendar:
#     print(coincident_events.calendar.query('coincident_events.calendar >= emergency_events.starting_point and coincident_events.calendar <= emergency_events.ending_point'))
# =============================================================================

print(coincident.head(n = 20))

# =============================================================================
# Step 4: verify the "coincident" table is correct and then analyze the data
# Printing the "coincident" dataframe should look something like:
# =============================================================================
#     StartTime             TrucksWorking
# 0   2020-01-01 00:00:28   1
# 1   2020-01-01 00:01:28   1
# 2   2020-01-01 00:02:28   1
# 3   2020-01-01 00:03:28   1
# 4   2020-01-01 00:04:28   2
# 5   2020-01-01 00:05:28   2
# 6   2020-01-01 00:06:28   3
# 7   2020-01-01 00:07:28   3
# 8   2020-01-01 00:08:28   3
# 9   2020-01-01 00:09:28   3
# 10  2020-01-01 00:10:28   3
# etc for a full calendar year of ambulance responses

# =============================================================================
# Step 5: analyze the data looking for patterns of ambulance utilization. TBD
# =============================================================================
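On the ValueError quoted in the comments: it appears whenever a whole DataFrame is used as an `if` condition, because pandas cannot reduce many rows to a single True/False. The sketch below reproduces that with a tiny hypothetical frame (`cal_df`, `start`, `end` and `mask` are illustrative names, not from the question) and shows that the same boolean mask can instead be passed to `.loc` to update only the matching rows. It handles a single time window only.

```python
import pandas as pd

# Tiny stand-in for the question's "coincident" frame (illustrative data).
cal_df = pd.DataFrame({
    'calendar_timestamp': pd.date_range('2020-01-01 00:00:28', periods=5, freq='min'),
    'TrucksWorking': 0,
})
start = pd.Timestamp('2020-01-01 00:01:00')
end = pd.Timestamp('2020-01-01 00:03:00')

# between() returns one True/False per row, not a single boolean.
mask = cal_df['calendar_timestamp'].between(start, end)

# Using the filtered DataFrame as an `if` condition raises the ValueError.
raised = False
message = ''
try:
    if cal_df[mask]:
        pass
except ValueError as exc:
    raised = True
    message = str(exc)

# Passing the mask to .loc increments only the rows inside the window.
cal_df.loc[mask, 'TrucksWorking'] += 1
print(cal_df)
```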

1 answer

User

#1 · Posted 2024-07-02 13:41:59

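Using your data, I found the following solution. I only used the first 200 minutes of 2020, but you can easily change that by replacing periods=200 with the number of minutes in a year.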
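For reference, the number of minutes in a year can be computed rather than typed by hand. This is a minimal sketch for 2020 (a leap year); `year_start`, `year_end`, `minutes_in_year` and `df_year` are illustrative names, not part of the original answer.

```python
import pandas as pd

year_start = pd.Timestamp('2020-01-01')
year_end = pd.Timestamp('2021-01-01')

# Dividing two Timedeltas gives a plain number: the count of whole minutes.
minutes_in_year = int((year_end - year_start) / pd.Timedelta(minutes=1))

# One row per minute of 2020, with the counter initialised to 0.
df_year = pd.DataFrame({
    'time1': pd.date_range(year_start, periods=minutes_in_year, freq='min'),
})
df_year['trucks working'] = 0
print(minutes_in_year)
```

Note that looping over a frame of this size row by row will be slow; the value is mainly useful as the `periods` argument.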

I used the following variables: df corresponds to your coincident dataframe. I generate it up front, with one row per minute starting from 2020-01-01:

import pandas as pd
import datetime
df = pd.DataFrame()
df['time1'] = pd.date_range('2020-01-01 00:00:00', periods=200, freq='min')
df['trucks working'] = 0
print(df)

This gives me every minute in the range, with trucks working still at 0:

                  time1  trucks working
0   2020-01-01 00:00:00               0
1   2020-01-01 00:01:00               0
2   2020-01-01 00:02:00               0
3   2020-01-01 00:03:00               0
4   2020-01-01 00:04:00               0
..                  ...             ...
195 2020-01-01 03:15:00               0
196 2020-01-01 03:16:00               0
197 2020-01-01 03:17:00               0
198 2020-01-01 03:18:00               0
199 2020-01-01 03:19:00               0

Using your emergency calls as data:

data = [['2020-01-01 00:00:28.000','2020-01-01 00:35:28.987']
    , ['2020-01-01 00:02:34.000','2020-01-01 01:05:13.540']
    , ['2020-01-01 00:03:57.000','2020-01-01 01:14:44.537']
    , ['2020-01-01 00:06:17.000','2020-01-01 01:26:52.087']
    , ['2020-01-01 00:13:20.000','2020-01-01 01:17:31.310']
    , ['2020-01-01 00:14:01.000','2020-01-01 01:57:28.343']
    , ['2020-01-01 00:16:11.000','2020-01-01 00:39:34.967']
    , ['2020-01-01 00:22:03.000','2020-01-01 01:46:40.037']
    , ['2020-01-01 00:23:07.000','2020-01-01 00:49:25.890']
    , ['2020-01-01 00:23:19.000','2020-01-01 01:26:39.920']
    , ['2020-01-01 00:30:17.000','2020-01-01 00:38:05.000']]

I add column names and call the resulting dataframe emergency_events:

 emergency_events = pd.DataFrame(data, columns = ['StartTime', 'EndTime'])
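At this point both columns still hold strings. The question's own script converts them once with `pd.to_datetime`, which avoids re-parsing the same strings on every comparison. A minimal sketch on a two-row sample (`sample` and `events` are illustrative names):

```python
import pandas as pd

# Two of the calls from the question, still as strings.
sample = [['2020-01-01 00:00:28.000', '2020-01-01 00:35:28.987'],
          ['2020-01-01 00:02:34.000', '2020-01-01 01:05:13.540']]
events = pd.DataFrame(sample, columns=['StartTime', 'EndTime'])

# Parse each column once; afterwards the values compare as timestamps.
for col in ['StartTime', 'EndTime']:
    events[col] = pd.to_datetime(events[col])

print(events.dtypes)
```

Calling `pd.to_datetime` on a value that is already a timestamp returns it unchanged, so a loop that still wraps each value in `pd.to_datetime` keeps working after this conversion.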

Now I can iterate over the emergency_events dataframe and increment 'trucks working' for every minute that lies between a row's 'StartTime' and 'EndTime':

# for every minute in df, count the calls whose interval contains it
for index2, row2 in df.iterrows():
    for index, row in emergency_events.iterrows():
        if pd.to_datetime(row['StartTime']) <= pd.to_datetime(row2['time1']) <= pd.to_datetime(row['EndTime']):
            #print(row2['trucks working'])
            #print(row['StartTime'],row2['time1'],row['EndTime'])
            df.at[index2,'trucks working'] += 1

This gives me a dataframe with the number of trucks working at each minute:

                  time1  trucks working
0   2020-01-01 00:00:00               0
1   2020-01-01 00:01:00               1
2   2020-01-01 00:02:00               1
3   2020-01-01 00:03:00               2
4   2020-01-01 00:04:00               3
..                  ...             ...
195 2020-01-01 03:15:00               0
196 2020-01-01 03:16:00               0
197 2020-01-01 03:17:00               0
198 2020-01-01 03:18:00               0
199 2020-01-01 03:19:00               0
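The nested loop above compares every minute with every call, which may become slow for a full year. As an alternative sketch (not the method used in this answer), the same counts can be obtained with `numpy.searchsorted`: for each minute t, the number of active calls is the number of calls with StartTime <= t minus the number with EndTime < t. The names `events`, `minutes`, `starts`, `ends`, `started`, `finished` and `result` are illustrative.

```python
import numpy as np
import pandas as pd

data = [['2020-01-01 00:00:28.000', '2020-01-01 00:35:28.987'],
        ['2020-01-01 00:02:34.000', '2020-01-01 01:05:13.540'],
        ['2020-01-01 00:03:57.000', '2020-01-01 01:14:44.537'],
        ['2020-01-01 00:06:17.000', '2020-01-01 01:26:52.087'],
        ['2020-01-01 00:13:20.000', '2020-01-01 01:17:31.310'],
        ['2020-01-01 00:14:01.000', '2020-01-01 01:57:28.343'],
        ['2020-01-01 00:16:11.000', '2020-01-01 00:39:34.967'],
        ['2020-01-01 00:22:03.000', '2020-01-01 01:46:40.037'],
        ['2020-01-01 00:23:07.000', '2020-01-01 00:49:25.890'],
        ['2020-01-01 00:23:19.000', '2020-01-01 01:26:39.920'],
        ['2020-01-01 00:30:17.000', '2020-01-01 00:38:05.000']]
events = pd.DataFrame(data, columns=['StartTime', 'EndTime']).apply(pd.to_datetime)
minutes = pd.date_range('2020-01-01 00:00:00', periods=200, freq='min')

# Sort start and end times independently; only the counts matter here.
starts = np.sort(events['StartTime'].to_numpy(dtype='datetime64[ns]'))
ends = np.sort(events['EndTime'].to_numpy(dtype='datetime64[ns]'))
t = minutes.to_numpy(dtype='datetime64[ns]')

# side='right' counts calls with StartTime <= t.
started = np.searchsorted(starts, t, side='right')
# side='left' counts calls with EndTime < t (already finished).
finished = np.searchsorted(ends, t, side='left')

result = pd.DataFrame({'time1': minutes, 'trucks working': started - finished})
print(result.head())
```

The boundaries are inclusive on both sides, matching the `StartTime <= time1 <= EndTime` test in the loop.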
