从两个文件中按升序显示时间

2024-05-04 04:31:02 发布

您现在位置:Python中文网/ 问答频道 /正文

我有两个带有变量和各自时间的文件。我希望在一个数据帧中获得输出,其中变量和时间按升序排列

evt_gts               evt_id
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O

下一个文件是:

timestamp            variable   
01-07-2019 13:25:03 RefSpd  
01-07-2019 13:25:10 EffRealized 
01-07-2019 13:25:30 ABHPosition 
01-07-2019 13:25:35 LinkVolt    
01-07-2019 13:25:36 BCPress 
01-07-2019 23:18:00 speed   
01-07-2019 23:18:05 temperature 
01-07-2019 23:31:00 speed   
01-07-2019 23:31:04 temperature 
01-07-2019 23:43:00 speed   
01-07-2019 23:43:05 temperature 

预期产出为:

timestamp     variable
01-07-2019 13:25:03 RefSpd  
01-07-2019 13:25:10 EffRealized 
01-07-2019 13:25:30 ABHPosition 
01-07-2019 13:25:35 LinkVolt    
01-07-2019 13:25:36 BCPress 
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
01-07-2019 23:18:00 speed   
01-07-2019 23:18:05 temperature 
01-07-2019 23:31:00 speed   
01-07-2019 23:31:04 temperature 
01-07-2019 23:43:00 speed   
01-07-2019 23:43:05 temperature 
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O

Tags: 文件数据时间variabletimestampevtspeedtemperature
2条回答

我试图重复这个问题。希望这对你有用

import io, pandas as pd
file1 = io.StringIO('''
evt_gts               evt_id
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O
''')

file2= io.StringIO('''
timestamp            variable   
01-07-2019 13:25:03 RefSpd  
01-07-2019 13:25:10 EffRealized 
01-07-2019 13:25:30 ABHPosition 
01-07-2019 13:25:35 LinkVolt    
01-07-2019 13:25:36 BCPress 
01-07-2019 23:18:00 speed   
01-07-2019 23:18:05 temperature 
01-07-2019 23:31:00 speed   
01-07-2019 23:31:04 temperature 
01-07-2019 23:43:00 speed   
01-07-2019 23:43:05 temperature 
''')

df1 = pd.read_csv(file1, delim_whitespace=True).reset_index()
df2 = pd.read_csv(file2, delim_whitespace=True).reset_index()

df1['date'] = pd.to_datetime(df1['index'] + " " + df1['evt_gts'])
df1 = df1[['date', 'evt_id']]
df1.columns = ['date', 'variable']

df2['date'] = pd.to_datetime(df2['index'] + " " + df2['timestamp'])
df2 = df2[['date', 'variable']]

df = pd.concat([df1, df2]).sort_values('date')
print(df)

结果

                  date     variable
0  2019-01-07 13:25:03       RefSpd
1  2019-01-07 13:25:10  EffRealized
2  2019-01-07 13:25:30  ABHPosition
3  2019-01-07 13:25:35     LinkVolt
4  2019-01-07 13:25:36      BCPress
0  2019-01-07 16:42:00      976162O
1  2019-01-07 16:42:30      976162O
5  2019-01-07 23:18:00        speed
6  2019-01-07 23:18:05  temperature
7  2019-01-07 23:31:00        speed
8  2019-01-07 23:31:04  temperature
9  2019-01-07 23:43:00        speed
10 2019-01-07 23:43:05  temperature
2  2019-04-07 15:03:20      976162O
3  2019-04-07 15:03:25      976162O
4  2019-05-07 10:20:00      976162O

首先需要在两个DataFrame中设置相同的列名,以便与rename正确对齐,然后是^{},最后是按timestamp列与^{}排序:

df11 = df1.rename(columns={'evt_gts':'timestamp','evt_id':'variable'})
df = pd.concat([df11, df2], ignore_index=True).sort_values('timestamp')

如果两个数据帧中列的顺序/数量相同:

df1.columns = df2.columns
df = pd.concat([df1, df2], ignore_index=True).sort_values('timestamp')

print (df)
              timestamp     variable
5   01-07-2019 13:25:03       RefSpd
6   01-07-2019 13:25:10  EffRealized
7   01-07-2019 13:25:30  ABHPosition
8   01-07-2019 13:25:35     LinkVolt
9   01-07-2019 13:25:36      BCPress
0   01-07-2019 16:42:00      976162O
1   01-07-2019 16:42:30      976162O
10  01-07-2019 23:18:00        speed
11  01-07-2019 23:18:05  temperature
12  01-07-2019 23:31:00        speed
13  01-07-2019 23:31:04  temperature
14  01-07-2019 23:43:00        speed
15  01-07-2019 23:43:05  temperature
2   04-07-2019 15:03:20      976162O
3   04-07-2019 15:03:25      976162O
4   05-07-2019 10:20:00      976162O

编辑:

如果两个文件中都有分隔符空格,则解决方案有点改变-想法是将列转换为read_csv中的datetimes,并通过参数header=Nonekiprows=1忽略标题:

import pandas as pd
from io import StringIO

temp="""evt_gts evt_id
01-07-2019 16:42:00 976162O
01-07-2019 16:42:30 976162O
04-07-2019 15:03:20 976162O
04-07-2019 15:03:25 976162O
05-07-2019 10:20:00 976162O"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename1.csv'
df1 = pd.read_csv(StringIO(temp), sep="\s+", header=None, skiprows=1, parse_dates=[[0,1]])
print (df1)
                  0_1        2
0 2019-01-07 16:42:00  976162O
1 2019-01-07 16:42:30  976162O
2 2019-04-07 15:03:20  976162O
3 2019-04-07 15:03:25  976162O
4 2019-05-07 10:20:00  976162O

temp="""timestamp variable
01-07-2019 13:25:03 RefSpd
01-07-2019 13:25:10 EffRealized
01-07-2019 13:25:30 ABHPosition
01-07-2019 13:25:35 LinkVolt
01-07-2019 13:25:36 BCPress
01-07-2019 23:18:00 speed
01-07-2019 23:18:05 temperature
01-07-2019 23:31:00 speed
01-07-2019 23:31:04 temperature
01-07-2019 23:43:00 speed
01-07-2019 23:43:05 temperature"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename2.csv'
df2 = pd.read_csv(StringIO(temp), sep="\s+", header=None, skiprows=1, parse_dates=[[0,1]])
print (df2)
                   0_1            2
0  2019-01-07 13:25:03       RefSpd
1  2019-01-07 13:25:10  EffRealized
2  2019-01-07 13:25:30  ABHPosition
3  2019-01-07 13:25:35     LinkVolt
4  2019-01-07 13:25:36      BCPress
5  2019-01-07 23:18:00        speed
6  2019-01-07 23:18:05  temperature
7  2019-01-07 23:31:00        speed
8  2019-01-07 23:31:04  temperature
9  2019-01-07 23:43:00        speed
10 2019-01-07 23:43:05  temperature

df = pd.concat([df1, df2], ignore_index=True).sort_values('0_1')
df.columns = ['timestamp', 'variable']
print (df)
             timestamp     variable
5  2019-01-07 13:25:03       RefSpd
6  2019-01-07 13:25:10  EffRealized
7  2019-01-07 13:25:30  ABHPosition
8  2019-01-07 13:25:35     LinkVolt
9  2019-01-07 13:25:36      BCPress
0  2019-01-07 16:42:00      976162O
1  2019-01-07 16:42:30      976162O
10 2019-01-07 23:18:00        speed
11 2019-01-07 23:18:05  temperature
12 2019-01-07 23:31:00        speed
13 2019-01-07 23:31:04  temperature
14 2019-01-07 23:43:00        speed
15 2019-01-07 23:43:05  temperature
2  2019-04-07 15:03:20      976162O
3  2019-04-07 15:03:25      976162O
4  2019-05-07 10:20:00      976162O

相关问题 更多 >