How do I parse log data that has no delimiters?

Posted 2024-10-01 04:51:18

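I need to analyze logs that I receive from someone else. Going through each log by hand is very time-consuming, so I am planning to write a script with Python and pandas to automate the work. However, the payload bytes are run together with no delimiter between them, so I can't parse them.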

The log looks like this:

14:34:41: [REQ][LS1]->[TUT2] [12]FF00000000000000000088DD (Message1)
14:34:41: [REQ][TUT2]->[LS1] [09]5203000C0C0C0C0E0E (Message2)
14:34:49: [REQ][LS1]->[TUT2] [12]FF00000000000000000088DD (Message1)
14:34:49: [REQ][TUT2]->[LS1] [09]5203000C0C0C0C0E0E (Message2)
14:34:56: [REQ][LS1]->[TUT2] [12]FF00000000000000000088DD (Message1)
14:34:57: [REQ][TUT2]->[LS1] [09]5203000C0C0C0C0E0E (Message2)
14:35:04: [REQ][LS1]->[TUT2] [12]FF00000000000000000088DD (Message1)
14:35:05: [REQ][TUT2]->[LS1] [09]5203000C0C0C0C0E0E (Message2)
14:35:05: [REQ][TUT2]->[000] [25]DB03FFFFFF7F00000000FF7F0000FF7F00FA0FF90F00000000 (Debug Message)

I need output like this:

FF 00 00 00 00 00 00 00 00 00 88 DD
52 03 00 0C 0C 0C 0C 0E 0E
FF 00 00 00 00 00 00 00 00 00 88 DD
52 03 00 0C 0C 0C 0C 0E 0E
FF 00 00 00 00 00 00 00 00 00 88 DD
52 03 00 0C 0C 0C 0C 0E 0E
FF 00 00 00 00 00 00 00 00 00 88 DD
52 03 00 0C 0C 0C 0C 0E 0E
DB 03 FF FF FF 7F 00 00 00 00 FF 7F 00 00 FF 7F 00 FA 0F F9 0F 00 00 00 00

so that I can analyze the data.

I am using the following code to read the file:

import pandas as pd

# Read the log file, splitting each line on single spaces
filename = "file.txt"
df = pd.read_table(filename, sep=' ',
                   names=['Time', 'Src-Dst', 'Data', 'Type', 'Remarks'],
                   engine='python', header=None)
df.head()

But I can't figure out how to split a value like this into separate columns:

[12]2A00000000000000000088DD

Can anyone help me?


1 answer
User
#1 · Posted 2024-10-01 04:51:18

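Use pd.Series.str.findall. The slice str[4:] drops the 4-character "[NN]" prefix, and the pattern (.{2}) then splits the rest into 2-character chunks: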

df['Data'].str[4:].str.findall('(.{2})')

Output:

0     [FF, 00, 00, 00, 00, 00, 00, 00, 00, 00, 88, DD]
1                 [52, 03, 00, 0C, 0C, 0C, 0C, 0E, 0E]
2     [FF, 00, 00, 00, 00, 00, 00, 00, 00, 00, 88, DD]
3                 [52, 03, 00, 0C, 0C, 0C, 0C, 0E, 0E]
4     [FF, 00, 00, 00, 00, 00, 00, 00, 00, 00, 88, DD]
5                 [52, 03, 00, 0C, 0C, 0C, 0C, 0E, 0E]
6     [FF, 00, 00, 00, 00, 00, 00, 00, 00, 00, 88, DD]
7                 [52, 03, 00, 0C, 0C, 0C, 0C, 0E, 0E]
8    [DB, 03, FF, FF, FF, 7F, 00, 00, 00, 00, FF, 7...
Name: Data, dtype: object
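
If you want the exact space-separated text shown in the question rather than lists, one option is to join each list with str.join. This is only a minimal sketch: the Series below is a hand-copied sample of the Data column from the question, and the name spaced is introduced here for illustration.

```python
import pandas as pd

# Sample values copied by hand from the 'Data' column in the question.
data = pd.Series([
    "[12]FF00000000000000000088DD",
    "[09]5203000C0C0C0C0E0E",
], name="Data")

# Drop the 4-character "[NN]" prefix, split the rest into 2-character
# chunks, then join each resulting list with single spaces.
spaced = data.str[4:].str.findall(r".{2}").str.join(" ")

for line in spaced:
    print(line)
```

Note that this relies on the prefix always being exactly four characters wide, as it is in the sample log.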

If you want the result as a DataFrame, build a new one from the lists. Shorter rows are padded with None:

s = df['Data'].str[4:].str.findall('(.{2})')
pd.DataFrame(list(s))

Output:

   0   1   2   3   4   5   6   7   8     9   ...     15    16    17    18  \
0  FF  00  00  00  00  00  00  00  00    00  ...   None  None  None  None   
1  52  03  00  0C  0C  0C  0C  0E  0E  None  ...   None  None  None  None   
2  FF  00  00  00  00  00  00  00  00    00  ...   None  None  None  None   
3  52  03  00  0C  0C  0C  0C  0E  0E  None  ...   None  None  None  None   
4  FF  00  00  00  00  00  00  00  00    00  ...   None  None  None  None   
5  52  03  00  0C  0C  0C  0C  0E  0E  None  ...   None  None  None  None   
6  FF  00  00  00  00  00  00  00  00    00  ...   None  None  None  None   
7  52  03  00  0C  0C  0C  0C  0E  0E  None  ...   None  None  None  None   
8  DB  03  FF  FF  FF  7F  00  00  00    00  ...     7F    00    FA    0F   

     19    20    21    22    23    24  
0  None  None  None  None  None  None  
1  None  None  None  None  None  None  
2  None  None  None  None  None  None  
3  None  None  None  None  None  None  
4  None  None  None  None  None  None  
5  None  None  None  None  None  None  
6  None  None  None  None  None  None  
7  None  None  None  None  None  None  
8    F9    0F    00    00    00    00  
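
As an alternative to splitting on spaces (which breaks a remark such as "(Debug Message)" across two columns), the whole line could be parsed with one regular expression via str.extract. The sketch below is not part of the answer above. It assumes every line follows the layout seen in the sample log, and it assumes the bracketed number before the payload is a decimal byte count, which the sample is consistent with but the question does not confirm. The column names Type, Src, Dst, Len, Hex, Bytes and LenOK are made up for illustration.

```python
import pandas as pd

# Three lines copied from the sample log in the question.
log_text = """\
14:34:41: [REQ][LS1]->[TUT2] [12]FF00000000000000000088DD (Message1)
14:34:41: [REQ][TUT2]->[LS1] [09]5203000C0C0C0C0E0E (Message2)
14:35:05: [REQ][TUT2]->[000] [25]DB03FFFFFF7F00000000FF7F0000FF7F00FA0FF90F00000000 (Debug Message)
"""

# One named group per field; the layout is assumed from the sample only.
pattern = (
    r"^(?P<Time>\d{2}:\d{2}:\d{2}): "
    r"\[(?P<Type>\w+)\]\[(?P<Src>\w+)\]->\[(?P<Dst>\w+)\] "
    r"\[(?P<Len>\d+)\](?P<Hex>[0-9A-Fa-f]+) "
    r"\((?P<Remarks>.*)\)$"
)

lines = pd.Series(log_text.splitlines())
parsed = lines.str.extract(pattern)

# Assumption: the bracketed prefix is the payload length in bytes.
parsed["Len"] = parsed["Len"].astype(int)
parsed["Bytes"] = parsed["Hex"].str.findall(r".{2}")
parsed["LenOK"] = parsed["Bytes"].str.len() == parsed["Len"]

print(parsed[["Time", "Src", "Dst", "Len", "Remarks", "LenOK"]])
```

Lines that do not match the pattern come back as rows of NaN, so they are easy to spot and filter out before the Len conversion if a real log contains other line formats.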
