Regex repeating pattern

Posted 2024-09-30 16:28:01


I am trying to use a regular expression to capture groups of data from the log below. The pattern is

<item> : <key> = <value> , <key> = <value>, ..., <key> = <value>

([#\w\d]*?)[\s]*=[\s]*([.\w\d]*) captures the <key> and <value> groups, but I also want to capture the <item> group, so I wrapped the above in a group and repeated it with {n}.


20141207,07:15:52,0,>>RATIO: casher#=30, Value=2.579,Units=ratio,Error=N
20141207,07:15:52,0,>>RATIO: casher#=31, Value=4.509,Units=ratio,Error=N
20141207,07:15:52,0,>>RATIO: casher#=32, Value=3.735,Units=ratio,Error=N
20141207,07:15:52,0,>>RATIO: casher#=33, Value=2.401,Units=ratio,Error=N

20141207,07:15:52,0,>>CUSTOMER: casher#=30, Value=50,Units= count
20141207,07:15:52,0,>>CUSTOMER: casher#=31, Value=6,Units= count
20141207,07:15:52,0,>>CUSTOMER: casher#=32, Value=88,Units= count
20141207,07:15:52,0,>>CUSTOMER: casher#=33, Value=33,Units= count


Clearly, the result is not what I expected. Can anyone give me some hints? I will eventually translate this into Python code. Thank you.
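A likely cause, assuming the lost pattern wrapped the key/value group in a larger group repeated with {n}: Python's re module keeps only the last repetition of a capturing group, so earlier key/value pairs are overwritten. The sketch below shows this on one log line, then a common workaround that matches the item once and collects the pairs with re.findall. The {4} pattern is a hypothetical stand-in, not the asker's original.

```python
import re

line = "20141207,07:15:52,0,>>RATIO: casher#=30, Value=2.579,Units=ratio,Error=N"

# Hypothetical pattern: the item, then the key/value group repeated four times.
repeated = re.compile(r">>(\w+):(?:\s*([#\w]+)\s*=\s*([.\w]+),?){4}")
m = repeated.search(line)
# Only the last repetition's key and value are kept in the groups.
print(m.groups())

# Workaround: match the item once, then scan the rest of the line for pairs.
item = re.search(r">>(\w+):", line)
pairs = re.findall(r"([#\w]+)\s*=\s*([.\w]+)", line[item.end():])
print(item.group(1), pairs)
```

The second approach does not need to know in advance how many pairs a line contains.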


2 answers

Your file is a CSV file, so it is easier to use the csv module:

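import csv

# newline='' is what the csv module expects for files opened in text mode
with open('data.txt', newline='') as f:
    for row in csv.reader(f, delimiter=','):
        if row:
            # row[3] looks like '>>RATIO: casher#=30'
            item, key_and_val = row[3].split(':')
            item = item[2:]  # drop the leading '>>'
            key, val = key_and_val.split('=')

            print(item)
            print('    {} => {}'.format(key.strip(), val.strip()))

            # the remaining fields are plain 'key=value' pairs
            for key_and_val in row[4:]:
                key, val = key_and_val.split('=')
                print('    {} => {}'.format(key.strip(), val.strip()))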

Output:
RATIO
    casher# => 30
    Value => 2.579
    Units => ratio
    Error => N
RATIO
    casher# => 31
    Value => 4.509
    Units => ratio
    Error => N
RATIO
    casher# => 32
    Value => 3.735
    Units => ratio
    Error => N
RATIO
    casher# => 33
    Value => 2.401
    Units => ratio
    Error => N
CUSTOMER
    casher# => 30
    Value => 50
    Units => count
CUSTOMER
    casher# => 31
    Value => 6
    Units => count
CUSTOMER
    casher# => 32
    Value => 88
    Units => count
CUSTOMER
    casher# => 33
    Value => 33
    Units => count

Your matching pattern also matches key=value even when the "item :" part does not exist. Is there any advanced technique to exclude those key = value lines?

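Lines that have no item can be skipped, for example by checking that the fourth field starts with '>>' before splitting it.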
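A minimal sketch of skipping lines without an item, under stated assumptions: the item always sits in the fourth comma-separated field and starts with '>>'. The helper name parse_rows and the middle sample line (which has no item) are made up for illustration; the other two lines come from the log in the question.

```python
import csv
import io

# Two lines from the log plus one made-up line without an item (assumption).
sample = io.StringIO(
    "20141207,07:15:52,0,>>RATIO: casher#=30, Value=2.579,Units=ratio,Error=N\n"
    "20141207,07:15:52,0,Value=1,Units=ratio\n"
    "20141207,07:15:52,0,>>CUSTOMER: casher#=30, Value=50,Units= count\n"
)

def parse_rows(stream):
    """Yield (item, {key: value}) for rows whose 4th field is '>>ITEM: ...'."""
    for row in csv.reader(stream, delimiter=','):
        # Skip blank rows and rows without the '>>ITEM:' marker.
        if len(row) < 4 or not row[3].startswith('>>') or ':' not in row[3]:
            continue
        item, first_pair = row[3][2:].split(':', 1)
        pairs = {}
        # The first pair shares a field with the item; the rest follow it.
        for key_and_val in [first_pair] + row[4:]:
            key, val = key_and_val.split('=', 1)
            pairs[key.strip()] = val.strip()
        yield item, pairs

records = list(parse_rows(sample))
for item, pairs in records:
    print(item, pairs)
```

Collecting each row into a dict instead of printing it directly makes the result easier to reuse later.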
(?<=>>)(\w+):|([\w#]+)\s*=\s*(\S+?)(?:,|\s)

Try this and grab the captures. See the demo.

https://regex101.com/r/fA6wE2/1
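Since the asker plans to move this to Python, here is a hedged sketch of how the pattern above might be used with re.finditer. One change is an assumption of this sketch, not part of the answer: `|$` is added to the final group so that the last pair on a line with no trailing newline is still matched. The function name parse_line is hypothetical.

```python
import re

# Pattern from the answer, with `|$` added (an assumption) so the last pair
# on a line without a trailing newline is also matched.
PATTERN = re.compile(r"(?<=>>)(\w+):|([\w#]+)\s*=\s*(\S+?)(?:,|\s|$)")

def parse_line(line):
    """Return (item, {key: value}) for one log line; item is None if absent."""
    item, pairs = None, {}
    for m in PATTERN.finditer(line):
        if m.group(1) is not None:
            # First alternative matched: the item name after '>>'.
            item = m.group(1)
        else:
            # Second alternative matched: one key/value pair.
            pairs[m.group(2)] = m.group(3)
    return item, pairs

line = "20141207,07:15:52,0,>>CUSTOMER: casher#=30, Value=50,Units= count"
result = parse_line(line)
print(result)
```

Because the two alternatives are independent, a line without an item still yields its pairs; checking whether the returned item is None is one way to filter those lines out.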

