Pyparsing Forward() grammar recursion

Posted 2024-10-04 03:27:43


I'm using Pyparsing to parse a log file whose blocks look like this:

keyName0:                                     foo
keyName1:                                     bar
msgKey [Read]:                                21 FA 00 34
msgKey [Read]:
  MESSAGE 1 of 2
    keyName0:                                 keyValue0
    keyName1:                                 keyValue1
    Flags1:                                   No Flags Set
    Flags1:                                   0
    Flags2:                                   No Flags Set
    Flags2:                                   0
    keyName6:                                 $12AB34CD56EF (123456789)
    keyName7:                                 7
    keyName8:                                 7
    Data [Read]:                              00 01 02 03    04 05 06 07    08 09 10 11    12 13 14 15
                                              20 21 22 23    24 25 26 27    28 29 30 31    32 33 34 35
                                              36 37 38

msgKey [Read]:                                01 02 03 04
msgKey [Read]:
  MESSAGE 2 of 2
    # same structure as message above

keyName3:                                     keyValue3
keyName4 [IN]:                                keyValue4 (123 IN)
keyName4 [OUT]:                               keyValue4 (123 OUT)

I wrote a grammar for the keyName/value lines:

^{pr2}$

This grammar works for each individual line. Now I'm trying to use it to describe the grammar for the whole of the test data:

message = Forward()
key_line = lineEnd + OneOrMore(Word(printables_no_column)).setParseAction(' '.join).setResultsName('keyName') + Suppress(':') \
       + MatchFirst(message, OneOrMore(Word(printables_no_column),stopOn=lineEnd).setParseAction(' '.join).setResultsName('keyValue'))
key_lines = ZeroOrMore(Group(key_line)).setResultsName('keys')
message << Literal('MESSAGE') + number + Literal('of') \
           + number.setResultsName('totalMsgs') + key_lines

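However, I think this grammar ends up in infinite recursion. I need help understanding how to use Forward() correctly for a recursive grammar. Thanks in advance!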


1 answer
User
#1 · Posted 2024-10-04 03:27:43

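This should get you a bit further. Overall it probably still needs better structure, but I think the basic pieces are here. See the embedded comments: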

import pyparsing as pp

# your original expression - x.setResultsName("x") can now be written just x("x")
# key_line = (lineEnd
#             + OneOrMore(Word(printables_no_column)).setParseAction(' '.join)('keyName')
#             + Suppress(':')
#             + OneOrMore(Word(printables_no_column), stopOn=lineEnd).setParseAction(' '.join)('keyValue'))

# literals in your grammar will be suppressed by default
pp.ParserElement.inlineLiteralsUsing(pp.Suppress)

integer = pp.pyparsing_common.integer
hex_byte = pp.Word(pp.hexnums, exact=2)

# read everything up to ':' -  a little risky to define a Word including spaces, may want to revisit and
# explicitly parse bits, to detect "[IN]" vs "[OUT]", etc.
key_name_expr = pp.Word(pp.printables + " ", excludeChars=':')
key_line = pp.Group(key_name_expr("key_name") + ':'
                    + ~pp.lineEnd()  # make sure key value is on this same line
                    + pp.empty()     # handy trick to advance past white space
                    + pp.restOfLine()('key_value'))

# special key_line to read data bytes
data_body = "Data [Read]:" + pp.OneOrMore(hex_byte)

msg_body = ("msgKey [Read]:" + pp.lineEnd()
            + "MESSAGE" + integer("message_num") + "of" + integer("total_msgs")
            + pp.OneOrMore(pp.Group(key_line)("params*"), stopOn=data_body)
            + data_body("data"))

msg_expr = (pp.OneOrMore(pp.LineStart() + pp.Group(key_line)("params*"), stopOn=msg_body)
            + pp.Optional(pp.Group(msg_body)("body")))
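
To see what the ~pp.lineEnd() and pp.empty() pieces of key_line are doing, here is a small sketch. It assumes a simplified key definition (key_name as plain alphanumerics) and a helper make_key_line, neither of which is part of the grammar above; they are only there for illustration.

```python
import pyparsing as pp

# Simplified, hypothetical key definition (the grammar above allows more characters).
key_name = pp.Word(pp.alphanums)

def make_key_line(same_line_only):
    """Build a key/value line parser, with or without the same-line guard."""
    # ~LineEnd() is a negative lookahead: fail if the line ends right after ':'.
    guard = ~pp.LineEnd() if same_line_only else pp.Empty()
    return pp.Group(key_name("key_name") + pp.Suppress(":")
                    + guard
                    + pp.Empty()                  # skips whitespace before the value
                    + pp.restOfLine("key_value"))

strict = make_key_line(same_line_only=True)
loose = make_key_line(same_line_only=False)

# A key with nothing after the colon, followed by an indented line.
block = "msgKey:\n  MESSAGE 1 of 2"

plain_value = strict.parseString("keyName0:      foo")[0].key_value
stolen_value = loose.parseString(block)[0].key_value   # value taken from the next line

try:
    strict.parseString(block)
    strict_matched = True
except pp.ParseException:
    strict_matched = False   # the guard rejected a key with no value on its own line

print(plain_value, "|", stolen_value, "|", strict_matched)
```

Without the guard, whitespace skipping runs past the newline and restOfLine picks up the following line as the value, which is why the lookahead comes before pp.empty().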

Use searchString to find the matching blocks:

^{pr2}$
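
As a rough sketch of this step (an illustration, not the original snippet): searchString() scans a string and returns one result per place where the expression matches, and named results can then be read as attributes. The expression simple_msg and the text sample below are simplified stand-ins for msg_expr and the log data above.

```python
import pyparsing as pp

integer = pp.pyparsing_common.integer
hex_byte = pp.Word(pp.hexnums, exact=2)

# Simplified stand-in for the full message grammar: header plus data bytes only.
simple_msg = (pp.Suppress("MESSAGE") + integer("message_num")
              + pp.Suppress("of") + integer("total_msgs")
              + pp.Suppress("Data [Read]:")
              + pp.Group(pp.OneOrMore(hex_byte))("data"))

# Hypothetical sample text in the same shape as the log in the question.
sample = """\
MESSAGE 1 of 2
    Data [Read]:      00 01 02 03
                      04 05
MESSAGE 2 of 2
    Data [Read]:      0A 0B
"""

summaries = []
for msg in simple_msg.searchString(sample):
    print(msg.dump())   # shows the tokens and the named results
    print("Msg {}/{}".format(msg.message_num, msg.total_msgs))
    summaries.append((msg.message_num, msg.total_msgs, msg.data.asList()))
```

Because the data bytes are wrapped in a Group, msg.data comes back as its own list even when the bytes continue over several lines.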

Prints (excerpt):

[[['keyName1', '2']], [['msgKey [Read]', '21 FA 00 34']], ['\n', 1, 2, [['keyName0', 'keyValue0']], [['keyName1', 'keyValue1']], [['Flags1', 'No Flags Set']], [['Flags1', '0']], [['Flags2', 'No Flags Set']], [['Flags2', '0']], [['keyName6', '$12AB34CD56EF (123456789)']], [['keyName7', '7']], [['keyName8', '7']], '00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38']]
- body: ['\n', 1, 2, [['keyName0', 'keyValue0']], [['keyName1', 'keyValue1']], [['Flags1', 'No Flags Set']], [['Flags1', '0']], [['Flags2', 'No Flags Set']], [['Flags2', '0']], [['keyName6', '$12AB34CD56EF (123456789)']], [['keyName7', '7']], [['keyName8', '7']], '00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38']
  - data: ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38']
  - message_num: 1
  - params: [[['keyName0', 'keyValue0']], [['keyName1', 'keyValue1']], [['Flags1', 'No Flags Set']], [['Flags1', '0']], [['Flags2', 'No Flags Set']], [['Flags2', '0']], [['keyName6', '$12AB34CD56EF (123456789)']], [['keyName7', '7']], [['keyName8', '7']]]
    [0]:
      [['keyName0', 'keyValue0']]
      [0]:
        ['keyName0', 'keyValue0']
        - key_name: 'keyName0'
        - key_value: 'keyValue0'
    [1]:
      [['keyName1', 'keyValue1']]
      [0]:
        ['keyName1', 'keyValue1']
        - key_name: 'keyName1'
        - key_value: 'keyValue1'
    [2]:
      [['Flags1', 'No Flags Set']]
      [0]:
        ['Flags1', 'No Flags Set']
        - key_name: 'Flags1'
        - key_value: 'No Flags Set'
    [3]:
      [['Flags1', '0']]
      [0]:
        ['Flags1', '0']
        - key_name: 'Flags1'
        - key_value: '0'
    [4]:
      [['Flags2', 'No Flags Set']]
      [0]:
        ['Flags2', 'No Flags Set']
        - key_name: 'Flags2'
        - key_value: 'No Flags Set'
    [5]:
      [['Flags2', '0']]
      [0]:
        ['Flags2', '0']
        - key_name: 'Flags2'
        - key_value: '0'
    [6]:
      [['keyName6', '$12AB34CD56EF (123456789)']]
      [0]:
        ['keyName6', '$12AB34CD56EF (123456789)']
        - key_name: 'keyName6'
        - key_value: '$12AB34CD56EF (123456789)'
    [7]:
      [['keyName7', '7']]
      [0]:
        ['keyName7', '7']
        - key_name: 'keyName7'
        - key_value: '7'
    [8]:
      [['keyName8', '7']]
      [0]:
        ['keyName8', '7']
        - key_name: 'keyName8'
        - key_value: '7'
  - total_msgs: 2
- params: [[['keyName1', '2']], [['msgKey [Read]', '21 FA 00 34']]]
  [0]:
    [['keyName1', '2']]
    [0]:
      ['keyName1', '2']
      - key_name: 'keyName1'
      - key_value: '2'
  [1]:
    [['msgKey [Read]', '21 FA 00 34']]
    [0]:
      ['msgKey [Read]', '21 FA 00 34']
      - key_name: 'msgKey [Read]'
      - key_value: '21 FA 00 34'
Msg 1/2
['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38']
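
The grammar above gets by without Forward(). Forward() is only needed when an expression has to refer to itself, for example a value that may contain further values nested to any depth. A minimal sketch of that pattern, using a made-up nested-list syntax rather than the log format from the question:

```python
import pyparsing as pp

LBRACK, RBRACK, COMMA = map(pp.Suppress, "[],")
integer = pp.pyparsing_common.integer

# Declare the placeholder first so it can be used before it is defined.
value = pp.Forward()

# A list consumes '[' before it ever re-enters `value`, so the recursion terminates.
items = value + pp.ZeroOrMore(COMMA + value)
nested_list = pp.Group(LBRACK + pp.Optional(items) + RBRACK)

# Fill in the placeholder: a value is an integer or another nested list.
value <<= integer | nested_list

parsed = value.parseString("[1, [2, 3], []]", parseAll=True).asList()
print(parsed)
```

Two things to watch for: by default, every recursive path should consume some input before it reaches the Forward again (a rule that begins with itself recurses without end), and `<<=` is safer than `<<`, because `fwd << a | b` is evaluated as `(fwd << a) | b`.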
