Pyparsing fails to parse multiple rules

Posted 2024-10-06 19:27:07


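I am trying to build a Boolean query parser with some special rules, such as adjacency and proximity operators. The rules I have created so far are: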

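from pyparsing import (CaselessLiteral, Combine, OneOrMore, Optional,
                       StringEnd, Word, infixNotation, nums, opAssoc,
                       printables, quotedString, removeQuotes, replaceWith)

## DEFINITIONS OF SYMBOLS ###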
NEAR = CaselessLiteral('near').suppress()
NUMBER = Word(nums)
NONEDIRECTIONAL = Combine(NEAR+NUMBER)
ADJ = CaselessLiteral("ADJ").setParseAction(replaceWith('0'))
OAND = CaselessLiteral("and")
OOR = CaselessLiteral("or")
ONOT = CaselessLiteral("not")

## ----------------------- ##
## DEFINITIONS OF TERMS ###
# Do not break quoted string.
QUOTED = quotedString.setParseAction(removeQuotes)

# space-separated words are easiest to define using just OneOrMore
# must use a negative lookahead for and/not/or operators, and this must come
# at the beginning of the expression
WORDWITHSPACE = OneOrMore(~(OAND | ONOT | OOR | NONEDIRECTIONAL | ADJ) +
                          Word(printables, excludeChars="()"))

# use a parse action to recombine words into a single string
WORDWITHSPACE.addParseAction(lambda t: ' '.join(t))

TERM = (QUOTED | WORDWITHSPACE)
## ----------------------- ##
## DEFINITIONS OF EXPRESSIONS ###

EXPRESSION = infixNotation(TERM,
                           [
                               (ADJ, 2, opAssoc.LEFT),
                               (NONEDIRECTIONAL, 2, opAssoc.LEFT),
                               (ONOT, 1, opAssoc.RIGHT),
                               (Optional(OAND, default='and'), 2, opAssoc.LEFT),
                               (OOR, 2, opAssoc.LEFT)
                           ])
# As we can have more than one occurrence of symbols together, we are
# using a `OneOrMore` expression

BOOLQUERY = OneOrMore(EXPRESSION) + StringEnd()
## ----------------------- ##

When I run it on

((a or b) and (b and c)) or (a and d)

it works fine.

But when I try to parse

((((smart ADJ contract*) and agreement) or (enforced near3 without near3 interaction) or (automated ADJ escrow)) or ((protocol* or Consensus ADJ algorithm) near5 (agreement and transaction)))

it cannot handle it.

Can anyone help me find out what is going wrong?

Updated code:

EXPRESSION = infixNotation(TERM,
                           [
                               (ONOT, 1, opAssoc.RIGHT),
                               (Optional(OAND, default='and'), 2, opAssoc.LEFT),
                               ((OOR | NONEDIRECTIONAL | ADJ), 2, opAssoc.LEFT)
                           ])

because of cases such as

x not y not z


1 answer
User
#1 · Posted 2024-10-06 19:27:07

Your program takes a very long time because your infixNotation is 5 levels deep and includes an optional AND operator.
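To illustrate why depth matters, here is a minimal pure-Python sketch of a backtracking recognizer with five precedence levels. It is a simplified model, not pyparsing's actual implementation: the operator characters in `OPS` and the function `count_calls` are hypothetical, and each level handles at most one operator. Every level first tries "operand, operator, operand" and, if that fails, re-parses the operand from scratch, so the work multiplies with each level and each pair of parentheses. Caching results per (level, position), which is the idea behind packrat parsing, removes the repeated work.

```python
# Toy model only: one single-character operator per precedence level,
# tightest-binding level first. These operators are hypothetical.
OPS = ["~", "*", "+", "&", "|"]


def count_calls(text, memoize):
    """Recognise `text`; return (end_position, number_of_rule_evaluations)."""
    cache = {}
    evaluations = 0

    def parse(level, pos):
        nonlocal evaluations
        key = (level, pos)
        if memoize and key in cache:
            return cache[key]  # packrat idea: reuse the earlier result
        evaluations += 1
        end = evaluate(level, pos)
        if memoize:
            cache[key] = end
        return end

    def evaluate(level, pos):
        if level == 0:  # atom: a single letter or "(" expression ")"
            if text.startswith("(", pos):
                inner = parse(len(OPS), pos + 1)
                if inner != -1 and text.startswith(")", inner):
                    return inner + 1
                return -1
            if pos < len(text) and text[pos].isalpha():
                return pos + 1
            return -1
        # Alternative 1: lower-level operand, operator, lower-level operand.
        left = parse(level - 1, pos)
        if left != -1 and text.startswith(OPS[level - 1], left):
            right = parse(level - 1, left + 1)
            if right != -1:
                return right
        # Alternative 2: backtrack and parse the lower level again.
        return parse(level - 1, pos)

    return parse(len(OPS), 0), evaluations


text = "((a|b))"
plain_end, plain_evals = count_calls(text, memoize=False)
memo_end, memo_evals = count_calls(text, memoize=True)
print("without cache:", plain_evals, "with cache:", memo_evals)
```

Both runs accept the same input, but the uncached run evaluates far more rules than the cached one, and the gap widens with every extra level of nesting.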

I was able to run it unchanged simply by enabling packrat parsing. To do so, add the following at the top of your script (after importing pyparsing):

from pyparsing import ParserElement
ParserElement.enablePackrat()
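As a self-contained illustration of where that call goes, here is a minimal sketch with a cut-down grammar. The names `operand` and `expr` are hypothetical and the grammar is much smaller than the one in the question; it assumes pyparsing is installed and uses the same camelCase names as the question.

```python
import pyparsing as pp

# Enable packrat (memoizing) parsing once, right after the import and
# before any expression is used to parse input.
pp.ParserElement.enablePackrat()

# Hypothetical, cut-down grammar for illustration only.
operand = pp.Word(pp.alphas)
expr = pp.infixNotation(
    operand,
    [
        (pp.CaselessKeyword("not"), 1, pp.opAssoc.RIGHT),
        (pp.CaselessKeyword("and"), 2, pp.opAssoc.LEFT),
        (pp.CaselessKeyword("or"), 2, pp.opAssoc.LEFT),
    ],
)

# parseAll=True requires the whole input to match, which is the same
# check that appending StringEnd() to the grammar performs.
result = expr.parseString("(a or b) and not c", parseAll=True)
print(result.asList())
```

This sketch uses `CaselessKeyword` rather than `CaselessLiteral` so that an operator only matches as a whole word; that is a choice made for this example, not something the question's grammar does.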

To run the tests, I used runTests. It is not clear to me why BOOLQUERY is needed, since you are just parsing expressions:

tests = """\
((a or b) and (b and c)) or (a and d)
((((smart ADJ contract*) and agreement) or (enforced near3 without near3 interaction) or (automated ADJ escrow)) or ((protocol* or Consensus ADJ algorithm) near5 (agreement and transaction)))
"""
EXPRESSION.runTests(tests)

This gives:

((a or b) and (b and c)) or (a and d)
[[[['a', 'or', 'b'], 'and', ['b', 'and', 'c']], 'or', ['a', 'and', 'd']]]
[0]:
  [[['a', 'or', 'b'], 'and', ['b', 'and', 'c']], 'or', ['a', 'and', 'd']]
  [0]:
    [['a', 'or', 'b'], 'and', ['b', 'and', 'c']]
    [0]:
      ['a', 'or', 'b']
    [1]:
      and
    [2]:
      ['b', 'and', 'c']
  [1]:
    or
  [2]:
    ['a', 'and', 'd']


((((smart ADJ contract*) and agreement) or (enforced near3 without near3 interaction) or (automated ADJ escrow)) or ((protocol* or Consensus ADJ algorithm) near5 (agreement and transaction)))
[[[[['smart', '0', 'contract*'], 'and', 'agreement'], 'or', ['enforced', '3', 'without', '3', 'interaction'], 'or', ['automated', '0', 'escrow']], 'or', [['protocol*', 'or', ['Consensus', '0', 'algorithm']], '5', ['agreement', 'and', 'transaction']]]]
[0]:
  [[[['smart', '0', 'contract*'], 'and', 'agreement'], 'or', ['enforced', '3', 'without', '3', 'interaction'], 'or', ['automated', '0', 'escrow']], 'or', [['protocol*', 'or', ['Consensus', '0', 'algorithm']], '5', ['agreement', 'and', 'transaction']]]
  [0]:
    [[['smart', '0', 'contract*'], 'and', 'agreement'], 'or', ['enforced', '3', 'without', '3', 'interaction'], 'or', ['automated', '0', 'escrow']]
    [0]:
      [['smart', '0', 'contract*'], 'and', 'agreement']
      [0]:
        ['smart', '0', 'contract*']
      [1]:
        and
      [2]:
        agreement
    [1]:
      or
    [2]:
      ['enforced', '3', 'without', '3', 'interaction']
    [3]:
      or
    [4]:
      ['automated', '0', 'escrow']
  [1]:
    or
  [2]:
    [['protocol*', 'or', ['Consensus', '0', 'algorithm']], '5', ['agreement', 'and', 'transaction']]
    [0]:
      ['protocol*', 'or', ['Consensus', '0', 'algorithm']]
      [0]:
        protocol*
      [1]:
        or
      [2]:
        ['Consensus', '0', 'algorithm']
    [1]:
      5
    [2]:
      ['agreement', 'and', 'transaction']
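In the results above, operators sit at the odd positions of each nested list. `ADJ` appears as `'0'` because of `replaceWith('0')`, and `nearN` appears as `'N'` because the `near` keyword is suppressed. As a rough sketch of how such a result could be read back, the hypothetical helper `render` below walks the nested list and prints proximity operators as `NEAR/N`. That notation is an assumption made for this example, and since `ADJ` and `near0` both produce `'0'`, it cannot tell them apart.

```python
def render(node):
    """Turn a nested parse result back into a readable query string."""
    if isinstance(node, str):
        return node
    if len(node) == 1:  # outermost wrapper list
        return render(node[0])
    if len(node) == 2:  # unary group such as ['not', x]
        return "(" + node[0].upper() + " " + render(node[1]) + ")"
    parts = []
    for i, item in enumerate(node):
        if i % 2 == 1:  # odd positions hold the operator token
            parts.append("NEAR/" + item if item.isdigit() else item.upper())
        else:
            parts.append(render(item))
    return "(" + " ".join(parts) + ")"


# Sub-result copied from the second test's output above.
sample = [['protocol*', 'or', ['Consensus', '0', 'algorithm']],
          '5',
          ['agreement', 'and', 'transaction']]
print(render(sample))
```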
