Unexpected character error when parsing Roman numerals with the lark parser (EBNF grammar)

DIGIT: "0".."9"
INT: DIGIT+
_L_PAREN: "("
_R_PAREN: ")"
LCASE_LETTER: "a".."z"
ROMAN_NUMERALS: "viii" | "vii" | "iii" | "ii" | "ix" | "vi" | "iv" | "v" | "i" | "x"

?start: qns_num qns_alphabet qns_part
qns_num: INT?
qns_alphabet: _L_PAREN LCASE_LETTER _R_PAREN | LCASE_LETTER _R_PAREN | LCASE_LETTER?
qns_part: _L_PAREN ROMAN_NUMERALS _R_PAREN | ROMAN_NUMERALS _R_PAREN | ROMAN_NUMERALS?

1 answer

User

#1 · Posted on 2024-09-29 19:31:56

This happens because both rules can be empty, so the lexer always skips over one of them in order to match the terminal with the higher priority.
With one rule empty and the other one matched, the parser expects EOF rather than more input. Introducing ( forces the rule to be non-empty.
So changing the priority of LCASE_LETTER won't help, but not allowing the first rule to be empty will.
The Earley algorithm knows how to resolve this ambiguity automatically.

I asked the same question on the lark-parser GitHub page. The answer above comes from there.
