How do I get a parse tree that includes comments?

import sys
import antlr4
from ECMAScriptLexer import ECMAScriptLexer
from ECMAScriptParser import ECMAScriptParser

def handleTree(tree, lvl=0):
    for child in tree.getChildren():
        if isinstance(child, antlr4.tree.Tree.TerminalNode):
            print(lvl*'│ ' + '└─', child)
        else:
            handleTree(child, lvl+1)

input = antlr4.FileStream(sys.argv[1])
lexer = ECMAScriptLexer(input)
stream = antlr4.CommonTokenStream(lexer)
parser = ECMAScriptParser(stream)
tree = parser.program()
handleTree(tree)

│ │ │ │ └─ var
│ │ │ │ │ │ └─ i
│ │ │ │ │ │ │ └─ =
│ │ │ │ │ │ │ │ │ │ └─ 52
│ │ │ │ │ └─ ;
│ │ │ └─ function
│ │ │ └─ foo
│ │ │ └─ (
│ │ │ └─ )
│ │ │ └─ {
│ │ │ │ │ │ │ │ │ │ │ │ └─ console
│ │ │ │ │ │ │ │ │ │ │ └─ .
│ │ │ │ │ │ │ │ │ │ │ │ └─ log
│ │ │ │ │ │ │ │ │ │ │ └─ (
│ │ │ │ │ │ │ │ │ │ │ │ │ │ └─ 'hey'
│ │ │ │ │ │ │ │ │ │ │ └─ )
│ │ │ │ │ │ │ │ │ └─ ;
│ │ │ └─ }
└─ <EOF>

1 answer

User

Answer #1 · posted 2024-06-25 23:18:58

So, why should comments not be included in the parser, and how do I get a tree including comments?

If you remove -> channel(HIDDEN) from the MultiLineComment rule

MultiLineComment
 : '/*' .*? '*/' -> channel(HIDDEN)
 ;

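then MultiLineComment tokens will end up in the parser. But then every one of your parser rules has to include these tokens wherever they are allowed.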

For example, take the arrayLiteral parser rule:

[grammar snippet missing from the source]

Since this is a valid array literal in JavaScript:

[JavaScript snippet missing from the source]

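it would mean you'd need to litter all your parser rules with MultiLineComment tokens, like this:

[grammar snippet missing from the source]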

It would become one big mess.
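To see what the HIDDEN channel does in practice, here is a minimal Python sketch that lists the tokens the lexer produced but the parser never sees. It assumes the generated ECMAScriptLexer from the question. The names off_channel_tokens and list_hidden are made up for this illustration, and the constant 0 is assumed to match the default channel number in the ANTLR4 Python runtime.

```python
DEFAULT_CHANNEL = 0  # assumption: same value as antlr4.Token.DEFAULT_CHANNEL


def off_channel_tokens(tokens):
    """Return (tokenIndex, text) for every token that is not on the default channel."""
    return [(t.tokenIndex, t.text) for t in tokens if t.channel != DEFAULT_CHANNEL]


def list_hidden(path):
    """Tokenize a file and return its off-channel tokens (comments, whitespace, ...)."""
    # Imports are local so off_channel_tokens can be used without the generated lexer.
    import antlr4
    from ECMAScriptLexer import ECMAScriptLexer

    stream = antlr4.CommonTokenStream(ECMAScriptLexer(antlr4.FileStream(path)))
    stream.fill()  # force the lexer to tokenize the whole input
    return off_channel_tokens(stream.tokens)
```

Calling list_hidden(sys.argv[1]) on the question's input should show the comments next to whatever else the grammar sends off-channel, which is why they never show up as terminal nodes in the tree.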

EDIT

From the comments:

So it's not possible to generate a tree including comments with antlr? Are there any hacks or other libraries to do this?

GRosenberg's answer:

Antlr provides a convenience method for this task: BufferedTokenStream#getHiddenTokensToLeft. In walking the parse tree, access the stream to obtain the node associated comment, if any. Use BufferedTokenStream#getHiddenTokensToRight to get any trailing comment.
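A rough Python sketch of that suggestion, adapted to the tree walker from the question. It is an illustration under a few assumptions, not a definitive implementation: the function names leaves_with_comments and parse_with_comments are invented here, terminal nodes are detected by their getSymbol method, and comments are picked out of the hidden tokens by their text prefix (the hidden channel may also carry whitespace).

```python
def leaves_with_comments(tree, stream, out=None):
    """Walk the tree depth-first; return ('comment' | 'token', text) pairs in source order."""
    if out is None:
        out = []
    for child in tree.getChildren():
        if hasattr(child, "getSymbol"):  # terminal nodes wrap a token
            token = child.getSymbol()
            # Hidden tokens between the previous on-channel token and this one;
            # the call returns None when there are none.
            hidden = stream.getHiddenTokensToLeft(token.tokenIndex) or []
            for h in hidden:
                # Assumption: a hidden token is a comment if it starts like one.
                if h.text.startswith(("/*", "//")):
                    out.append(("comment", h.text))
            out.append(("token", token.text))
        else:
            leaves_with_comments(child, stream, out)
    return out


def parse_with_comments(path):
    """Parse a file with the question's generated classes and collect leaves plus comments."""
    import antlr4
    from ECMAScriptLexer import ECMAScriptLexer
    from ECMAScriptParser import ECMAScriptParser

    stream = antlr4.CommonTokenStream(ECMAScriptLexer(antlr4.FileStream(path)))
    tree = ECMAScriptParser(stream).program()
    return leaves_with_comments(tree, stream)
```

Because every hidden token sits to the left of exactly one on-channel token, each comment is reported once, just before the leaf that follows it. A comment at the very end of the file is picked up as a hidden token to the left of <EOF>.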
