How do I modify Java source code using Python 3 and ANTLR4?

Posted 2024-06-17 11:21:24

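I have a set of Java source files and need to modify these .java files (remove whitespace, comments, and so on). To do this, I downloaded the Java lexer and parser grammar files from this repository and compiled them with antlr-4.7.2-complete.jar. I also installed antlr4-python3-runtime using pip.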

I tried to remove the multi-line comment from a sample HelloWorld program with the code below, but got the traceback shown after it. How can I fix this?

To compile the lexer and parser:

java -jar [path_to_antlr-4.7.2-complete.jar] -Dlanguage=Python3 [path_to_lexer_file]
java -jar [path_to_antlr-4.7.2-complete.jar] -Dlanguage=Python3 [path_to_parser_file]

Sample Java file:

public class HelloWorld {

    public static void main(String[] args){
        /*
        System.out.println("Hello World");
        */
    }

}

Python code used to modify the file:

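from antlr4 import InputStream, CommonTokenStream, TokenStreamRewriter
import JavaLexer  # module generated by ANTLR from the lexer grammar

with open("./HelloWorld.java", "r") as source:
    codeStream = InputStream(source.read())
lexer = JavaLexer.JavaLexer(codeStream)
token_stream = CommonTokenStream(lexer)
token_stream.fill()  # load every token into token_stream.tokens
rewriter = TokenStreamRewriter.TokenStreamRewriter(token_stream)
for token in token_stream.tokens:
    if token.type == JavaLexer.JavaLexer.COMMENT:
        rewriter.deleteToken(token)  # this call raises the TypeError shown in the traceback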

Traceback (most recent call last):
  File "/home/alp/PycharmProjects/JavaParsingTutorial/parser.py", line 31, in <module>
    rewriter.deleteToken(token)
  File "/usr/local/lib/python3.6/dist-packages/antlr4/TokenStreamRewriter.py", line 80, in deleteToken
    self.delete(self.DEFAULT_PROGRAM_NAME, token, token)
  File "/usr/local/lib/python3.6/dist-packages/antlr4/TokenStreamRewriter.py", line 88, in delete
    self.replace(program_name, from_idx, to_idx, None)
  File "/usr/local/lib/python3.6/dist-packages/antlr4/TokenStreamRewriter.py", line 71, in replace
    if any((from_idx > to_idx, from_idx < 0, to_idx < 0, to_idx >= len(self.tokens.tokens))):
TypeError: '>' not supported between instances of 'CommonToken' and 'CommonToken'
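
The last three frames of the traceback suggest the cause: deleteToken hands the token object itself to delete, which forwards it unchanged to replace, and replace then compares its arguments with '>' and '<' as if they were integer indices. A possible workaround is to pass token.tokenIndex instead of the token. The sketch below reproduces this with small stand-ins so it runs without ANTLR; FakeToken, FakeRewriter, delete_tokens_of_type and the token type number 7 are hypothetical names made up for illustration, and only the delete(program_name, from_idx, to_idx) shape is taken from the traceback.

```python
from dataclasses import dataclass


@dataclass
class FakeToken:
    """Hypothetical stand-in for antlr4's CommonToken (only the fields used here)."""
    type: int
    tokenIndex: int


class FakeRewriter:
    """Hypothetical stand-in that mirrors the calls visible in the traceback."""
    DEFAULT_PROGRAM_NAME = "default"

    def __init__(self, tokens):
        self.tokens = tokens
        self.deleted = []  # records (from_idx, to_idx) ranges that were deleted

    def replace(self, program_name, from_idx, to_idx, text):
        # Same guard as the line quoted in the traceback: the arguments are
        # compared with '>' and '<', so they must be integers, not tokens.
        if any((from_idx > to_idx, from_idx < 0, to_idx < 0, to_idx >= len(self.tokens))):
            raise ValueError("invalid token range")
        self.deleted.append((from_idx, to_idx))

    def delete(self, program_name, from_idx, to_idx):
        self.replace(program_name, from_idx, to_idx, None)

    def deleteToken(self, token):
        # Forwards the token object itself, as the traceback shows.
        self.delete(self.DEFAULT_PROGRAM_NAME, token, token)


def delete_tokens_of_type(rewriter, tokens, token_type):
    """Delete every token of token_type by integer index, not by token object."""
    for token in tokens:
        if token.type == token_type:
            rewriter.delete(rewriter.DEFAULT_PROGRAM_NAME, token.tokenIndex, token.tokenIndex)


COMMENT = 7  # hypothetical token type number
tokens = [FakeToken(1, 0), FakeToken(COMMENT, 1), FakeToken(2, 2)]

# Reproduce the failure: token objects reach the integer comparison.
broken = FakeRewriter(tokens)
try:
    broken.deleteToken(tokens[1])
    error = None
except TypeError as exc:
    error = str(exc)

# Workaround: pass integer indices.
fixed = FakeRewriter(tokens)
delete_tokens_of_type(fixed, tokens, COMMENT)

print(error)
print(fixed.deleted)
```

With the real runtime, the same helper could presumably be called as delete_tokens_of_type(rewriter, token_stream.tokens, JavaLexer.JavaLexer.COMMENT). The rewritten source then has to be read back from the rewriter; I believe the runtime offers getDefaultText() for that, but treat that name as an assumption and check it against the installed 4.7.2 sources.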
