Replacing Java comments with a regular expression

Posted 2024-10-01 02:26:07


I am working on a project that parses Java files with Python.

The input to my program is:

public class TestMax {
    /** Main method */
    public static void main(String args[]){
        System.out.println("hello world");
    }
}

The expected output is:

<span class="keyword">public</span> <span class="keyword">class</span> TestMax {
        <span class="comment">      /** Main method */</span>
        public <span class="keyword">static</span> <span class="keyword">void</span> main(String args[]){
                System.out.println("hello world");
        }
}

The actual output is:

<span class="keyword">public</span> <span class="keyword">class</span> TestMax {
        /** <span class="comment">      /**</span> Main method */
        public <span class="keyword">static</span> <span class="keyword">void</span> main(String args[]){
                System.out.println("hello world");
        }
}

Something is wrong with this function:

def print_html():
    """Print html"""
    text = txt_edit.get(1.0, tk.END)
    
    new_text=[]
    for line in text.split("\n"):
        line_text = []
        for word in line.split(" "):
            if word in keywords:
                line_text.append('<span class="keyword">'+word+'</span>')
            else:
                line_text.append(word)

            if re.search("\*(.|[\r\n])*?\*", word):
                line_text.append('<span class="comment">'+word+'</span>')

        new_text.append(' '.join(line_text))
    new_text = '\n'.join(new_text)
            
    print(new_text)

The problem is in this part:

if re.search("\*(.|[\r\n])*?\*", word):
                line_text.append('<span class="comment">'+word+'</span>')
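To see which words trigger that check, the same pattern can be run against each space-separated word of the comment line. This is only a small diagnostic sketch; the helper name `matching_words` is hypothetical and not part of the original program.

```python
import re

# Same pattern as in the question, written as a raw string.
COMMENT_PATTERN = r"\*(.|[\r\n])*?\*"


def matching_words(line):
    """Return the space-separated words of `line` that the pattern matches."""
    return [word for word in line.split(" ") if re.search(COMMENT_PATTERN, word)]


print(matching_words("    /** Main method */"))
```

Only `/**` matches, because it is the only word that contains two asterisks. That word has already been appended by the `else` branch, so it ends up in the output twice, while `Main`, `method` and `*/` are never wrapped at all.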

This version gets close:

def print_html():
    """Print html"""
    text = txt_edit.get(1.0, tk.END)
    # single line comment (\/\*.+\*\/)
    # multi-line line comment (\/*.+)|(.+*\/)
    regex = r"(\/\*.+\*\/)" 
    new_text = []
    enteredMatches = False
    for line in text.split("\n"):
        line_text = []

        matches = re.finditer(regex, line, re.MULTILINE)
        for word in line.split(" "):
            if word in keywords:
                line_text.append('<span class="keyword">' + word + '</span>')
            else:
                line_text.append(word)

        for match in matches:
            line_text.append('<span class="comment">' + match.group() + '</span>')
            enteredMatches = True

        if enteredMatches:
            enteredMatches = False
            new_text.append(' '.join(line_text))
            continue

        new_text.append(' '.join(line_text))
    new_text = '\n'.join(new_text)

    print(new_text)

Its output is:

<span class="keyword">public</span> <span class="keyword">class</span> TestMax {
        /** TestMax */ <span class="comment">/** TestMax */</span>
        public <span class="keyword">static</span> <span class="keyword">void</span> main(String args[]){
                System.out.println("Hi");
        }
}

It adds the comment line twice. Ideally, this would produce the expected output shown at the start of the question.


1 answer
User
#1 · Posted 2024-10-01 02:26:07

Here is the corrected code that produces the expected output:

import re


def print_html():
    """Print html"""
    text = \
"public class TestMax {\n\
    /* First comment */\n\
    Second comment */\n\
    public static void main(String args[]){\n\
        System.out.println(\"hello world\");\n\
    }\n\
}"

    # single line comment (?:\/\*.+\*\/)
    # multi-line line comment (?:\/\*.+)|(?:.+\*\/)
    regex = r"(?:\/\*.+)|(?:.+\*\/)"
    new_text = []
    enteredMatches = False
    for line in text.split("\n"):
        line_text = []

        matches = re.finditer(regex, line, re.MULTILINE)

        for match in matches:
            line_text.append('<span class="comment">' + match.group() + '</span>')
            enteredMatches = True

        if enteredMatches:
            enteredMatches = False
            new_text.append(' '.join(line_text))
            continue

        for word in line.split(" "):
            if word in ['static', 'void', 'public', 'class']:
                line_text.append('<span class="keyword">' + word + '</span>')
            else:
                line_text.append(word)

        new_text.append(' '.join(line_text))
    new_text = '\n'.join(new_text)

    print(new_text)

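Please excuse the input text, which I hard-coded to fit the solution; I added it temporarily to make the example run. The main idea is to show you the matching and grouping parts of the regular expression, to give you the idea.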
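One limitation of a line-by-line pattern is that the middle lines of a longer block comment contain neither `/*` nor `*/`, and a keyword that appears as a separate word inside a string literal is still wrapped. As an alternative sketch (not the answerer's method), the whole source can be scanned once with a single pattern whose alternatives are tried in order: comments first, then string literals, then keywords. The names `KEYWORDS`, `TOKEN_RE` and `highlight` are hypothetical, and the keyword list is an assumption, since the question does not show its `keywords` variable.

```python
import re

# Assumed keyword list for illustration only.
KEYWORDS = ["public", "class", "static", "void"]

# Alternatives are tried left to right at each position, so text inside a
# comment or a string literal is consumed before the keyword rule can see it.
TOKEN_RE = re.compile(
    r"(?P<comment>/\*.*?\*/|//[^\n]*)"
    r'|(?P<string>"(?:\\.|[^"\\\n])*")'
    r"|(?P<keyword>\b(?:" + "|".join(map(re.escape, KEYWORDS)) + r")\b)",
    re.DOTALL,  # lets a /* ... */ comment span several lines
)


def highlight(source):
    """Wrap comments and keywords in <span> tags; leave strings untouched."""

    def wrap(match):
        kind = match.lastgroup  # name of the alternative that matched
        if kind == "string":
            return match.group()
        return '<span class="%s">%s</span>' % (kind, match.group())

    return TOKEN_RE.sub(wrap, source)


java_source = (
    "public class TestMax {\n"
    "    /** Main method */\n"
    "    public static void main(String args[]){\n"
    '        System.out.println("hello world");\n'
    "    }\n"
    "}"
)
print(highlight(java_source))
```

This sketch does not escape HTML special characters such as `<` or `&` in the Java source, and it keeps the indentation outside the comment span rather than inside it, so the output differs slightly from the expected output in the question.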
