Extracting data from a matching Python regular expression

1 answer

User

#1 · Posted on 2024-10-02 18:18:45

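I couldn't debug your regex because its formatting appears to be broken in your example, so here is a working snippet that shows how to do it. Read the comments carefully: they explain how the regular expression works.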

import re

# sample text as in the question
sample_str = """/*dummy comment */

/* comment about sum function */

int sum(int a,int b);

/*comment about mul function */ 

int mul(int a,int b);"""

# Match the regex below and capture its match into a backreference named “desc” (also backreference number 1) «(?P<desc>/\*\s*comment about (?P<func_name>[^\s]+?) function \*/)»
#    Match the character “/” literally «/»
#    Match the character “*” literally «\*»
#    Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) «\s*»
#       Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
#    Match the character string “comment about ” literally (case sensitive) «comment about »
#    Match the regex below and capture its match into a backreference named “func_name” (also backreference number 2) «(?P<func_name>[^\s]+?)»
#       Match a single character that is NOT a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) «[^\s]+?»
#          Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
#    Match the character string “ function ” literally (case sensitive) « function »
#    Match the character “*” literally «\*»
#    Match the character “/” literally «/»
# Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) «\s*»
#    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
# Match the carriage return character «\r*» (redundant: the preceding «\s*» already consumes carriage returns)
#    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
# Match the line feed character «\n*» (redundant: the preceding «\s*» already consumes line feeds)
#    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
# Match the regex below and capture its match into a backreference named “fun” (also backreference number 3) «(?P<fun>(?P<return_type>[^\s]+?) (?P<func_name_2>[^\s]+?)\((?P<arguments>[^\)]+?)\))»
#    Match the regex below and capture its match into a backreference named “return_type” (also backreference number 4) «(?P<return_type>[^\s]+?)»
#       Match a single character that is NOT a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) «[^\s]+?»
#          Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
#    Match the character “ ” literally « »
#    Match the regex below and capture its match into a backreference named “func_name_2” (also backreference number 5) «(?P<func_name_2>[^\s]+?)»
#       Match a single character that is NOT a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) «[^\s]+?»
#          Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
#    Match the opening parenthesis character «\(»
#    Match the regex below and capture its match into a backreference named “arguments” (also backreference number 6) «(?P<arguments>[^\)]+?)»
#       Match any character that is NOT the closing parenthesis character «[^\)]+?»
#          Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
#    Match the closing parenthesis character «\)»

function_re = re.compile(r"(?P<desc>/\*\s*comment about (?P<func_name>[^\s]+?) function \*/)\s*\r*\n*(?P<fun>(?P<return_type>[^\s]+?) (?P<func_name_2>[^\s]+?)\((?P<arguments>[^\)]+?)\))")

for function_match in function_re.finditer(sample_str):
    # match start: function_match.start()
    # match end (exclusive): function_match.end()
    # matched text: function_match.group()
    print("\ndesc:\n\n{}\n".format(function_match.group("desc")))
    print("fun:\n\n{}\n\n    ".format(function_match.group("fun")))
    # Additional groups if you need them
    # print("Func Name 1: {}".format(function_match.group("func_name")))
    # print("Func Name 2: {}".format(function_match.group("func_name_2")))
    # print("Arguments  : {}".format(function_match.group("arguments")))

The output I get is:

desc:

/* comment about sum function */

fun:

int sum(int a,int b)

    

desc:

/*comment about mul function */

fun:

int mul(int a,int b)
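
If you want the captured pieces as data rather than printed text, something along these lines may help. It is only a sketch: it rewrites the same pattern with re.VERBOSE so the explanation sits inside the regex, drops the redundant \r*\n*, and collects each match with groupdict(). The helper name extract_functions and the check that both captured names agree are my own additions, not something from the question.

```python
import re

# Same sample text as in the snippet above.
sample_str = """/*dummy comment */

/* comment about sum function */

int sum(int a,int b);

/*comment about mul function */ 

int mul(int a,int b);"""

# re.VERBOSE ignores unescaped whitespace and "#" comments inside the pattern,
# so literal spaces are written as "[ ]".
function_re = re.compile(
    r"""
    (?P<desc>                    # the whole comment
        /\*\s*comment[ ]about[ ]
        (?P<func_name>\S+?)      # function name mentioned in the comment
        [ ]function[ ]\*/
    )
    \s*                          # whitespace between comment and prototype
    (?P<fun>                     # the prototype, without the trailing semicolon
        (?P<return_type>\S+?)[ ]
        (?P<func_name_2>\S+?)    # function name in the prototype itself
        \((?P<arguments>[^)]+?)\)
    )
    """,
    re.VERBOSE,
)


def extract_functions(text):
    """Return one dict of named groups per comment/prototype pair.

    Hypothetical helper, added for illustration only.
    """
    records = []
    for match in function_re.finditer(text):
        record = match.groupdict()
        # Keep only pairs where the comment and the prototype name the same function.
        if record["func_name"] == record["func_name_2"]:
            records.append(record)
    return records


for record in extract_functions(sample_str):
    print(record["func_name"], "->", record["return_type"], record["arguments"])
```

Each record is a plain dict keyed by group name (desc, func_name, fun, return_type, func_name_2, arguments), so you can filter or serialize it however you need.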
