How to create a regular expression for a specific type of alphanumeric word in Python

import os
import re

# Regex used to match relevant loglines (in this case)
line_regex = re.compile(r"[A-Z]+IOS_[A-Z]+[0-9]+", re.IGNORECASE)

# Output file, where the matched loglines will be copied to
output_filename = os.path.normpath("output.log")
# Overwrites the file, ensure we're starting out with a blank file
with open(output_filename, "w") as out_file:
    out_file.write("")

# Open output file in 'append' mode
with open(output_filename, "a") as out_file:
    # Open input file in 'read' mode
    with open("ServerError.txt", "r") as in_file:
        # Loop over each log line
        for line in in_file:
            # If log line matches our regex, print to console, and output file
            if (line_regex.search(line)):
                print(line)
                out_file.write(line)

3 Answers

User

Answer 1 · Edited 2024-10-04 03:19:44

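We finally found an answer that works. It extracts only the required strings and discards other values that merely accompany the pattern.

Here, I refine the search result with a second re.match() call before the line is finally written to the output file.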

import os
import re

# Regex used to match relevant loglines (in this case, a request ID)
line_regex = re.compile(r"[A-Z]+OS_[A-Z]+[0-9]+", re.IGNORECASE)


# Output file, where the matched loglines will be copied to
output_filename = os.path.normpath("output.log")
# Overwrites the file, ensure we're starting out with a blank file
with open(output_filename, "w") as out_file:
    out_file.write("")

# Open output file in 'append' mode
with open(output_filename, "a") as out_file:
    # Open input file in 'read' mode
    with open("ServerError.txt", "r") as in_file:
        # Loop over each log line
        for line in in_file:
            # If log line matches our regex, print to console, and output file
            if (line_regex.search(line)):

                # Get index of last space
                last_ndx = line.rfind(' ')
                # line[:23]: The time stamp (first 23 characters)
                # line[last_ndx:]: Last space and following characters
                # Use a second match to drop lines where the pattern only appears elsewhere;
                # the request ID must be the last space-separated field of the line
                matchObj = line_regex.match(line[last_ndx+1:])
                # Check that the match object is not None
                if matchObj:
                    print(line[:23] + line[last_ndx:])
                    out_file.write(line[:23] + line[last_ndx:])
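
The same filtering idea can be tried without any files. The following is a minimal sketch, not the answerer's exact code: the helper name extract_entry and the sample log lines are made up for illustration, and it swaps the prefix match done by re.match for fullmatch, which is stricter because the whole last field has to fit the pattern.

```python
import re

# Same pattern as in the answer above; everything else here is hypothetical.
LINE_REGEX = re.compile(r"[A-Z]+OS_[A-Z]+[0-9]+", re.IGNORECASE)


def extract_entry(line, timestamp_len=23):
    """Return 'timestamp id' if the last field of the line is a request ID, else None."""
    fields = line.split()
    if not fields:
        return None
    last_field = fields[-1]
    # fullmatch requires the entire last field to fit the pattern,
    # whereas re.match would also accept a field with trailing extra characters
    if LINE_REGEX.fullmatch(last_field) is None:
        return None
    return line[:timestamp_len] + " " + last_field


# Invented sample lines: a 23-character timestamp, then free text
sample_lines = [
    "2018-06-08 10:32:11,123 ERROR request failed MACOS_ABC123",
    "2018-06-08 10:32:12,456 ERROR MACOS_ABC123 mentioned mid-line only",
    "2018-06-08 10:32:13,789 INFO nothing to see here",
]
results = [extract_entry(line) for line in sample_lines]
print(results)
```

Only the first sample line ends with an ID, so it is the only one that yields an entry; the other two return None.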

User

Answer 2 · Edited 2024-10-04 03:19:44

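You can match one or more uppercase characters [A-Z]+, an underscore _, then zero or more uppercase characters [A-Z]*, followed by one or more digits [0-9]+.

You could also use a word boundary \b on both sides so that the match is not part of a longer word.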

\b[A-Z]+_[A-Z]*[0-9]+\b
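
To see what the word boundaries change, here is a small sketch that runs this pattern over a made-up string (the sample text is an assumption, not taken from the question's log file). Note that the pattern is compiled without re.IGNORECASE, so lowercase prefixes are rejected.

```python
import re

# Pattern from the answer above; the sample text is invented for illustration.
pattern = re.compile(r"\b[A-Z]+_[A-Z]*[0-9]+\b")

text = "ok: ABC_DEF123 and XY_42; skipped: abc_DEF1, ABC_DEF123x, preABC_D9"
matches = pattern.findall(text)
print(matches)  # ['ABC_DEF123', 'XY_42']
```

The last three candidates are skipped because letters, digits, and the underscore all count as word characters, so there is no boundary inside abc_DEF1, before the trailing x, or after the pre prefix.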


User

Answer 3 · Edited 2024-10-04 03:19:44

A single regexp should do it. The common thread seems to be all-uppercase letters, then TEC_, followed by more letters and then digits, so...

[A-Z]+TEC_[A-Z]+[0-9]+

See https://regexr.com/3qveu for tests.
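
As a quick local check of what this pattern accepts, the sketch below tries it on a few made-up candidate strings (they are assumptions, not values from the asker's log). It shows that at least one letter is required both before TEC_ and between TEC_ and the digits, and that search, with no anchors or word boundaries, also finds the ID inside a longer string.

```python
import re

# Pattern from the answer above; the candidate strings are hypothetical.
pattern = re.compile(r"[A-Z]+TEC_[A-Z]+[0-9]+")

candidates = ["ABTEC_XY12", "TEC_XY12", "ABTEC_12", "id=ABTEC_XY12;"]
found = {c: bool(pattern.search(c)) for c in candidates}

for candidate, ok in found.items():
    print(candidate, "->", "match" if ok else "no match")
```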
