Should I use regex or some other tool to extract log IDs from a log file?

Posted 2024-09-30 12:32:49


I have a list containing the following data:

["asdf mkol ghth", "dfcf 5566 7766", "7uy7 jhjh ffvf"]

I want to use a regular expression in Python to get a list of tuples like this:

[("asdf", "mkol ghth"),("dfcf", "5566 7766"),("7uy7", "jhjh ffvf")]

I tried re.split, but got an error saying there are too many values to unpack. Here is my code:

import re

logTuples = [()]
for log in logList:
    (logid, logcontent) = re.split(r"(\s)", log)
    logTuples.append((logid, logcontent))

2 answers

A regular expression is overkill here:

l = ["asdf mkol ghth", "dfcf 5566 7766", "7uy7 jhjh ffvf"]

lst = [tuple(i.split(maxsplit=1)) for i in l]

print(lst)

Prints:

[('asdf', 'mkol ghth'), ('dfcf', '5566 7766'), ('7uy7', 'jhjh ffvf')]

According to the documentation:

https://docs.python.org/3/library/re.html

\s

For Unicode (str) patterns: Matches Unicode whitespace characters (which includes [ \t\n\r\f\v], and also many other characters, for example the non-breaking spaces mandated by typography rules in many languages). If the ASCII flag is used, only [ \t\n\r\f\v] is matched.

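Each string contains 2 whitespace characters, so splitting on \s yields 3 items. Because your pattern wraps \s in a capturing group, re.split also returns the separators themselves, giving 5 items. Either way that is too many values to unpack into two names.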
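To make the behaviour concrete, here is a small sketch that runs re.split on one of the sample strings with and without the capturing group, and then with maxsplit=1, which is one way to keep the regex approach and still get exactly two parts. The variable names are only illustrative.

```python
import re

log = "asdf mkol ghth"

# A capturing group makes re.split return the separators as well.
with_group = re.split(r"(\s)", log)

# Without the group, only the fields between separators are returned.
without_group = re.split(r"\s", log)

# maxsplit=1 stops after the first whitespace, leaving exactly two parts.
logid, logcontent = re.split(r"\s", log, maxsplit=1)

print(with_group)            # ['asdf', ' ', 'mkol', ' ', 'ghth']
print(without_group)         # ['asdf', 'mkol', 'ghth']
print((logid, logcontent))   # ('asdf', 'mkol ghth')
```

Passing maxsplit as a keyword argument keeps the call unambiguous.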

If every log entry has 3 space-separated items and you always arrange them as (1, 2 + " " + 3), you do not need a regular expression to format them:

logtuples = []
for log in loglist:
    splitlog = log.split(" ")  # 3 elements in total
    # list.append takes a single argument, so wrap the pair in a tuple
    logtuples.append((splitlog[0], splitlog[1] + " " + splitlog[2]))
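If the entries might not always have exactly 3 items, a variant that splits only at the first space avoids indexing into the pieces. The sketch below uses str.partition. The helper name split_log is hypothetical, and it assumes the ID is everything before the first space character.

```python
def split_log(entry):
    """Split a log line into (id, rest) at the first space.

    Hypothetical helper: assumes the ID is the text before the first space.
    """
    logid, _sep, content = entry.partition(" ")
    return logid, content


log_list = ["asdf mkol ghth", "dfcf 5566 7766", "7uy7 jhjh ffvf"]
log_tuples = [split_log(entry) for entry in log_list]

print(log_tuples)
# [('asdf', 'mkol ghth'), ('dfcf', '5566 7766'), ('7uy7', 'jhjh ffvf')]
```

An entry with no space at all comes back as (entry, "") rather than raising, which may or may not be what you want for malformed lines.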
