How to parse a file line by line in Python and put multiple values into a tuple

Posted 2024-10-04 09:27:10


Each line has the following form:

[id=52, idRegion=3857, tipo=New, CustomerDetails=[id=10, countryCode=DE, ... and so on

What I want to do is read the input line by line and, for each line, build a tuple from the values of id, idRegion and so on, like this:

(52,3857,New,10,DE ....), (another line with tuple).... so that I can later put them into an Excel sheet

I tried this, but it seems far from what I want:

a = re.findall( "id=(\d+),.idRegion=\d+, tipo=.*?,", file_txt)
b = re.findall( "id=\d+,.idRegion=(\d+),.tipo=.*?,", file_txt)
c = re.findall( "id=\d+,.idRegion=\d+,.tipo=(.*?),", file_txt)
d = [tuple(j for j in i if j)[-1] for i in a,b,c]
print c

1 answer
#1 · posted 2024-10-04 09:27:10

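We don't know much about your input data format. Assuming that keys consist only of word characters (letters, digits, underscore) and that values consist of word characters and whitespace, you can capture the values with the regular expression \w+=([\w\s]+?)[,\]]. Note that CustomerDetails=[ yields no value of its own, because [ is not allowed in a value; the nested id and countryCode are picked up instead. Apply the expression to each line with re.findall():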

import re


data = """
[id=52, idRegion=3857, tipo=New, CustomerDetails=[id=10, countryCode=DE]
[id=100, idRegion=11, tipo=New Something, CustomerDetails=[id=20, countryCode=DE]
"""

pattern = re.compile(r"\w+=([\w\s]+?)[,\]]")

print([
    tuple(pattern.findall(line)) for line in data.splitlines() if line
])

Prints:

[
    ('52', '3857', 'New', '10', 'DE'), 
    ('100', '11', 'New Something', '20', 'DE')
]
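
The tuples above contain only strings, while the question asks for numbers such as 52 and for output that can later go into Excel. As a rough sketch, and not something the original answer covers, the snippet below reuses the same pattern, converts digit-only values to int, and writes the rows as CSV, a format Excel can open. The helper name parse_line and the in-memory buffer are assumptions made for illustration; in practice you would probably write to a real file opened with newline="".

```python
import csv
import io
import re

# Same pattern as in the answer: capture the value that follows each "key=".
pattern = re.compile(r"\w+=([\w\s]+?)[,\]]")


def parse_line(line):
    """Hypothetical helper: return one line's values as a tuple,
    converting values made up only of decimal digits to int."""
    return tuple(
        int(value) if value.isdecimal() else value
        for value in pattern.findall(line)
    )


data = """
[id=52, idRegion=3857, tipo=New, CustomerDetails=[id=10, countryCode=DE]
[id=100, idRegion=11, tipo=New Something, CustomerDetails=[id=20, countryCode=DE]
"""

# Skip empty lines, parse the rest.
rows = [parse_line(line) for line in data.splitlines() if line]
print(rows)

# Write the rows as CSV. An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

If the real data contains values with other characters (dots, dashes, nested brackets), the character class [\w\s] would need to be widened accordingly.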
