Python: get several preceding elements when a check on a later element matches

Posted 2024-05-07 15:09:45


I have some structured data in a text file:

parse.txt

name1
detail:
aaaaaaaa
bbbbbbbb
cccccccc
detail1:
dddddddd
detail2:
eeeeeeee
detail3:
ffffffff
detail4:
gggggggg

Some detail4 entries have no data and contain "-" instead:

name2
detail:
aaaaaaaa
bbbbbbbb
cccccccc
detail1:
dddddddd
detail2:
eeeeeeee
detail3:
ffffffff
detail4:
-

How can I parse the data to get the elements under detail1, detail2 and detail3, but only for the records whose detail4 is empty (that is, "-")?

So far I have partially working code, but the problem is that it returns each item 40 times. Please help.

Code:

data = []
with open("parse.txt","r",encoding="utf-8") as text_file:
    for line in text_file:
        data.append(line)
det4li = []
finali= []

for elem,det4 in zip(data,data[1:]):
    if "detail4" in elem:
        det4li .append(det4)
        if "-" in det4:
            for elem1,det1,det2,det3 in zip(data,data[1:],data[3:],data[5:]):
                if "detail1:" in elem1:
                    finali.append(det1.strip() + "," + det2.strip() + "," + det3)

Current Output: 40 records of dddddddd,eeeeeeee,ffffffff

Desired Output: dddddddd,eeeeeeee,ffffffff


1 answer
User
#1 · Posted 2024-05-07 15:09:45

Don't look ahead. Instead, store the data you have already seen and look back at it once you reach detail4:

final = []
with open("parse.txt","r",encoding="utf-8") as text_file:
    section = {}
    last_header = None
    for line in text_file:
        line = line.strip()
        if line.startswith('detail'):
            # detail line, record for later use
            last_header = line.rstrip(':')
        elif not last_header:
            # name line, store as such
            section['name'] = line
        else:
            section[last_header] = line
            if last_header == 'detail4':
                # section complete, process
                if line == '-':
                    # A section we want to keep
                    final.append(section)
                # reset section data
                section, last_header = {}, None
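
To get from the collected dictionaries to the comma-separated line shown under "Desired Output", the three values can be joined per section. This is only a minimal sketch: the `final` list below is stand-in sample data shaped like what the loop above would build (not read from the file), and the names `wanted` and `records` are made up for illustration.

```python
# Stand-in for the `final` list built by the loop above (sample data only).
# Note that "detail" holds just the last of its three lines, because each
# new value line overwrites the previous one under the same header.
final = [
    {
        "name": "name2",
        "detail": "cccccccc",
        "detail1": "dddddddd",
        "detail2": "eeeeeeee",
        "detail3": "ffffffff",
        "detail4": "-",
    },
]

# Keys to output, in order (illustrative name, not from the original code).
wanted = ("detail1", "detail2", "detail3")

# .get() with a default avoids a KeyError if a section lacks one of the keys.
records = [",".join(section.get(key, "") for key in wanted) for section in final]

for record in records:
    print(record)
```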

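This has the added advantage that you no longer need to read the whole file into memory. If you turn it into a generator (by putting it in a function and replacing the final.append(section) line with yield section), you can even process the matching sections while the file is still being read, without sacrificing readability.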
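
A rough sketch of that generator variant follows. The function name `empty_detail4_sections` is hypothetical, and skipping blank lines is an added assumption about the file layout; the rest mirrors the loop in the answer. The sample text stands in for parse.txt so the example runs on its own.

```python
import io


def empty_detail4_sections(lines):
    """Yield each section whose detail4 value is "-" (hypothetical helper)."""
    section, last_header = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            # Assumption: blank lines only separate records, so skip them.
            continue
        if line.startswith("detail"):
            # Header line: remember which field the next value belongs to.
            last_header = line.rstrip(":")
        elif last_header is None:
            # No header seen yet in this section, so this is the name line.
            section["name"] = line
        else:
            section[last_header] = line
            if last_header == "detail4":
                # Section complete: hand it out only if detail4 is empty.
                if line == "-":
                    yield section
                section, last_header = {}, None


# Stand-in for the file contents; any iterable of lines works.
sample = io.StringIO(
    "name1\ndetail:\naaaaaaaa\nbbbbbbbb\ncccccccc\n"
    "detail1:\ndddddddd\ndetail2:\neeeeeeee\n"
    "detail3:\nffffffff\ndetail4:\ngggggggg\n"
    "\n"
    "name2\ndetail:\naaaaaaaa\nbbbbbbbb\ncccccccc\n"
    "detail1:\ndddddddd\ndetail2:\neeeeeeee\n"
    "detail3:\nffffffff\ndetail4:\n-\n"
)

matches = list(empty_detail4_sections(sample))
for section in matches:
    print(section["name"], section["detail1"], section["detail2"], section["detail3"])
```

With a real file, the open file object can be passed directly, for example by iterating over empty_detail4_sections(text_file) inside the with block, so sections are handled one at a time as the file is read.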
