Merging every two lines when reading a .txt file in Python

junk junk junk
--- intermediate:
1489 pi0 111 [686] (1491,1492)
    0.534 -0.050 -0.468 0.724 0.135
1499 pi0 111 [690] (1501,1502)
    -1.131 0.503 12.751 12.812 0.135
--- final:
32 e- 11 [7]
    9.072 20.492 499.225 499.727 0.001
33 e+ -11 [6]
    -11.317 -17.699 2632.568 2632.652 0.001
12 s 3 [10] (91) >43 {+5}
    2.946 0.315 94.111 94.159 0.500
14 g 21 [11] (60,61) 34>>16 {+7,-6}
    -0.728 3.329 5.932 6.907 0.950
------------------------------------------------------------------------------
junk junk
--- intermediate:
repeat

start = False
for line in myfile:
    line = line.strip()
    fields = line.split()
    if len(fields)==0:
        continue
    if not start:
        if fields[0] == "----final:":
            start = True
        continue

3 Answers

User

Answer 1 · edited 2024-09-24 00:32:04

A quick and dirty way to merge every other line:

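# walk the list two lines at a time; stop before a trailing unpaired line
for i in range(0, len(lines) - 1, 2):
    fields1 = lines[i].strip().split()
    fields2 = lines[i+1].strip().split()
    # keep the first 4 fields of the first line, then all fields of the second
    print("\t".join(fields1[:4]+fields2))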

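Note that I am assuming here that all the lines to be merged have already been extracted into a list called lines, and that I simply hard-coded the number of elements to keep from the first line of each pair (4).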
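As a rough sketch of how that lines list might be built and then merged: the flag-based loop below collects everything between a "--- final:" line and the next line made only of dashes, then pairs the lines up. The marker text, the two-lines-per-entry layout, and the names SAMPLE, collect_final_lines and merge_pairs are assumptions made for illustration, based on the sample data in the question.

```python
SAMPLE = """\
junk junk junk
--- intermediate:
1489 pi0 111 [686] (1491,1492)
    0.534 -0.050 -0.468 0.724 0.135
--- final:
32 e- 11 [7]
    9.072 20.492 499.225 499.727 0.001
33 e+ -11 [6]
    -11.317 -17.699 2632.568 2632.652 0.001
------------------------------------------------------------------------------
junk junk
"""


def collect_final_lines(text, start_marker="--- final:"):
    """Return the non-empty lines between start_marker and the next all-dash line."""
    lines = []
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not in_section:
            # keep skipping until the (assumed) start marker shows up
            in_section = (line == start_marker)
            continue
        if set(line) == {"-"}:
            # a line consisting only of dashes is assumed to close the section
            break
        lines.append(line)
    return lines


def merge_pairs(lines, keep=4):
    """Join each pair of lines: first `keep` fields of line 1 plus all fields of line 2."""
    merged = []
    for i in range(0, len(lines) - 1, 2):
        fields1 = lines[i].split()
        fields2 = lines[i + 1].split()
        merged.append("\t".join(fields1[:keep] + fields2))
    return merged


for row in merge_pairs(collect_final_lines(SAMPLE)):
    print(row)
```

Only the first "--- final:" section is read here; a file with several such sections would need the loop to reset the flag instead of breaking.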

User

Answer 2 · edited 2024-09-24 00:32:04

This works as long as you know the exact lines that surround the section you want:

#split the large text into lines
lines = large_text.split('\n')
#get the indexes of the beginning and end of your target section
idx_start = lines.index(" - final:")
idx_finish= lines.index("                                       ")
#iterate through the section in steps of 2, split on spaces, remove empty strings, print them as tab delimited
for idx in range( idx_start+1, idx_finish, 2):
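    # join the pair with a space so the last field of one line cannot fuse with the first field of the next
    out = list(filter(None,(lines[idx]+" "+lines[idx+1]).split(" ")))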
    print("\t".join(out))

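where large_text is the file read in as one giant string.

Edit: to open the text file as a string, try the following:

# same open/read pattern as the full example further down
with open('./large_text.txt','r') as f:
    large_text = f.read()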

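Assumptions:

You know the lines that delimit the region of interest (e.g. " - final:").
Your values are separated by spaces rather than tabs. If they are not, change split(" ") to split("\t").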
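To illustrate the separator assumption, here is a small sketch comparing split(" ") plus filter(None, ...) with a bare str.split(), which splits on any run of whitespace and so handles spaces and tabs alike. The two sample strings are made up for the example.

```python
space_line = "  32   e-  11 [7]"
tab_line = "32\te-\t11\t[7]"

# split(" ") yields empty strings for repeated spaces, hence the filter(None, ...)
via_filter = list(filter(None, space_line.split(" ")))

# split(" ") does not split on tabs at all: the whole line stays one field
tab_via_space = tab_line.split(" ")

# split() with no argument splits on any whitespace run and drops empty strings
via_plain_space = space_line.split()
via_plain_tab = tab_line.split()

print(via_filter)
print(tab_via_space)
print(via_plain_space == via_plain_tab)
```

If the separator in the real file is uncertain, the argument-free split() is probably the safer choice.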

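That should do the trick.

Edit: added a version that keeps only a fixed number of fields from the first line of each pair. The same assumptions hold.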

with open('./large_text.txt','r') as f:
    #split the large text into lines
    lines = f.read().split("\n")
    #get the indexes of the beginning and end of your target section
    idx_start = lines.index(" - final:")
    idx_finish= lines.index("                                       ")
    #iterate through the section in steps of 2, split on spaces, remove empty strings, print them as tab delimited
    for idx in range( idx_start+1, idx_finish, 2):
        line_spaces = list(filter(None,lines[idx].split(" ")))[0:4]
        other_line = list(filter(None,(lines[idx+1]).split(" ")))
        out = line_spaces + other_line
        print("\t".join(out))
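One caveat with lines.index(...) is that it raises ValueError unless some line equals the argument exactly, including every leading and trailing space. A more tolerant lookup could be sketched as follows; the helper name section_bounds, the "--- final:" marker, and the rule that a line of only dashes ends the section are assumptions for illustration, not part of the answer above.

```python
def section_bounds(lines, start_marker="--- final:"):
    """Return (first, end_exclusive) indexes of the section body, or None if no marker."""
    # compare stripped lines so surrounding whitespace does not matter
    start = next(
        (i for i, s in enumerate(lines) if s.strip() == start_marker),
        None,
    )
    if start is None:
        return None
    # the section is assumed to end at the next line made only of dashes;
    # fall back to the end of the file if there is no such line
    finish = next(
        (i for i in range(start + 1, len(lines)) if set(lines[i].strip()) == {"-"}),
        len(lines),
    )
    return start + 1, finish


demo = ["junk", "  --- final: ", "32 e- 11 [7]", "  9.072 20.492", "------", "junk"]
print(section_bounds(demo))
```

The returned pair can be used in place of idx_start+1 and idx_finish in the range(...) call.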

User

Answer 3 · edited 2024-09-24 00:32:04

You can solve your problem with the newer regex module and a regular expression:

import regex as re

rx = re.compile(r'''(?V1)
        (?:^ -\ final:[\n\r])|(?:\G(?!\A))
        ^(\ *\d+.+?)\ *$[\n\r]
        ^\ +(.+)$[\n\r]
        ''', re.MULTILINE | re.VERBOSE)

junky_string = your_string

matches = ["    ".join(match.groups()) 
            for match in rx.finditer(junky_string)
            if match.group(1) is not None]
print(matches)
# [' 32        e-      11 [7]    9.072    20.492   499.225   499.727     0.001', 
#  ' 33        e+     -11 [6]    -11.317   -17.699  2632.568  2632.652     0.001',
#  ' 12         s       3 [10] (91)  >43 {+5}    2.946     0.315    94.111    94.159     0.500', 
#  ' 14         g      21 [11] (60,61)  34>>16 {+7,-6}    -0.728     3.329     5.932     6.907     0.950']

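It looks for - final: at the start of a line, or resumes at the position where the previous match ended (\G), and then matches each following pair of lines whose first line begins with digits (see the explanation on regex101.com for more details).
The matched groups are then joined with a tab.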
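If installing the third-party regex package is not an option, a similar result can be sketched with the standard-library re module, which has no \G anchor: one pattern isolates the section and a second pairs up its lines. The "--- final:" marker, the closing line of at least five dashes, the sample text, and the names SECTION, PAIR and merge_final_section are assumptions based on the data shown in the question.

```python
import re

# everything between the start marker and the next line of 5 or more dashes
SECTION = re.compile(r"^--- final:[ \t]*\n(.*?)^-{5,}[ \t]*$",
                     re.MULTILINE | re.DOTALL)
# a line starting with a digit, followed by one more non-empty line
PAIR = re.compile(r"^[ \t]*(\d.*?)[ \t]*\n[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def merge_final_section(text):
    """Return one tab-delimited string per pair of lines in the first final section."""
    section = SECTION.search(text)
    if section is None:
        return []
    return [
        "\t".join((m.group(1) + " " + m.group(2)).split())
        for m in PAIR.finditer(section.group(1))
    ]


sample = (
    "junk junk junk\n"
    "--- final:\n"
    " 32 e- 11 [7]\n"
    "      9.072 20.492 499.225 499.727 0.001\n"
    " 33 e+ -11 [6]\n"
    "      -11.317 -17.699 2632.568 2632.652 0.001\n"
    "------------------------------\n"
    "junk junk\n"
)

for row in merge_final_section(sample):
    print(row)
```

Splitting the job into two patterns keeps each one simple, at the cost of scanning the section text twice.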
