Split text and write it to a new file

Posted 2024-10-01 07:12:39


I am trying to split the data in inputfile.txt and write it to Outputfile.txt.

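  • Consider only the first word of each line, and remove its first letter
  • Within each list, the first line is the root (e.g. a0) and the remaining lines are that root's connections (e.g. a3)
  • If the root already exists (e.g. a0), append the new connections (e.g. a3, a9) to the existing root (a0)
  • In Outputfile.txt, all values after the ":" are separated by tabs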

inputfile.txt

LIST : 2
a0 n
a3
LIST : 2
a0
a9 k
LIST : 2
a3
a5
a6 l
a8
LIST : 2
a4
a5
a6
a8

Outputfile.txt

LIST 0 :  3       9
LIST 3 :  5       6     8
LIST 4 :  5       6     8

I am new to Python. I tried the following, but it did not work:

def main():
 updatedData = ""
 with open('inputfile.txt', 'r') as f:
    for line in f:
        #print(line)
        test = line.split(" ",1)[0][1:]+ '\t'
        updatedData += test.replace('\n',' ')

2 Answers

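The requirements are a little vague (should the roots be sorted?), but here is a working example (at least for the input file and output given in the OP). I have included some comments to explain how the algorithm works.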

roots_dict = {} # Data structure to hold the roots and connections
reading_root = False # A simple state machine to know whether we are reading a root
prev_root = None

# A little function to fill the requirement "Consider only first word & remove first letter"
def get_line_item(line):
    # This check is for when you reach the end of the file and there is no new line
    if len(line) == 0:
        return False

    newline_removed = line[:(-1 if line[-1] == '\n' else len(line))] # Remove the final character if it is a newline, otherwise slice  the whole line
    line_words = newline_removed.split(' ') # Split the characters on the line into a list of space separated words
    return line_words[0][1:] # Return the first word, and only the characters starting from the second (ie, the 1th element)

with open('inputfile.txt', 'r') as f:
    line = True
    while line:
        line = f.readline()
        if line[:4] == 'LIST':
            reading_root = True
            continue
        if reading_root:
            root = get_line_item(line)
            if root not in roots_dict:
                roots_dict[root] = []
            prev_root = root
            reading_root = False
            continue
        connection = get_line_item(line)
        if connection:
            roots_dict[prev_root].append(connection)

# Printing in the format as described by the op. This could easily be written to an output file
for k in roots_dict:
    print(f'LIST {k} :\t', end='')
    for i in roots_dict[k]:
        print(f'{i}\t', end='')
    print()
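
The comment above notes that the result could be written to an output file instead of printed. A minimal sketch of that step follows, assuming the same input layout as in the question. The names `parse_roots`, `format_roots` and `convert` are illustrative and not part of the answer. Skipping a connection that is already listed under a root is also an assumption, since the question does not say how duplicates should be handled.

```python
def parse_roots(lines):
    """Map each root to its connections, merging blocks that share a root."""
    roots = {}           # plain dicts keep insertion order (Python 3.7+)
    current = None       # root of the block currently being read
    expect_root = False  # True right after a LIST header line
    for line in lines:
        words = line.split()
        if not words:    # ignore blank lines
            continue
        if words[0] == "LIST":
            expect_root = True
            continue
        item = words[0][1:]  # first word, minus its first letter
        if expect_root:
            current = item
            roots.setdefault(current, [])
            expect_root = False
        elif current is not None and item not in roots[current]:
            # Assumption: a repeated connection is stored only once.
            roots[current].append(item)
    return roots


def format_roots(roots):
    """Render one 'LIST <root> :' line per root, values separated by tabs."""
    return "".join(
        f"LIST {root} :\t" + "\t".join(conns) + "\n"
        for root, conns in roots.items()
    )


def convert(input_path="inputfile.txt", output_path="Outputfile.txt"):
    """Read the input file and write the merged result to the output file."""
    with open(input_path) as src:
        roots = parse_roots(src)
    with open(output_path, "w") as dst:
        dst.write(format_roots(roots))
```

Calling `convert()` with no arguments would read inputfile.txt and create Outputfile.txt in the current directory. Keeping the parsing separate from the file handling makes it possible to check the logic on a plain list of lines.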

master = None
tracking = {}
for line in open('inputfile.txt'):
    if line.startswith("LIST"):
        master = None
    elif master:
        tracking[master].add(line[1])
    else:
        master = line[1]
        if master not in tracking:
            tracking[master] = set()

for key,val in tracking.items():
    print( "LIST %s :" % key, '\t'.join(val) )

[timr@TimsPro:~/src]$ python x.py
LIST 0 : 9  3
LIST 3 : 8  5   6
LIST 4 : 8  5   6

If needed, it is easy to sort before printing.
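
Because the connections are stored in sets, their order in the output is not guaranteed. One possible way to sort before printing is sketched below. It assumes that every key and value is a string of digits, so that `int` can be used as the sort key. `sorted_lines` is a hypothetical helper name, and the `tracking` literal only mimics what the loop above would build for the sample input.

```python
def sorted_lines(tracking):
    """Return output lines with roots and connections in numeric order."""
    lines = []
    for key in sorted(tracking, key=int):
        # Assumption: values are digit strings, so sort them as integers.
        values = sorted(tracking[key], key=int)
        lines.append("LIST %s : %s" % (key, "\t".join(values)))
    return lines


# Stand-in for the dict of sets built while reading inputfile.txt.
tracking = {"3": {"8", "5", "6"}, "0": {"9", "3"}, "4": {"8", "5", "6"}}
for line in sorted_lines(tracking):
    print(line)
```

Sorting with `key=int` rather than the default string order keeps a value such as 10 after 9 once keys have more than one digit.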

Edited to handle multi-digit keys.

master = None
tracking = {}
for line in open('inputfile.txt'):
    word = line.split()[0]
    if word == "LIST":
        master = None
    elif master:
        tracking[master].add(word[1:])
    else:
        master = word[1:]
        if master not in tracking:
            tracking[master] = set()

for key,val in tracking.items():
    print( "LIST %s :" % key, '\t'.join(val) )

Edited so that duplicate keys are not merged:

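If we do not need to merge keys, the code is simpler. We do not need any global tracking at all. Each LIST block is independent: you simply gather nodes until you reach the end of the block.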

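key = None
gather = []  # connections collected for the current block
for line in open('inputfile.txt'):
    words = line.split()
    if not words:  # skip blank lines instead of failing on words[0]
        continue
    word = words[0]
    if word == "LIST":
        # A new block starts: flush the previous one, if any
        if key:
            print( "LIST %s :" % key, '\t'.join(gather) )
        key = None
    elif key:
        gather.append(word[1:])
    else:
        key = word[1:]
        gather = []

# Flush the last block, which has no LIST line after it
if key:
    print( "LIST %s :" % key, '\t'.join(gather) )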
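
To make the difference from the merging versions concrete, the same block-by-block idea can be sketched as a generator that returns the data instead of printing it. `iter_blocks` is a hypothetical name, and the `sample` list simply repeats the first two blocks of the input file from the question.

```python
def iter_blocks(lines):
    """Yield (key, connections) for each LIST block, without merging."""
    key, gather = None, []
    for line in lines:
        words = line.split()
        if not words:           # ignore blank lines
            continue
        if words[0] == "LIST":
            if key is not None:
                yield key, gather
            key, gather = None, []
        elif key is None:
            key = words[0][1:]  # first line of a block is its root
        else:
            gather.append(words[0][1:])
    if key is not None:         # the last block has no LIST line after it
        yield key, gather


sample = ["LIST : 2", "a0 n", "a3", "LIST : 2", "a0", "a9 k"]
for key, gather in iter_blocks(sample):
    print("LIST %s :" % key, "\t".join(gather))
```

With this approach root 0 is reported once per block, so the two blocks above give two separate lines rather than one combined line.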
