Cannot create a dictionary from a file because of a newline character

Published 2024-05-19 15:39:27


I have a file like this:

Mother Jane
Father Bob
Friends Ricky,Jack,Brian,Jordan, \
        Ricardo,Sonia,Blake

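As you can see, the first "Friends" line ends with a backslash followed by a newline character, and the list continues on the next line. When I try to parse this file into a dictionary, my current code raises an error.

I have searched online for a solution and tried several approaches, but nothing seems to work.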

with open('./file.txt') as f:
    content = f.readlines()

    dic = {}
    for line in content:
        line_items = line.strip().split()
        if len(line_items) <= 2:
            dic[line_items[0]] = line_items[1]
        else:
            dic[line_items[0]] = line_items[1:]

I would like to get a result like this:

dict = {"Mother": "Jane", "Father": "Bob", "Friends": ["Ricky", "Jack", "Brian", "Jordan", "Ricardo", "Sonia", "Blake"]}

But I get an index out of range error.


3 Answers

You can use the following approach:

import re

with open('file.txt') as f:
    c = f.read().strip()

# Clean up line breaks where a comma is the last printable character,
# also consuming the optional continuation backslash before the newline
c = re.sub(r",\s*\\?\s*", ",", c)

final_dict = {}
for l in c.split("\n"):
    k, v = l.split()
    if "," in v:
        final_dict[k] = v.split(",")
    else:
        final_dict[k] = v

print(final_dict)

Output:

{'Mother': 'Jane', 'Father': 'Bob', 'Friends': ['Ricky', 'Jack', 'Brian', 'Jordan', 'Ricardo', 'Sonia', 'Blake']}
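
To see what the substitution does on its own, here is a minimal sketch that applies it to an inline copy of the sample text instead of file.txt. The names `sample` and `joined` are only for illustration. The pattern assumes the backslash is literally present in the file; the `\\?` makes it optional, so the same pattern should also cover a line that simply ends with a comma.

```python
import re

# Inline copy of the sample file; the backslash is a literal character.
sample = (
    "Mother Jane\n"
    "Father Bob\n"
    "Friends Ricky,Jack,Brian,Jordan, \\\n"
    "        Ricardo,Sonia,Blake\n"
)

# Comma, optional whitespace, optional backslash, then any whitespace
# (including the newline and the indentation of the continuation line).
joined = re.sub(r",\s*\\?\s*", ",", sample.strip())

print(joined.split("\n"))
```

After the substitution the text has one key per line, so splitting on "\n" and then on whitespace is enough to fill the dictionary.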


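The approach below seems to work. It collects multiple physical lines into one logical line and then processes that line. It also avoids reading the whole file into memory.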

from pprint import pprint

dic = {}
with open('./newline_file.txt') as f:
    lst = []  # pieces of the logical line being assembled
    for line in f:  # iterating over the file reads one line at a time
        line = line.strip()
        if line.endswith('\\'):  # Ends with backslash? Then the line continues.
            # Drop the backslash and any whitespace in front of it
            lst.append(line[:-1].rstrip())
            continue
        lst.append(line)
        logical_line = ''.join(lst)
        lst = []

        line_items = logical_line.split(' ')
        if len(line_items) == 2:  # blank or malformed lines are skipped
            if ',' in line_items[1]:
                dic[line_items[0]] = line_items[1].split(',')
            else:
                dic[line_items[0]] = line_items[1]

pprint(dic)

Output:

{'Father': 'Bob',
 'Friends': ['Ricky', 'Jack', 'Brian', 'Jordan', 'Ricardo', 'Sonia', 'Blake'],
 'Mother': 'Jane'}
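
The "collect physical lines into a logical line" step can also be separated from the dictionary building. The following is a minimal sketch of that idea, not part of the original answer: `logical_lines` and `parse` are hypothetical helper names, and the sample data is an inline list standing in for the file.

```python
def logical_lines(lines):
    """Yield logical lines, joining physical lines that end with a backslash."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if line.endswith("\\"):
            # Continuation: keep the text before the backslash and read on.
            parts.append(line[:-1].rstrip())
            continue
        parts.append(line)
        joined = "".join(parts)
        parts = []
        if joined:  # skip blank lines
            yield joined
    # A backslash on the very last line leaves an unfinished logical line.
    if parts:
        yield "".join(parts)


def parse(lines):
    """Build a dict: a single value stays a string, several become a list."""
    result = {}
    for line in logical_lines(lines):
        key, _, rest = line.partition(" ")
        values = [v.strip() for v in rest.split(",") if v.strip()]
        result[key] = values[0] if len(values) == 1 else values
    return result


# Inline stand-in for the file; the backslash is a literal character.
sample = [
    "Mother Jane\n",
    "Father Bob\n",
    "Friends Ricky,Jack,Brian,Jordan, \\\n",
    "        Ricardo,Sonia,Blake\n",
]
print(parse(sample))
```

Because a file object yields its lines one at a time, passing it directly, as in `with open('./newline_file.txt') as f: dic = parse(f)`, should keep the property of not loading the whole file into memory.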

You can accumulate the lines that end with a continuation backslash and only process a line once it is complete:

with open('./file.txt') as f:
    content = f.readlines()

dic = {}
continued = ""
for line in content:
    if "\\" in line:
        # Keep the text before the backslash and wait for the next line
        continued += line.split("\\")[0]
        continue
    if not (continued + line).strip():
        continue  # skip blank lines
    key, value = (continued + line + " ").split(" ", 1)
    continued = ""
    value = [v.strip() for v in value.strip().split(",") if v.strip()]
    dic[key] = value[0] if len(value) == 1 else value

print(dic) # {'Mother': 'Jane', 'Father': 'Bob', 'Friends': ['Ricky', 'Jack', 'Brian', 'Jordan', 'Ricardo', 'Sonia', 'Blake']}
