Building a dictionary from a list with a structured pattern

Published 2024-09-27 02:23:57


How can I build the dictionary shown further down from this text file?

[Fri Aug 20]
shamooshak 4-0 milan
Tehran 2-0 Ams
Liverpool 0-2 Mes
[Fri Aug 19]
Esteghlal 1-0 perspolise
Paris 2-0 perspolise
[Fri Aug 20]
RahAhan 0-0 milan
[Wed Agu 11]
Munich 3-3 ABC
[Wed Agu 12]
RM 0-0 Tarakto
[Sat Jau 01]
Bayern 2-0 Manchester

I tried a list comprehension and a for loop with enumerate, but I could not build the structure I need.

The dictionary I want is: {'[Fri Aug 20]':[shamooshak 4-0 milan, Tehran 2-0 Ams,Liverpool 0-2 Mes],'[Fri Aug 19]':[Esteghlal 1-0 perspolise,Paris 2-0 perspolise] ... and so on.


3 answers

Assuming your data is a sequence of text lines:

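def process_arbitrary_text(text):
    obj = {}
    arr = []
    k = None
    for line in text:
        line = line.strip()  # drop the trailing newline so the ']' test works
        if not line:
            continue  # skip blank lines
        if line[0] == '[' and line[-1] == ']':
            if k and arr: # omit empty keys?
                # setdefault keeps earlier entries when a date repeats
                obj.setdefault(k, []).extend(arr)
            k = line
            arr = []
        else:
            arr.append(line)
    if k and arr:  # store the final group, which no later header flushes
        obj.setdefault(k, []).extend(arr)
    return obj

desired_dict = process_arbitrary_text(text)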

Edit: since your edit says the data is in a text file, just wrap it in the following pattern:

with open('filename.txt', 'r') as file:
    # either iterate directly with "for line in file: ...",
    # or read all lines into a list at once:
    text = file.readlines()
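Putting the two pieces above together, here is a minimal sketch of the same line-by-line grouping using collections.defaultdict. The function name group_by_header is made up for illustration, and io.StringIO stands in for the opened file, so treat this as one possible arrangement rather than the answer's exact code.

```python
from collections import defaultdict
import io

def group_by_header(lines):
    """Group each line under the most recent '[...]' header line."""
    groups = defaultdict(list)
    current = None
    for raw in lines:
        line = raw.strip()  # remove the trailing newline
        if not line:
            continue  # ignore blank lines
        if line.startswith('[') and line.endswith(']'):
            current = line  # a new header; a repeated date reuses its list
        elif current is not None:
            groups[current].append(line)
    return dict(groups)

# io.StringIO stands in for open('filename.txt') in this sketch
sample = io.StringIO(
    "[Fri Aug 20]\n"
    "shamooshak 4-0 milan\n"
    "[Fri Aug 19]\n"
    "Paris 2-0 perspolise\n"
    "[Fri Aug 20]\n"
    "RahAhan 0-0 milan\n"
)
result = group_by_header(sample)
print(result)
```

Because lines are appended as they arrive, nothing has to be flushed after the loop, and a date that appears twice ends up with both sets of results in one list.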

Using regular expressions (the re module) on the sample text:

import re

text = '''[Fri Aug 20]
shamooshak 4-0 milan
Tehran 2-0 Ams
Liverpool 0-2 Mes
[Fri Aug 19]
Esteghlal 1-0 perspolise
Paris 2-0 perspolise
[Fri Aug 20]
RahAhan 0-0 milan
[Wed Agu 11]
Munich 3-3 ABC
[Wed Agu 12]
RM 0-0 Tarakto
[Sat Jau 01]
Bayern 2-0 Manchester'''
# each match is a '[...]' header plus everything up to the next '['
x = re.findall(r'\[.+?\][^\[]*', text)
x = [i.split('\n') for i in x]
d = dict()
for i in x:
    d[i[0]] = [j for j in i[1:] if j!='']

This gives the following dictionary d:

`{'[Fri Aug 20]': ['RahAhan 0-0 milan'], '[Sat Jau 01]': ['Bayern 2-0 Manchester'], '[Fri Aug 19]': ['Esteghlal 1-0 perspolise', 'Paris 2-0 perspolise'], '[Wed Agu 12]': ['RM 0-0 Tarakto'], '[Wed Agu 11]': ['Munich 3-3 ABC']}`

I overlooked that dates can repeat, as mad pointed out. To avoid losing data, replace the for loop with:

for i in x:
    d[i[0]] = []
for i in x:
    d[i[0]] = d[i[0]]+[j for j in i[1:] if j!='']
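The two loops above can also be folded into one with dict.setdefault. The following is a small sketch on a shortened copy of the sample text, assuming the same regular expression as in this answer; the variable names block, header, and rows are illustrative.

```python
import re

text = """[Fri Aug 20]
shamooshak 4-0 milan
Tehran 2-0 Ams
[Fri Aug 19]
Paris 2-0 perspolise
[Fri Aug 20]
RahAhan 0-0 milan"""

d = {}
for block in re.findall(r'\[.+?\][^\[]*', text):
    # the first line of each block is the header, the rest are results
    header, *rows = block.split('\n')
    # setdefault creates the list once, so a repeated date extends it
    d.setdefault(header, []).extend(r for r in rows if r)

print(d)
```

With this input, the '[Fri Aug 20]' key collects three results (two from its first block and one from its second) instead of only the last block.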

A plain for loop can do the job here:

a='''[Fri Aug 20]
shamooshak 4-0 milan
Tehran 2-0 Ams
Liverpool 0-2 Mes
[Fri Aug 19]
Esteghlal 1-0 perspolise
Paris 2-0 perspolise
[Fri Aug 20]
RahAhan 0-0 milan
[Wed Agu 11]
Munich 3-3 ABC
[Wed Agu 12]
RM 0-0 Tarakto
[Sat Jau 01]
Bayern 2-0 Manchester'''
d = {}
temp_value = []
temp_key = ''
for i in a.split('\n'):

    if i.startswith('['):
        # a new header: store the lines collected for the previous one
        if temp_key and temp_key in d:
            d[temp_key] = d[temp_key] + temp_value
        elif temp_key:
            d[temp_key] = temp_value

        temp_key = i
        temp_value = []

    else:
        temp_value.append(i)

# store the last group, which no later header flushes
if temp_key:
    d[temp_key] = d.get(temp_key, []) + temp_value

print(d)

Output

{'[Fri Aug 20]': ['shamooshak 4-0 milan', 'Tehran 2-0 Ams', 'Liverpool 0-2 Mes', 'RahAhan 0-0 milan'], '[Fri Aug 19]': ['Esteghlal 1-0 perspolise', 'Paris 2-0 perspolise'], '[Wed Agu 11]': ['Munich 3-3 ABC'], '[Wed Agu 12]': ['RM 0-0 Tarakto'], '[Sat Jau 01]': ['Bayern 2-0 Manchester']}
