Finding duplicate names in a txt file using Python

========= Weekend of 2016-12-02: ================
Schedule1:
bob@email
Schedule2:
john@email
bob@email
Schedule3:
Terry@email
========= Weekend of 2016-12-09: ================
Schedule1:
jake@email
Schedule2:
mike@email
bob@email
Schedule3:
howard@email

2 Answers

User

#1 · Edited 2024-06-01 07:41:43

You can do this with regular expressions, building a dict of sets:

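import re
from collections import Counter

fn = 'schedule.txt'  # placeholder path; the original snippet left fn undefined
data = {}

with open(fn) as f_in:
    txt = f_in.read()

# Each block starts with a "===== Weekend of ...: =====" header line.
for block in re.finditer(r'^=+\s+([^:]+:)\s=+\s+([^=]+)', txt, re.M):
    di = {}
    # Each schedule runs until the next "ScheduleN" line or the end of the block.
    for sc in re.finditer(r'^(Schedule\s*\d+):\s*([\s\S]+?)(?=(?:^Schedule\s*\d+)|\Z)', block.group(2), re.M):
        di[sc.group(1)] = set(sc.group(2).splitlines())
    data[block.group(1)] = di

for date, DofS in data.items():
    # Count how many schedules each name appears in for this weekend.
    c = Counter()
    for s in DofS.values():
        c += Counter(s)
    # Keep only the names seen in more than one schedule.
    inverted = {k: [] for k, v in c.items() if v > 1}
    if not inverted:
        continue
    print(date)
    # Record which schedules each duplicated name belongs to.
    for k in DofS:
        for e in DofS[k]:
            if e in inverted:
                inverted[e].append(k)
    print("\t", inverted)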

Prints:

Weekend of 2016-12-02:
    {'bob@email': ['Schedule1', 'Schedule2']}

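User

#2 · Edited 2024-06-01 07:41:43

I think you can use a map to store <name, list of schedules>, for example <bob@email, [Schedule1]>, as you go through each weekend. Each time you are about to add a new item, check whether the key already exists. If it does, append the schedule to the corresponding list. If it does not, add a new entry to the map. Then, when printing, only print the entries whose list contains more than one schedule.

In Python, you can use a dictionary as the map. https://www.tutorialspoint.com/python/python_dictionary.htm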
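The approach described in this answer could be sketched roughly as follows. This is only an illustration, not the answerer's own code: the function name find_duplicates and the sample string are made up here, and the sketch assumes the file holds one header, schedule label, or address per line.

```python
def find_duplicates(text):
    """Map each weekend to the names listed under more than one schedule."""
    per_weekend = {}  # weekend -> {name: [schedules]}
    names = None      # the name -> schedules map of the current weekend
    schedule = None   # the schedule label currently being read

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("="):
            # Header line such as "===== Weekend of 2016-12-02: ====="
            weekend = line.strip("= ").rstrip(":")
            names = per_weekend.setdefault(weekend, {})
            schedule = None
        elif line.startswith("Schedule") and line.endswith(":"):
            schedule = line[:-1]
        elif names is not None and schedule is not None:
            # Key already present: append the schedule; otherwise add a new entry.
            if line in names:
                if schedule not in names[line]:
                    names[line].append(schedule)
            else:
                names[line] = [schedule]

    # Keep only the names whose list holds more than one schedule.
    duplicates = {}
    for weekend, names in per_weekend.items():
        repeated = {n: s for n, s in names.items() if len(s) > 1}
        if repeated:
            duplicates[weekend] = repeated
    return duplicates


# Hypothetical sample input, laid out one item per line.
sample = """\
========= Weekend of 2016-12-02: ================
Schedule1:
bob@email
Schedule2:
john@email
bob@email
Schedule3:
Terry@email
========= Weekend of 2016-12-09: ================
Schedule1:
jake@email
Schedule2:
mike@email
bob@email
Schedule3:
howard@email
"""

print(find_duplicates(sample))
```

Because the map is reset at every weekend header, bob@email is reported only for the weekend where it appears under two schedules. A plain dict with an explicit membership check is used here to mirror the steps in the answer; dict.setdefault or collections.defaultdict would shorten it.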
