Reading two dictionaries from the same file (Python)

Published 2024-06-25 11:58:27


I'm new to Python, and I'm trying to read a text file into two dictionaries whose values are lists.

The file contains the following:

term1  doc1 doc3 doc4
term2  doc5 doc1
term3  doc6 doc2

I'm trying to create two dictionaries from the same file: one with terms as keys and documents as values, and the other the reverse.

inverted_index = {}
forward_index = {}
with open('term_sample.txt') as file:
    for line in file:
        items = line.split()
        term, doc = items[0], items[1:]
        for doc in items[1:]:
            inverted_index[term] = [doc]
            forward_index[doc] = [term]

print(inverted_index)
print(forward_index)

With what I have so far, I get the following result:

{'term2': ['doc1'], 'term1': ['doc4'], 'term3': ['doc2']}
{'doc3': ['term1'], 'doc6': ['term3'], 'doc4': ['term1'], 'doc5': ['term2'], 'doc1': ['term2'], 'doc2': ['term3']}

But this is the result I want:

{'term1': ['doc1','doc3','doc4'], 'term2': ['doc5','doc1'], 'term3': ['doc6','doc2']}
{'doc1': ['term1','term2'], 'doc3': ['term1'], 'doc4': ['term1'], 'doc5': ['term2'], 'doc6': ['term3'], 'doc2': ['term3']}

Please help me fix this!


3 Answers

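The assignment to inverted_index should not be inside the inner for loop, and for forward_index you replace the previous value on every iteration of the inner loop. Try the following code: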

inverted_index = {}
forward_index = {}
with open('test') as f:
    for line in f:
        items = line.split()
        term, docs = items[0], items[1:]
        inverted_index[term] = docs
        for doc in docs:
            terms = forward_index.get(doc, [])
            terms.append(term)
            forward_index[doc] = terms

print(inverted_index)
print(forward_index)
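
To see why the plain assignment in the question loses data, here is a minimal sketch that keeps the sample lines in memory instead of reading a file (the `sample_lines` list and the two dictionary names are assumptions made for the demonstration). It contrasts overwriting with the get-then-append pattern used above.

```python
# Sample data held in memory; stands in for the lines of the input file.
sample_lines = [
    "term1  doc1 doc3 doc4",
    "term2  doc5 doc1",
    "term3  doc6 doc2",
]

overwritten = {}   # built with plain assignment, as in the question
accumulated = {}   # built with get-then-append, as in the answer above

for line in sample_lines:
    items = line.split()
    term, docs = items[0], items[1:]
    for doc in docs:
        # Plain assignment replaces whatever list was stored before.
        overwritten[doc] = [term]
        # Fetch the existing list (or a new empty one), then extend it.
        terms = accumulated.get(doc, [])
        terms.append(term)
        accumulated[doc] = terms

print(overwritten["doc1"])   # ['term2']
print(accumulated["doc1"])   # ['term1', 'term2']
```

Only doc1 occurs under two terms in the sample, so it is the only key where the two approaches differ.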

You don't need to add to inverted_index inside the inner loop; add to it just once per line.

In the inner loop, if the dictionary entry already exists, you need to append to it rather than overwrite it.

inverted_index = {}
forward_index = {}
with open('term_sample.txt') as file:
    for line in file:
        items = line.split()
        term, docs = items[0], items[1:]
        inverted_index[term] = docs
        for doc in docs:
            # setdefault returns the existing list, or stores a new empty one
            forward_index.setdefault(doc, []).append(term)

print(inverted_index)
print(forward_index)
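
As a quick illustration of what `setdefault` does in that loop: it returns the list already stored under the key, or stores and returns the given default when the key is missing. The `index` dictionary and its contents below are made up for the demonstration.

```python
index = {}

# Key is missing: setdefault stores the new empty list and returns it,
# so the append lands in the list that now lives in the dictionary.
index.setdefault("doc1", []).append("term1")

# Key exists: setdefault ignores the default and returns the stored list.
index.setdefault("doc1", []).append("term2")

print(index)  # {'doc1': ['term1', 'term2']}
```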

You can use defaultdict(list) from the collections module, because in your solution the value is overwritten every time a key is updated:

#!/usr/bin/env python 

from collections import defaultdict

inverted_index = defaultdict(list)
forward_index = defaultdict(list)
with open('term_sample.txt') as file:
    for line in file:
        items = line.split()
        term, doc = items[0], items[1:]
        for doc in items[1:]:
            inverted_index[term].append(doc)
            forward_index[doc].append(term)

print(inverted_index)
print(forward_index)
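
Two things are worth knowing with this approach: printing a defaultdict shows its wrapper type along with the contents, and merely reading a missing key with square brackets creates an entry. A small sketch of both behaviours, using a made-up index, and of converting back to a plain dict:

```python
from collections import defaultdict

forward_index = defaultdict(list)
forward_index["doc1"].append("term1")
forward_index["doc1"].append("term2")

# Looking up an absent key with [] silently inserts an empty list.
_ = forward_index["doc9"]
print("doc9" in forward_index)     # True

# .get() does not trigger the default factory, so nothing is inserted.
print(forward_index.get("doc7"))   # None
print("doc7" in forward_index)     # False

# Convert to a plain dict for clean printing and normal KeyError behaviour.
plain = dict(forward_index)
print(plain)  # {'doc1': ['term1', 'term2'], 'doc9': []}
```

Converting with `dict(...)` after the index is built is a reasonable choice when later code should fail loudly on unknown keys rather than quietly grow the dictionary.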
