How do I use a loop to get the word frequencies of a list and store them in a dict?

data = [['marjori', 'splendid'], ['rivet', 'perform', 'farrah', 'fawcett']]

def build_dict(data, vocab_size = 5000):
    word_count = {}
    for w in data:
        word_count.append(data.count(w)) ????
    #print(word_count)

    # how can I sort the words so that sorted_words[0] is the most frequently appearing word and sorted_words[-1] is the least frequently appearing word?
    sorted_words = ??

2 answers

User

Answer 1 · edited 2024-10-02 00:36:00

For each word, you need to create a dict entry if it does not exist yet, or add 1 to its value if it does:

word_count = dict()
for sentence in data:  # data is a list of word lists, so loop over each inner list
    for w in sentence:
        if w in word_count:
            word_count[w] += 1
        else:
            word_count[w] = 1

You can then sort the dictionary by value in descending order (dicts preserve insertion order in Python 3.7+):

word_count = {k: v for k, v in sorted(word_count.items(), key=lambda item: item[1], reverse=True)}
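The question also asks for a list, sorted_words, ordered from most to least frequent. Here is a minimal sketch of one way to get it. The helper name count_words is hypothetical, and the sample data adds a repeated word to the question's list so that the ordering is visible:

```python
def count_words(data):
    """Count how often each word occurs in a list of word lists."""
    word_count = {}
    for sentence in data:
        for w in sentence:
            # dict.get returns 0 for a word that has not been seen yet
            word_count[w] = word_count.get(w, 0) + 1
    return word_count

# Sample data (assumption): the question's list with 'farrah' repeated once
data = [['marjori', 'splendid'], ['rivet', 'perform', 'farrah', 'fawcett', 'farrah']]

word_count = count_words(data)

# Sorting the keys by their counts gives a list, most frequent word first
sorted_words = sorted(word_count, key=word_count.get, reverse=True)

print(sorted_words[0])  # farrah
```

Sorting the keys rather than the items returns a plain list, so sorted_words[0] and sorted_words[-1] can be indexed directly, which a dict does not allow.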

User

Answer 2 · edited 2024-10-02 00:36:00

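The last part of your code is hard to follow, but if you just want to count the words, store the counts in a dictionary, and sort them by frequency in descending order, I suggest using defaultdict like this: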

from collections import defaultdict

data = ['marjori',
 'splendid',
 'rivet',
 'farrah',
 'perform',
 'farrah',
 'fawcett']

def build_dict(data, vocab_size = 5000):
    """Count the words in a flat list and return the counts sorted by descending frequency."""

    # A dict storing the words that appear in the reviews along with how often they occur
    word_count = defaultdict(int)
    for w in data:
        word_count[w] += 1

    # Sort by count, highest first, so the most frequent word comes first
    # (vocab_size is not used in this version)
    sorted_words = {k: v for k, v in sorted(word_count.items(), key=lambda item: item[1], reverse=True)}

    return sorted_words

print(build_dict(data))

Output:

{'farrah': 2, 'marjori': 1, 'splendid': 1, 'rivet': 1, 'perform': 1, 'fawcett': 1}
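As an alternative, the standard library's collections.Counter does the counting and the sorting in one place. The sketch below assumes the nested list layout from the question. The function name build_vocab, the returned word_to_id mapping, and the sample data are illustrative assumptions and do not come from the original post:

```python
from collections import Counter
from itertools import chain

def build_vocab(data, vocab_size=5000):
    """Return (sorted_words, word_to_id) for the vocab_size most frequent words."""
    # chain.from_iterable flattens the list of word lists into one stream of words
    word_count = Counter(chain.from_iterable(data))
    # most_common returns (word, count) pairs ordered from most to least frequent
    sorted_words = [w for w, _ in word_count.most_common(vocab_size)]
    # Map each kept word to its rank, so the most frequent word gets 0
    word_to_id = {w: i for i, w in enumerate(sorted_words)}
    return sorted_words, word_to_id

# Sample data (assumption): 'farrah' appears twice so it ranks first
data = [['marjori', 'splendid', 'farrah'], ['rivet', 'perform', 'farrah', 'fawcett']]

sorted_words, word_to_id = build_vocab(data, vocab_size=3)
print(sorted_words[0])       # farrah
print(word_to_id['farrah'])  # 0
```

Passing vocab_size to most_common keeps only the top entries, which is one plausible use for the vocab_size parameter that the function above accepts but never reads.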
