How do I use a loop to get the word frequencies of a list and store them in a dict?

Posted 2024-10-02 00:36:00


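I have a list named data and a dict object named word_count. Before converting the frequencies to unique integers, I want to return the dict word_count (expected format: {'marjori': 1, 'splendid': 1, ...}) and then sort the words by frequency.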

data = [['marjori',
 'splendid'],
 ['rivet',
 'perform',
 'farrah',
 'fawcett']]

def build_dict(data, vocab_size = 5000):

    word_count = {}
    for w in data:
        word_count.append(data.count(w)) ????
    #print(word_count)

    # how can I sort the words to make sorted_words[0] is the most frequently appearing word and sorted_words[-1] is the least frequently appearing word.

    sorted_words = ??

I'm new to Python. Can anyone help? Thanks in advance. (I only want to use the numpy library and for loops.)


2 answers

For each word, you need to create a dict entry if one does not exist yet, and add 1 to its value if it does:

word_count = dict()
for sentence in data:  # data is a list of word lists, so loop twice
    for w in sentence:
        if w in word_count:
            word_count[w] += 1
        else:
            word_count[w] = 1

You can then sort the dictionary by value:

word_count = {k: v for k, v in sorted(word_count.items(), key=lambda item: item[1], reverse=True)}
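Putting the counting and sorting steps together for nested data like the question's, a minimal sketch might look like the following. It uses only for loops and built-ins. The helper name count_words is hypothetical, a second 'farrah' has been added to the sample data so that the counts differ, and sorted_words is built as a list so that sorted_words[0] is the most frequent word.

```python
def count_words(data):
    """Count words in a list of word lists (hypothetical helper, not from the answer)."""
    word_count = {}
    for sentence in data:
        for w in sentence:
            # get() returns 0 for a word that has not been seen yet
            word_count[w] = word_count.get(w, 0) + 1
    return word_count


# Sample data modelled on the question; the extra 'farrah' is an assumption.
data = [['marjori', 'splendid'],
        ['rivet', 'perform', 'farrah', 'fawcett', 'farrah']]

word_count = count_words(data)

# Sort the words by count, highest first. sorted() is stable, so words with
# equal counts stay in the order in which they were first seen.
sorted_words = sorted(word_count, key=word_count.get, reverse=True)

print(sorted_words[0])   # farrah
```

Keeping sorted_words as a list rather than a dict makes positional access such as sorted_words[-1] possible.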

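The last part of your code is unclear, but if you just want to count the words, put them in a dictionary, and sort them by descending frequency, I suggest using defaultdict, like this: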

from collections import defaultdict

data = ['marjori',
 'splendid',
 'rivet',
 'farrah',
 'perform',
 'farrah',
 'fawcett']

def build_dict(data, vocab_size = 5000):
    """Count the words in data and return a dict sorted by descending frequency."""

    # Maps each word to how often it occurs; missing keys start at 0
    word_count = defaultdict(int)
    for w in data:
        word_count[w] += 1

    # Most frequent word first (dicts keep insertion order in Python 3.7+).
    # vocab_size is not used in this step.
    sorted_words = {k: v for k, v in sorted(word_count.items(), key=lambda item: item[1], reverse=True)}

    return sorted_words

print(build_dict(data))

Output:

{'farrah': 2, 'marjori': 1, 'splendid': 1, 'rivet': 1, 'perform': 1, 'fawcett': 1}
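The question mentions eventually converting the frequencies into unique integers. A rough sketch of that last step is shown here, under assumptions the original does not state: numbering starts at 0, the most frequent word gets the smallest integer, and only the top vocab_size words are kept. The name build_word_index is hypothetical.

```python
from collections import defaultdict


def build_word_index(data, vocab_size=5000):
    """Map the vocab_size most frequent words to 0, 1, 2, ... (hypothetical helper)."""
    word_count = defaultdict(int)
    for w in data:
        word_count[w] += 1

    # Most frequent first; words with equal counts keep first-seen order.
    sorted_words = sorted(word_count, key=word_count.get, reverse=True)

    word_index = {}
    for idx, word in enumerate(sorted_words[:vocab_size]):
        word_index[word] = idx  # assumption: numbering starts at 0
    return word_index


data = ['marjori', 'splendid', 'rivet', 'farrah', 'perform', 'farrah', 'fawcett']
print(build_word_index(data, vocab_size=3))
# {'farrah': 0, 'marjori': 1, 'splendid': 2}
```

If some integers need to be reserved (for padding or unknown words, say), the starting value passed to enumerate can be shifted accordingly.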
