Changing output from binary results to frequencies in Python

Posted 2024-10-01 13:44:04


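I have combined the tokens from a number of files (batch 1) to create a master word frequency list, and I am now comparing it against a series of other files (batch 2). Originally I produced a binary output: for each word in the master word list, it outputs "1" if the word also appears in a given batch 2 file and "0" if it does not. For example: [1,0,1,1]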
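For context, a master list of this shape could be built roughly as follows. This is only a sketch: `batch1_tokens` and `master_counts` are hypothetical names, and using `collections.Counter` is an assumption rather than necessarily how the list was actually produced.

```python
from collections import Counter

# Hypothetical tokenised batch 1 files (one list of words per file).
batch1_tokens = [
    ["to", "be", "or", "not", "to", "be"],
    ["to", "sleep", "and", "to", "dream"],
]

# Accumulate word counts across every file in batch 1.
master_counts = Counter()
for tokens in batch1_tokens:
    master_counts.update(tokens)

# List of (word, frequency) pairs, most frequent first.
globalFreqSets = master_counts.most_common()

# Just the frequency element of each pair.
finalValues = [freq for _, freq in globalFreqSets]
```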

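Now I would like it to output the frequency of the words that appear. That is, if "cat" occurs 9 times in the master word frequency list and also appears in file 1 of batch 2, it should output "9" instead of "1". For example: [9,0,21,42]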

# globalFreqSets is a list of (word, frequency) pairs, e.g. ('to', 634), ('be', 604), ('and', 594)

# finalValues generates just the number element of globalFreqSets: [634, 604, 594]

output = []
for text in doc_text:
    binarySim = []
    # loop over the indices of "globalFreqSets"
    for j in range(len(globalFreqSets)):
        # only the word itself is needed here (e.g. 'patient'), hence index [0]
        master_wordlist = globalFreqSets[j][0]

        i = 0
        # looping through words in list "text"
        for sub_wordlist in text:
            i += 1
            # adds 1 to "binarySim" when the target word from the master list is present in "text"
            if master_wordlist == sub_wordlist:
                binarySim.append(1)
                # breaks when a match is found to avoid multiple entries per word
                break
            # adds 0 to "binarySim" when the end of "text" is reached without a match
            elif i == len(text):
                binarySim.append(0)
    # adding "binarySim" for this file to "output"
    output.append(binarySim)

Sorry if the formatting or wording is wrong, I am still fairly new to coding :)
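One possible way to get the frequency output described above: the smallest change to the loop would be to append `globalFreqSets[j][1]` (the count) instead of `1` when a match is found. The sketch below does the same thing a little more compactly, using a set for the membership test. The sample data and the name `freqSim` are made up for illustration, and it assumes `globalFreqSets` is a list of (word, frequency) pairs and `doc_text` is a list of word lists, as the code in the question suggests.

```python
# Hypothetical sample data standing in for the real variables.
globalFreqSets = [("to", 634), ("be", 604), ("and", 594), ("cat", 9)]
doc_text = [
    ["to", "be", "or", "not", "to", "be"],
    ["the", "cat", "and", "the", "hat"],
]

output = []
for text in doc_text:
    # A set makes "is this word in the file?" a fast lookup
    # and replaces the inner loop with its counter and break.
    words_in_file = set(text)
    # Master-list frequency if the word occurs in this file, otherwise 0.
    freqSim = [freq if word in words_in_file else 0
               for word, freq in globalFreqSets]
    output.append(freqSim)

print(output)
# [[634, 604, 0, 0], [0, 0, 594, 9]]
```

Each inner list follows the order of `globalFreqSets`, so position k in every row refers to the same master word. Note that the value is the word's count in the master list, not its count within the batch 2 file.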


0 answers

There are no answers yet.
