What is the correct way to compute the cosine similarity of two word-frequency dictionaries in Python?

2 answers

User

#1 · edited 2024-06-02 09:39:01

Using pandas and scipy:

import pandas as pd
from scipy.spatial.distance import cosine

line_dict = {'Karl': 1, 'Donald': 1, 'Ifwerson': 1, 'Trump': 0}
query_dict = {'Karl': 0, 'Donald': 1, 'Ifwerson': 0, 'Trump': 1}

line_s = pd.Series(line_dict)
query_s = pd.Series(query_dict)

print(1 - cosine(line_s, query_s))

This code prints 0.40824829046386291.
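As a sanity check, the value can be worked out by hand from the definition of cosine similarity, treating the two dictionaries as vectors a = (1, 1, 1, 0) and b = (0, 1, 0, 1) over the keys (Karl, Donald, Ifwerson, Trump). Note that scipy's cosine returns the cosine distance, which is why the code subtracts it from 1.

```latex
\[
\cos(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{1 \cdot 0 + 1 \cdot 1 + 1 \cdot 0 + 0 \cdot 1}{\sqrt{3} \, \sqrt{2}}
  = \frac{1}{\sqrt{6}} \approx 0.4082
\]
```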

I'm not sure what you mean by "order", so I haven't addressed that part, but this code should be a good starting point for you.

User

#2 · edited 2024-06-02 09:39:01

You don't need to order the dictionaries to compute cosine similarity; a simple lookup is enough:

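import math

def cosine_dic(dic1, dic2):
    # Dot product and squared norm of dic1, accumulated in one pass
    numerator = 0
    dena = 0
    for key1, val1 in dic1.items():
        # Keys missing from dic2 contribute 0 to the dot product
        numerator += val1 * dic2.get(key1, 0.0)
        dena += val1 * val1
    # Squared norm of dic2
    denb = 0
    for val2 in dic2.values():
        denb += val2 * val2
    return numerator / math.sqrt(dena * denb)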

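You simply use .get(key1, 0.0) to look up whether the element exists; if it doesn't, 0.0 is assumed. As a result, neither dic1 nor dic2 needs to store entries whose value is 0.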
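To illustrate the point about sparse dictionaries, here is a minimal sketch that calls the function on inputs with the zero counts left out. The sample dictionaries are borrowed from the first answer, and the guard for an empty or all-zero dictionary is my own assumption about how you might want to handle that case, not part of the code above.

```python
import math

def cosine_dic(dic1, dic2):
    """Cosine similarity of two sparse word-frequency dicts."""
    # Missing keys are looked up with a default of 0.0
    numerator = sum(val * dic2.get(key, 0.0) for key, val in dic1.items())
    dena = sum(val * val for val in dic1.values())
    denb = sum(val * val for val in dic2.values())
    if dena == 0 or denb == 0:
        # Assumption: an empty or all-zero dictionary yields similarity 0.0
        return 0.0
    return numerator / math.sqrt(dena * denb)

# Sparse inputs: words with a count of zero are simply not stored
line_dict = {'Karl': 1, 'Donald': 1, 'Ifwerson': 1}
query_dict = {'Donald': 1, 'Trump': 1}

similarity = cosine_dic(line_dict, query_dict)
print(similarity)
```

The result should match the pandas/scipy version, since dropping the zero entries changes neither the dot product nor the norms.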

To answer your other questions:

How can I set the key-value pairs and have access to them afterwards?

You simply write:

dic[key] = value

How can I increment the value of a certain key?

If you are sure the key is already in the dictionary:

dic[key] += 1

Otherwise, you can use:

dic[key] = dic.get(key,0)+1
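As a rough sketch of how that increment pattern might be used to build a word-frequency dictionary from text: the function name word_frequencies and the sample sentence are made up for illustration, and splitting on whitespace is an assumption about how your words are tokenized.

```python
def word_frequencies(text):
    """Count how often each whitespace-separated word occurs."""
    dic = {}
    for word in text.split():
        # Works whether or not the word has been seen before
        dic[word] = dic.get(word, 0) + 1
    return dic

freq = word_frequencies("Donald Trump Donald")
print(freq)
```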

Or is there any other more easier way to do this?

You can use Counter (from the collections module), which is essentially a dictionary with some extra functionality.
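For example, here is a small sketch of the same similarity computed with collections.Counter. The sample word lists are hypothetical. A Counter returns 0 for a missing key without inserting it, so no .get default is needed.

```python
import math
from collections import Counter

# Build the frequency tables directly from word lists
line_counts = Counter("Karl Donald Ifwerson".split())
query_counts = Counter("Donald Trump".split())

# Missing words count as 0 in a Counter lookup
dot = sum(count * query_counts[word] for word, count in line_counts.items())
norm_line = math.sqrt(sum(c * c for c in line_counts.values()))
norm_query = math.sqrt(sum(c * c for c in query_counts.values()))

similarity = dot / (norm_line * norm_query)
print(similarity)
```

Incrementing also gets simpler: with a Counter, counts[word] += 1 works even when the word has not been seen yet.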
