Hierarchical clustering with a precomputed cosine similarity matrix using scikit-learn

Posted 2024-07-05 14:43:06


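We want to combine cosine similarity with hierarchical clustering, and we have already computed the cosine similarities. The documentation for sklearn.cluster.AgglomerativeClustering says: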

A distance matrix (instead of a similarity matrix) is needed as input for the fit method.

So we convert the cosine similarities to distances:

distance = 1 - similarity
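The conversion above can be sketched in NumPy as follows. This is only an illustration: the helper name `similarity_to_distance` and the toy feature vectors are assumptions, not part of the original post. Clipping and resetting the diagonal guard against small floating-point errors that can otherwise leave slightly negative distances.

```python
import numpy as np


def similarity_to_distance(similarity):
    """Turn a cosine similarity matrix into a cosine distance matrix.

    Hypothetical helper for illustration; not from the original post.
    """
    distance = 1.0 - np.asarray(similarity, dtype=float)
    # Cosine similarity lies in [-1, 1], so the distance lies in [0, 2].
    # Rounding error can push values slightly outside that range.
    np.clip(distance, 0.0, 2.0, out=distance)
    # Every item is at distance 0 from itself.
    np.fill_diagonal(distance, 0.0)
    return distance


# Toy data (assumed): three 2-D vectors with non-zero norms.
features = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
unit = features / np.linalg.norm(features, axis=1, keepdims=True)
similarity = unit @ unit.T  # cosine similarity of every pair of rows

X = similarity_to_distance(similarity)
print(X)
```

The resulting `X` is symmetric with a zero diagonal, which is the form a precomputed distance matrix is expected to have.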

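Our Python code raises an error at the fit() call at the end. (I have not written out the actual values of X in the code because it is very large. X is simply the cosine similarity matrix with its values converted to distances as described above. Note that the diagonal is all 0.) Here is the code: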

[code block not preserved in this copy of the post]

The error is:

runfile('/Users/stackoverflowuser/Desktop/4.2/Pr/untitled0.py', wdir='/Users/stackoverflowuser/Desktop/4.2/Pr')
Traceback (most recent call last):

  File "<ipython-input-1-b8b98765b168>", line 1, in <module>
    runfile('/Users/stackoverflowuser/Desktop/4.2/Pr/untitled0.py', wdir='/Users/stackoverflowuser/Desktop/4.2/Pr')

  File "/anaconda2/lib/python2.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 704, in runfile
    execfile(filename, namespace)

  File "/anaconda2/lib/python2.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 100, in execfile
    builtins.execfile(filename, *where)

  File "/Users/stackoverflowuser/Desktop/4.2/Pr/untitled0.py", line 84, in <module>
    cluster.fit(X)

  File "/anaconda2/lib/python2.7/site-packages/sklearn/cluster/hierarchical.py", line 795, in fit
    (self.affinity, ))

ValueError: precomputed was provided as affinity. Ward can only work with euclidean distances.

Is there anything else I can provide? Thanks in advance.


1 Answer

Answer 1 · Posted 2024-07-05 14:43:06

According to the sklearn documentation:

If linkage is “ward”, only “euclidean” is accepted. If “precomputed”, a distance matrix (instead of a similarity matrix) is needed as input for the fit method.

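Ward linkage therefore cannot be used with a precomputed distance matrix. You need to change the linkage to "complete", "average", or "single".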
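A minimal sketch of that change, assuming a small made-up distance matrix in place of the asker's real X. One caveat: the keyword for passing "precomputed" is `affinity` in older scikit-learn releases (such as the one in the traceback) and `metric` in newer ones, so the sketch checks the constructor signature instead of assuming a version.

```python
import inspect

import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Toy precomputed distance matrix (assumed data, not the asker's X):
# items 0 and 1 are close to each other, as are items 2 and 3.
X = np.array([
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.05],
    [0.80, 0.90, 0.05, 0.00],
])

# Older releases call the parameter "affinity", newer ones "metric".
params = inspect.signature(AgglomerativeClustering).parameters
metric_kwarg = "metric" if "metric" in params else "affinity"

cluster = AgglomerativeClustering(
    n_clusters=2,
    linkage="average",  # "complete" or "single" also accept precomputed input
    **{metric_kwarg: "precomputed"},
)
labels = cluster.fit_predict(X)
print(labels)
```

Which of the three linkages suits the data is a separate modelling choice; "average" is used here only as an example.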

Answer source: https://datascience.stackexchange.com/questions/51970/hierarchical-clustering-with-precomputed-cosine-similarity-matrix-using-scikit-l/
