分裂分析（DIANA）层次聚类

2024-10-04 01:27:21 发布

您现在位置：Python中文网/ 问答频道 /正文

9024

网友

男 | 程序猿一只，喜欢编程写python代码。

（这篇文章是我关于除法层次聚类算法的previous问题的延续。）

问题是如何用Python（或任何其他语言）实现这个算法。在

算法描述

除法聚类通过一系列连续的分裂进行。在步骤0，所有对象都在一个簇中。在每一步，一个集群被分割，直到在步骤n - 1所有数据对象都是分开的（形成n簇，每个簇都有一个对象）。在

每一步都将一个簇分成两个簇R和{}。最初，A等于R，B为空。在第一阶段，我们必须将一个对象从A移到{}。对于A的每个对象i，我们计算与A所有其他对象的平均差异：

上面等式达到最大值的对象i'将被移动，因此我们将

在下一个阶段中，我们寻找从A到{}的其他点。只要A仍然包含多个对象，我们就计算

对于A的每个对象i，我们考虑使这个数量最大化的对象{}。当上面等式的最大值严格为正时，我们将i''从A移到{}，然后在新的A中寻找可能移动的另一个对象。另一方面，当差的最大值为负或0时，我们停止该过程并完成R到{}和{}的划分。在

在除法算法的每一步，我们还必须决定要分割哪个簇。为此，我们计算直径

对于上一步之后可用的每个簇Q，并选择直径最大的簇。在

我的起始代码粘贴在下面。实际上，脚本返回ValueError: list.remove(x): x not in list。在

# Dissimilarity matrix
dm = [
[ 0, 2, 6, 10, 9],
[ 2, 0, 5,  9, 8],
[ 6, 5, 0,  4, 5],
[10, 9, 4,  0, 3],
[ 9, 8, 5,  3 ,0]]

# Create splinter and remaining group
splinter = []
remaining = range(len(dm))

# Find record ra that has the greatest average distance from the rest of the records
ra = 0
dMax = -999
for i in range(len(dm)):
  dSum = 0
  for j in range(len(dm)):
    if i == j:
      continue
    dSum += dm[i][j]
  if dMax < dSum:
    dMax = dSum
    ra = i

splinter.append(ra)
remaining.remove(ra)

# Check every record in remaining  and moves the record if record is closer to splinter
bChanged = True
while bChanged:
  bChanged = False
  for i in range(len(remaining)):
    d1 = 0.0
    for j in range(len(splinter)):
      d1 += dm[i][j]
    d1 /= float(len(splinter))
    d2 = 0.0
    for k in range(len(remaining)):
      if i == k:
        continue
      d2 += dm[i][k]
    if len(remaining) > 1:
      d2 /= (len(remaining) - 1.0)
    if d1 < d2:
      bChanged = True
      splinter.append(i)
      remaining.remove(i)
      break

Tags： the 对象 in 算法 for len if range

0条回答

目前没有回答

分裂分析（DIANA）层次聚类

相关问题更多 >

编程相关推荐

热门问题

热门文章

分裂分析（DIANA）层次聚类

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >