How can I filter the clusters produced by DBSCAN by size?

clustering = DBSCAN(eps=0.1, min_samples=20, metric='euclidean').fit(only_xy)
plt.scatter(only_xy[:, 0], only_xy[:, 1], c=clustering.labels_, cmap='rainbow')
clusters = clustering.components_
# Store the labels
labels = clustering.labels_
# Then get the frequency count of the non-negative labels
counts = np.bincount(labels[labels>=0])
print(counts)

Output:

[1278 564 208 47 36 30 191 54 24 18 40 915 26 20 24 527 56 677 63 57 61 1544 512 21 45 187 39 132 48 55 160 46 28 18 55 48 35 92 29 88 53 55 24 52 114 49 34 34 38 52 38 53 69]

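2 Answers

User

#1 · edited 2024-10-05 14:28:06

I think running the following gives you the labels of the clusters with more than 100 points, together with the core samples (components_) that belong to those clusters: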

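from collections import Counter
# Labels of clusters with more than 100 points (label -1 is noise, so skip it)
labels_with_morethan100 = [label for (label, count) in Counter(clustering.labels_).items() if label >= 0 and count > 100]
# components_ holds only the core samples, so take their labels via core_sample_indices_
core_labels = clustering.labels_[clustering.core_sample_indices_]
clusters_biggerthan100 = clustering.components_[np.isin(core_labels, labels_with_morethan100)]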
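
If the goal is to keep every point of a large cluster (core and border points alike), a boolean mask over labels_ may be more convenient than indexing components_. This is a minimal sketch that assumes only NumPy; the function name mask_large_clusters and the toy labels array are made up for illustration and stand in for clustering.labels_:

```python
import numpy as np

def mask_large_clusters(labels, min_size):
    """Return a boolean mask that is True for points whose cluster
    has at least min_size members. Noise (label -1) is always False."""
    labels = np.asarray(labels)
    clustered = labels >= 0                    # DBSCAN marks noise with -1
    counts = np.bincount(labels[clustered])    # counts[k] = size of cluster k
    mask = np.zeros(labels.shape, dtype=bool)
    mask[clustered] = counts[labels[clustered]] >= min_size
    return mask

# Toy labels standing in for clustering.labels_ (hypothetical data)
labels = np.array([0, 0, 0, 1, -1, 1, 2, 0, -1, 2, 2, 2])
mask = mask_large_clusters(labels, min_size=4)
print(mask)
```

With real data, the same mask could be applied to both the points and the labels, for example only_xy[mask] and clustering.labels_[mask].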

User

#2 · edited 2024-10-05 14:28:06

You can find the indices of the points whose cluster label occurs fewer than 100 times:

ls, cs = np.unique(labels,return_counts=True)
dic = dict(zip(ls,cs))
idx = [i for i,label in enumerate(labels) if dic[label] <100 and label >= 0]

You can then apply the resulting indices to your DBSCAN input points and labels, for example (more or less):

plt.scatter(only_xy[idx, 0], only_xy[idx, 1],
        c=clustering.labels_[idx], cmap='rainbow')
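
For large inputs, the per-point dictionary lookup can probably be replaced by a vectorized lookup. The sketch below assumes only NumPy and uses np.unique with return_inverse and return_counts; the toy labels array and the threshold value are illustrative placeholders, not values from the question:

```python
import numpy as np

# Toy labels standing in for clustering.labels_ (hypothetical data)
labels = np.array([0, 0, 0, 1, -1, 1, 2, 0, -1, 2, 2, 2])
threshold = 4  # placeholder; the answer above uses 100

# inverse maps each point to the position of its label in uniq,
# so counts[inverse] is the size of the group each point belongs to
uniq, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
per_point_count = counts[inverse]

# Indices of non-noise points whose cluster is smaller than the threshold
idx = np.flatnonzero((labels >= 0) & (per_point_count < threshold))
print(idx)
```

Flipping the comparison to >= would select the large clusters instead of the small ones.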
