Getting the nearest cen more efficiently

import numpy as np

def getNearestCenter(data, centers):
    if centers.shape != (1, 2):
        # Euclidean distance between the data point and every center
        dist_ = np.sqrt(np.sum(np.power(data - centers, 2), axis=1))
        # Pick the center with the minimum distance to the data point
        center = centers[np.argmin(dist_)]
    else:
        # Only one center, so it is the nearest by definition
        center = centers[0]
    return center

def getLabel(dataPoint, C, history):
    labels = []
    cluster = getNearestCenter(dataPoint.data, C)
    for x in history:
        if np.all(getNearestCenter(x.data, C) == cluster):
            labels.append(x.true_label)
    return labels

2 answers

User

Answer 1 · Edited 2024-10-01 02:40:35

You should use the optimized cdist from scipy.spatial.distance, which is more efficient than computing the distances with plain numpy:

from scipy.spatial.distance import cdist

dist = cdist(data, C, metric='euclidean')
dist_idx = np.argmin(dist, axis=1)
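As a rough illustration of the cdist approach, here is a minimal sketch on made-up data. The sample points and centers are hypothetical and not from the question. It assumes data is a 2-D array of shape (n_points, n_dims) and C is a 2-D array of shape (n_centers, n_dims).

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical sample data: 4 points and 2 centers in 2-D
data = np.array([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0], [10.0, 10.0]])
C = np.array([[0.5, 0.5], [9.5, 9.5]])

# Pairwise distances: one row per data point, one column per center
dist = cdist(data, C, metric='euclidean')

# For each data point, the column index of the closest center
dist_idx = np.argmin(dist, axis=1)

# Coordinates of the nearest center for each data point
nearest = C[dist_idx]
```

Because every point is handled in one call, the per-point Python loop in getLabel could in principle be replaced by comparing entries of dist_idx.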

A more elegant solution is to use scipy.spatial.cKDTree (as @Saullo Castro pointed out in the comments), which may be faster for large datasets:

from scipy.spatial import cKDTree

tr = cKDTree(C)
dist, dist_idx = tr.query(data, k=1)
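To check that the tree query gives the same result as a brute-force search, a small sketch on random data might look like this. The array sizes and the random seed are arbitrary assumptions chosen only for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical random data: 1000 points and 5 centers in 2-D
rng = np.random.default_rng(0)
data = rng.random((1000, 2))
C = rng.random((5, 2))

# Build the tree on the centers, then query it with all data points at once
tr = cKDTree(C)
dist, dist_idx = tr.query(data, k=1)

# Brute-force nearest distance for comparison
brute_dist = np.sqrt(np.min(np.sum((data[:, None] - C) ** 2, axis=2), axis=1))
```

With k=1, query returns two 1-D arrays: the distance to the nearest center and that center's index in C. The tree is built once on the centers, so it only needs rebuilding when the centers change.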

User

Answer 2 · Edited 2024-10-01 02:40:35

Found it:

dist_ = np.argmin(np.sqrt(np.sum(np.power(data[:, None]-C,2),axis=2)),axis=1)

For each data point in data, it should return the index of the nearest center in centers.
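The one-liner relies on numpy broadcasting. The sketch below spells out the intermediate shapes on made-up data; the sample values are hypothetical. It also shows that the square root can be dropped when only the index is needed, since it does not change which distance is smallest.

```python
import numpy as np

# Hypothetical sample data: 3 points and 2 centers in 2-D
data = np.array([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0]])
C = np.array([[0.5, 0.5], [9.5, 9.5]])

# data[:, None] has shape (3, 1, 2); subtracting C (2, 2) broadcasts to (3, 2, 2)
diff = data[:, None] - C

# Squared distance from every point to every center, shape (3, 2)
sq_dist = np.sum(diff ** 2, axis=2)

# argmin over the center axis; sqrt is monotonic, so both give the same indices
idx_sq = np.argmin(sq_dist, axis=1)
idx_full = np.argmin(np.sqrt(sq_dist), axis=1)
```

Note that the intermediate array holds n_points × n_centers × n_dims values, so for very large inputs this approach can use a lot of memory.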
