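<p>You should use the optimized <code>cdist</code> from <code>scipy.spatial</code>. It runs in compiled code and is more efficient than computing the distances with NumPy, where a naive broadcasting approach has to materialize a large intermediate array.</p>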
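<pre><code>import numpy as np
from scipy.spatial.distance import cdist

# dist[i, j] is the Euclidean distance from data[i] to C[j]
dist = cdist(data, C, metric='euclidean')
# index of the nearest row of C for each row of data
dist_idx = np.argmin(dist, axis=1)
</code></pre>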
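<p>To make the shapes concrete, here is a minimal sketch on toy inputs. The arrays <code>data</code> and <code>C</code> below are made up for illustration (the question's real data is not shown), and the NumPy broadcasting line is only there to show what <code>cdist</code> computes, not as a recommended alternative.</p>

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical toy inputs: 4 points and 2 centers in 2-D.
data = np.array([[0.0, 0.0], [1.0, 0.0], [9.0, 9.0], [10.0, 10.0]])
C = np.array([[0.0, 0.0], [10.0, 10.0]])

# Pairwise Euclidean distances, shape (len(data), len(C)).
dist = cdist(data, C, metric='euclidean')
# Index of the nearest center for each point.
dist_idx = np.argmin(dist, axis=1)

# The same distances via broadcasting; this builds an intermediate
# array of shape (len(data), len(C), n_dims) before reducing it.
dist_np = np.linalg.norm(data[:, None, :] - C[None, :, :], axis=2)

print(dist_idx)
print(np.allclose(dist, dist_np))
```

<p>The intermediate array in the broadcasting version grows with the product of both input sizes and the dimensionality, which is one reason <code>cdist</code> tends to be the better choice.</p>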
<p>A more elegant solution is to use <code>scipy.spatial.cKDTree</code> (as @Saullo Castro pointed out in the comments), which may be faster for large datasets.</p>
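<pre><code>from scipy.spatial import cKDTree

# build the tree once over the candidate points C
tr = cKDTree(C)
# for each row of data: distance to, and index of, its nearest row of C
dist, dist_idx = tr.query(data, k=1)
</code></pre>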
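<p>As a sanity check, the sketch below compares the tree query against the brute-force <code>cdist</code> result. The random inputs, their sizes, and the seed are assumptions chosen only for illustration. With <code>k=1</code> the query returns just the nearest distance per point, whereas <code>cdist</code> returns the full distance matrix.</p>

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist

# Hypothetical inputs: 1000 random points and 8 random centers in 3-D.
rng = np.random.default_rng(0)
data = rng.random((1000, 3))
C = rng.random((8, 3))

# Tree-based nearest-neighbour lookup.
tree = cKDTree(C)
tree_dist, tree_idx = tree.query(data, k=1)

# Brute-force reference: full distance matrix, then argmin per row.
full = cdist(data, C, metric='euclidean')
brute_idx = np.argmin(full, axis=1)
brute_dist = full[np.arange(len(data)), brute_idx]

print(np.array_equal(tree_idx, brute_idx))
print(np.allclose(tree_dist, brute_dist))
```

<p>Whether the tree is actually faster depends on the number of points and the dimensionality, so it is worth timing both on your own data.</p>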