<p>I don't have your data, so I just generated 500,000 rows of random numbers in three columns.</p>
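<pre><code>import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.vq import kmeans2, whiten

# 500,000 rows x 3 columns of standard-normal random numbers
arr = np.random.randn(500000 * 3).reshape((500000, 3))
# kmeans2 returns (centroids, labels); 7 clusters is an arbitrary choice
centroids, labels = kmeans2(whiten(arr), 7, iter=20)
# Plot the first two columns, coloured by cluster label
plt.scatter(arr[:, 0], arr[:, 1], c=labels, alpha=0.33333)
plt.show()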
</code></pre>
<p><img src="https://i.stack.imgur.com/EE6kn.png" alt="enter image description here"/></p>
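<p>I timed it: the <code>kmeans2</code> call took 1.96 seconds on my machine, so I don't think the problem is the size of your data. Put your data into a 500000 x 3 numpy array and try <code>kmeans2</code>.</p>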
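<p>If you want to reproduce the timing yourself, something along these lines should work. This is only a rough sketch: the helper name <code>time_kmeans2</code> and its parameters are made up for illustration, it uses random data in place of yours, and the number you get will depend on your machine and SciPy version.</p>

```python
import time

import numpy as np
from scipy.cluster.vq import kmeans2, whiten


def time_kmeans2(n_rows=500000, k=7, n_iter=20, data_seed=0):
    """Time one kmeans2 run on random (n_rows x 3) data.

    Hypothetical helper for illustration; swap the random array
    for your own (n_rows x 3) numpy array to time real data.
    """
    rng = np.random.default_rng(data_seed)
    arr = rng.standard_normal((n_rows, 3))

    # whiten() rescales each column to unit variance before clustering
    scaled = whiten(arr)

    start = time.perf_counter()
    centroids, labels = kmeans2(scaled, k, iter=n_iter)
    elapsed = time.perf_counter() - start

    return elapsed, centroids, labels


if __name__ == "__main__":
    elapsed, centroids, labels = time_kmeans2()
    print(f"kmeans2 took {elapsed:.2f} s for {labels.shape[0]} rows")
```

<p>Only the <code>kmeans2</code> call is inside the timed region, so the cost of generating and whitening the data is not counted.</p>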