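<p>I have a set of points on a map and I need to group them into clusters of equal size; the last cluster may hold the remaining <code>count % n</code> points. I have read answers <a href="https://stats.stackexchange.com/questions/8744/clustering-procedure-where-each-cluster-has-an-equal-number-of-points">1</a>, <a href="https://github.com/ndanielsen/Same-Size-K-Means/blob/master/tests/test_equal_groups.py" rel="nofollow noreferrer">2</a>, and <a href="https://stackoverflow.com/questions/5452576/k-means-algorithm-variation-with-equal-cluster-size">3</a>, but they did not help. I have tried several approaches and none of them worked. In the code below I set <code>n_clusters=4</code> because that is the optimal number of clusters; the idea is to sort the points by cluster, take the best <code>n</code> points from the sorted result, and then repeat until all points have been processed. For example, the <code>32</code> points shown in the figure should be split into <code>4</code> clusters of <code>8</code> points each.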
<a href="https://i.stack.imgur.com/pBpRz.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/pBpRz.png" alt="enter image description here"/></a></p>
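<p>To make the size requirement concrete, here is a minimal sketch of the cluster sizes I am after. The function name <code>cluster_sizes</code> is only illustrative and is not part of my actual code; it assumes every cluster holds <code>n</code> points except possibly the last one, which holds the remainder.</p>

```python
def cluster_sizes(count, n):
    """Return the target size of each cluster: full clusters of n, then the remainder."""
    full, rest = divmod(count, n)
    # The last cluster only exists when count is not a multiple of n.
    return [n] * full + ([rest] if rest else [])


print(cluster_sizes(32, 8))  # [8, 8, 8, 8]
print(cluster_sizes(35, 8))  # [8, 8, 8, 8, 3]
```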
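<pre><code>from pandas import DataFrame
from sklearn.cluster import KMeans

n = 8  # points per cluster (8 in the example above)
# position holds the (x, y) coordinates of the map points
dfcluster = DataFrame(position, columns=['x', 'y'])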
kmeans = KMeans(n_clusters=4).fit(dfcluster)
centroids = kmeans.cluster_centers_
# plt.scatter(dfcluster['x'], dfcluster['y'], c=kmeans.labels_.astype(float), s=50, alpha=0.5)
# plt.scatter(centroids[:, 0], centroids[:, 1], c='red', s=50)
# plt.show()
dfcluster['cluster'] = kmeans.labels_
dfcluster=dfcluster.drop_duplicates(['x', 'y'], keep='last')
dfcluster = dfcluster.sort_values(['cluster', 'x', 'y'], ascending=True)
# d=pd.DataFrame()
# m = pd.DataFrame()
# n=8
# for x in range(4) :
# m= dfcluster[dfcluster.cluster == x]
#
#
# if len(m) > int( n /2)-1:
# m=m.head(int(n/2)-1)
# # for idx, row in m.iterrows():
# # print("code3 group", "=", row['cluster'])
# d=d.append(m,ignore_index = True)
#
# else :
# d=d.append(m,ignore_index = True)
#
#
# if len(d)>=n:
# dfcluster = d
# dfcluster.groupby('cluster').nth(n))
dfcluster = dfcluster.head(n)
i = 0
if len(dfcluster) < n:
    change_df()  # helper not shown in the question
</code></pre>
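<p>For reference, the behaviour I am looking for could be sketched roughly as follows. This is only an illustration under stated assumptions, not a working solution from my project: <code>assign_equal_size</code> is a hypothetical helper, the centroids could be taken from <code>kmeans.cluster_centers_</code> above, and the greedy strategy (closest pairs first, with a per-cluster capacity) is a simple heuristic that does not guarantee the most compact clusters.</p>

```python
import math


def assign_equal_size(points, centroids, capacity):
    """Greedily assign each point to a nearby centroid, at most `capacity` points per centroid."""
    if capacity * len(centroids) < len(points):
        raise ValueError("not enough total capacity for all points")
    # Every (distance, point index, centroid index) triple, closest pairs first.
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centroids)
    )
    labels = [None] * len(points)
    load = [0] * len(centroids)
    for _, i, j in pairs:
        # Skip points that already have a cluster and clusters that are full.
        if labels[i] is None and load[j] < capacity:
            labels[i] = j
            load[j] += 1
    return labels


# Toy data: two groups plus one point in between, two clusters of 4 points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (5, 5)]
centroids = [(0.5, 0.5), (10.5, 10.5)]
labels = assign_equal_size(points, centroids, capacity=4)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

<p>In the toy example the point <code>(5, 5)</code> is slightly closer to the first centroid, but that cluster is already full, so it lands in the second one. That is the trade-off I would accept in exchange for equal sizes.</p>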