I have 300 collection points and I need to cluster them based on their geographic coordinates. Every cluster should have an upper limit of 8 points and a lower limit of 5. How can I do that in Python?
-
Please share the required output and explain what you want. – vinsent paramanantham Sep 02 '20 at 09:35
-
I want output like this:
Latitude Longitude Route Code
18.2521536 76.4982399 Cluster_01
18.2526484 76.4976308 Cluster_01
18.2526006 76.4972857 Cluster_01
18.2533365 76.4975484 Cluster_01
18.2535941 76.4987773 Cluster_01
18.2535462 76.4986933 Cluster_01
18.2503783 76.5116291 Cluster_02
18.2512383 76.5085317 Cluster_02
18.2506268 76.5082113 Cluster_02
18.2516204 76.5064285 Cluster_02
I have 300 such coordinates, which have to be clustered with a maximum cluster size of 8 and a minimum of 6. – Sugumar S Sep 07 '20 at 11:00
1 Answer
0
My question answers your question. You need to replace position with your geographic coordinate data, and x, y with latitude and longitude.
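from pandas import DataFrame
from sklearn.cluster import KMeans

# position: your list of (latitude, longitude) pairs
dfcluster = DataFrame(position, columns=['x', 'y'])
kmeans = KMeans(n_clusters=4).fit(dfcluster)
centroids = kmeans.cluster_centers_
# Optional plot (needs: import matplotlib.pyplot as plt)
# plt.scatter(dfcluster['x'], dfcluster['y'], c=kmeans.labels_.astype(float), s=50, alpha=0.5)
# plt.scatter(centroids[:, 0], centroids[:, 1], c='red', s=50)
# plt.show()
dfcluster['cluster'] = kmeans.labels_
# Drop repeated coordinates, then order the rows by cluster
dfcluster = dfcluster.drop_duplicates(['x', 'y'], keep='last')
dfcluster = dfcluster.sort_values(['cluster', 'x', 'y'], ascending=True)
# Take the first 8 and the last 5 rows of the sorted frame
n = 8
dfcluster1 = dfcluster.head(n)
n = 5
dfcluster2 = dfcluster.tail(n)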
Also, for equal-sized groups, use the Size Constrained Clustering solver.
Start with pip install size-constrained-clustering
or pip install git+https://github.com/jingw2/size_constrained_clustering.git
and then you can use minmax flow or heuristics:
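import numpy as np
from size_constrained_clustering import equal

n_samples = 2000
n_clusters = 3
X = np.random.rand(n_samples, 2)  # random demo data; use your coordinates here
# Solve with the minimum cost flow formulation
model = equal.SameSizeKMeansMinCostFlow(n_clusters)
# Or solve with the heuristics method instead
#model = equal.SameSizeKMeansHeuristics(n_clusters)
model.fit(X)
centers = model.cluster_centers_
labels = model.labels_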
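If you would rather not install anything, here is a minimal pure-Python sketch of a size-bounded clustering for the 5 to 8 case in the question. It is not the algorithm used by the Size Constrained Clustering package; it is a simple greedy, capacity-limited k-means that I am assuming is good enough for a few hundred points. The names haversine_km and bounded_clusters are hypothetical helpers, and the 300 sample points are made-up random coordinates near the ones in the question.

```python
import math
import random

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def bounded_clusters(points, min_size=5, max_size=8, n_iter=10, seed=0):
    """Greedy capacity-limited k-means; returns one cluster label per point."""
    n = len(points)
    k = math.ceil(n / max_size)      # fewest clusters that respect max_size
    base, extra = divmod(n, k)       # `extra` clusters hold base + 1 points
    if base < min_size:
        raise ValueError("no split of the points satisfies the size bounds")
    capacity = [base + 1 if c < extra else base for c in range(k)]
    centers = random.Random(seed).sample(points, k)
    labels = [-1] * n
    for _ in range(n_iter):
        # Visit every (point, center) pair from closest to farthest.
        pairs = sorted((haversine_km(p, c), i, j)
                       for i, p in enumerate(points)
                       for j, c in enumerate(centers))
        room = capacity[:]
        labels = [-1] * n
        for _, i, j in pairs:
            if labels[i] == -1 and room[j] > 0:
                labels[i] = j        # nearest center that still has room
                room[j] -= 1
        # Move each center to the mean of its members (fine for small areas).
        for j in range(k):
            members = [points[i] for i in range(n) if labels[i] == j]
            centers[j] = (sum(m[0] for m in members) / len(members),
                          sum(m[1] for m in members) / len(members))
    return labels

# Made-up sample data: 300 random points near the coordinates in the question.
rng = random.Random(1)
points = [(18.25 + rng.uniform(-0.05, 0.05), 76.50 + rng.uniform(-0.05, 0.05))
          for _ in range(300)]
labels = bounded_clusters(points)
route_codes = [f"Cluster_{lab + 1:02d}" for lab in labels]
```

Because the total capacity equals the number of points, every cluster ends up exactly full, so each size is guaranteed to fall between min_size and max_size. The trade-off is that the greedy assignment is not optimal, so some points may land in a cluster that is not their nearest one.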

I_Al-thamary