I'm trying to cluster 3D points, given their coordinates, using the DBSCAN algorithm in Python.

For example, the given coordinates look like this:

     X        Y        Z

 [-37.530   3.109  -16.452]
 [ 40.247   5.483  -15.209]
 [-31.920  12.584  -12.916]
 [-32.760  14.072  -13.749]
 [-37.100   1.953  -15.720]
 [-32.143  12.990  -13.488]
 [-41.077   4.651  -15.651]
 [-34.219  13.611  -13.090]
 [-33.117  15.875  -13.738]
 etc.

I'm fairly new to programming and am looking for an example script that shows how to write this. Can someone give a suggestion or an example? Thanks a lot in advance.

sentence
bob

1 Answer

You can use sklearn.cluster.DBSCAN. In your case:

import numpy as np
import matplotlib.pyplot as plt
#%matplotlib inline
from mpl_toolkits.mplot3d import Axes3D
from sklearn.cluster import DBSCAN

data = np.array([[-37.530, 3.109, -16.452],
                [40.247, 5.483, -15.209],
                [-31.920, 12.584, -12.916],
                [-32.760, 14.072, -13.749],
                [-37.100, 1.953, -15.720],
                [-32.143, 12.990, -13.488],
                [-41.077, 4.651, -15.651], 
                [-34.219, 13.611, -13.090],
                [-33.117, 15.875, -13.738]])

fig = plt.figure()
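# Axes3D(fig) no longer attaches itself to the figure in recent Matplotlib,
# so create the 3D axes through the figure instead
ax = fig.add_subplot(projection='3d')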
ax.scatter(data[:,0], data[:,1], data[:,2], s=300)
ax.view_init(azim=200)
plt.show()

model = DBSCAN(eps=2.5, min_samples=2)
# fit_predict fits the model and returns one label per point (-1 marks noise)
pred = model.fit_predict(data)

fig = plt.figure()
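# Axes3D(fig) no longer attaches itself to the figure in recent Matplotlib,
# so create the 3D axes through the figure instead
ax = fig.add_subplot(projection='3d')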
ax.scatter(data[:,0], data[:,1], data[:,2], c=model.labels_, s=300)
ax.view_init(azim=200)
plt.show()

# Note: this count includes the noise label -1 whenever any point is marked as noise
print("number of cluster found: {}".format(len(set(model.labels_))))
print('cluster for each point: ', model.labels_)

Output

  • before clustering

[image: 3D scatter plot of the points]

  • after clustering

[image: 3D scatter plot of the points, colored by cluster label]

number of cluster found: 3
cluster for each point:  [ 0 -1  1  1  0  1 -1  1  1]
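
In scikit-learn's DBSCAN the label -1 marks noise points rather than a cluster, so the count of 3 printed above includes the noise label: the labels show two actual clusters (0 and 1) plus two noise points. A minimal sketch of counting clusters without noise and grouping the points by label, assuming the same data, eps, and min_samples as above (the names labels, n_clusters, n_noise, and clusters are illustrative, not part of the original answer):

```python
import numpy as np
from sklearn.cluster import DBSCAN

data = np.array([[-37.530, 3.109, -16.452],
                 [40.247, 5.483, -15.209],
                 [-31.920, 12.584, -12.916],
                 [-32.760, 14.072, -13.749],
                 [-37.100, 1.953, -15.720],
                 [-32.143, 12.990, -13.488],
                 [-41.077, 4.651, -15.651],
                 [-34.219, 13.611, -13.090],
                 [-33.117, 15.875, -13.738]])

# Same parameters as in the answer above
labels = DBSCAN(eps=2.5, min_samples=2).fit_predict(data)

# Label -1 marks noise, so exclude it when counting clusters
n_clusters = len(set(labels) - {-1})
n_noise = int(np.sum(labels == -1))

# Group the coordinates by cluster label (noise points are skipped)
clusters = {int(k): data[labels == k] for k in set(labels) if k != -1}

print("clusters:", n_clusters, "noise points:", n_noise)
for k, pts in sorted(clusters.items()):
    print("cluster {}: {} points".format(k, len(pts)))
```

Each entry of clusters is an array of the coordinates that belong to that cluster, which may be handy if you want to process the clusters one at a time.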
sentence
  • In addition, you may consider using the OPTICS algorithm which performs better on larger datasets: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.OPTICS.html#sklearn.cluster.OPTICS – Vincent Cadoret Nov 03 '20 at 16:13
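
A minimal sketch of the OPTICS suggestion from the comment, assuming the same nine points as in the answer. The choice of cluster_method="dbscan" with eps=2.5 is an assumption made here so that the result is comparable to the DBSCAN run; it does not come from the answer or the comment, and OPTICS uses the "xi" method by default:

```python
import numpy as np
from sklearn.cluster import OPTICS

data = np.array([[-37.530, 3.109, -16.452],
                 [40.247, 5.483, -15.209],
                 [-31.920, 12.584, -12.916],
                 [-32.760, 14.072, -13.749],
                 [-37.100, 1.953, -15.720],
                 [-32.143, 12.990, -13.488],
                 [-41.077, 4.651, -15.651],
                 [-34.219, 13.611, -13.090],
                 [-33.117, 15.875, -13.738]])

# Assumed settings: extract DBSCAN-like clusters at eps=2.5 from the OPTICS ordering
optics = OPTICS(min_samples=2, cluster_method="dbscan", eps=2.5)
optics_labels = optics.fit_predict(data)

# As with DBSCAN, the label -1 marks noise
print("cluster for each point:", optics_labels)
# reachability_ holds the reachability distance of each point
print("reachability:", np.round(optics.reachability_, 3))
```

Because OPTICS computes an ordering once, other eps values can be tried afterwards without refitting, for example with sklearn.cluster.cluster_optics_dbscan.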