I am implementing the k-means algorithm using NumPy.

I have a NumPy array named distances that looks like this:

[[ 5.  1.  1.  1.  2.  1.  3.  1.  1.  1.]
 [ 5.  4.  4.  5.  7. 10.  3.  2.  1.  0.]
 [ 3.  1.  1.  1.  2.  2.  3.  1.  1.  1.]
 [ 6.  8.  8.  1.  3.  4.  3.  7.  1.  1.]
 [ 4.  1.  1.  3.  2.  1.  3.  1.  1.  1.]
 [ 8. 10. 10.  8.  7. 10.  9.  7.  1.  0.]
 [ 1.  1.  1.  1.  2. 10.  3.  1.  1.  0.]
 [ 2.  1.  2.  1.  2.  1.  3.  1.  1.  1.]
 [ 2.  1.  1.  1.  2.  1.  1.  1.  5.  1.]
 [ 4.  2.  1.  1.  2.  1.  2.  1.  1.  1.]]

The first 9 columns are the features of each data point, and the last column is the cluster that the data point is assigned to for the randomly initialized centroids.

I would like the last column to contain all of the values 0, 1 and 2. In the array above, it only contains 0 and 1. In that case I intend to change half of the occurrences of the most common value in the last column to 2.

k=3
for c in range(k):
    if c not in distances[:, -1]:
        x = np.bincount(distances[:,-1]).argmax()
        distances[:len(distances[distances[:,-1]==x])/2,-1][distances[:,-1] == x] = c

However, this is not working. Can someone help me fix this problem?

Error: IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 10

R_Moose
  • Does the [`edited answer`](https://stackoverflow.com/a/55608919/) to your previous question answer this one? – Divakar Apr 11 '19 at 04:33
  • No, although it helped me compute a new centroid array. Some of those rows are 0 because in my distances matrix I don't see some of the cluster values. – R_Moose Apr 11 '19 at 06:45
  • I have changed the question to a specific problem. Can you help me now, please? – R_Moose Apr 11 '19 at 08:49

1 Answer

I think this might help you
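
As far as I can tell, the IndexError in the question comes from chaining a slice with a boolean mask that was built from the full column, so the slice has fewer elements than the mask. Here is a minimal sketch that reproduces it. The slice length n = 3 is a hypothetical value chosen for illustration, and labels is just the last column of the array in the question:

```python
import numpy as np

# Last column of the array from the question
labels = np.array([1., 0., 1., 1., 1., 0., 0., 1., 1., 1.])

x = 1.0             # the most common label
mask = labels == x  # boolean mask over all 10 rows
n = 3               # hypothetical slice length, for illustration only

print(labels[:n].shape, mask.shape)  # the slice is shorter than the mask

try:
    # A 3-element slice cannot be indexed with a 10-element boolean mask
    labels[:n][mask] = 2
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

So the mask and the array it is applied to need to have the same length, or you index the full column with integer row positions instead.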

Assuming distance is the variable that holds the array:

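import numpy as np
# Most common label in the last column
labels, counts = np.unique(distance[:, -1], return_counts=True)
x = labels[counts.argmax()]
# Row indices that carry that label
pos = np.flatnonzero(distance[:, -1] == x)
# Relabel the first half of those rows as cluster 2
distance[pos[:len(pos) // 2], -1] = 2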
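
If you want the same idea for any k rather than only for cluster 2, something like the following might work. It is only a sketch: fill_missing_clusters is a name I made up, and it assumes the labels are whole numbers in the range 0 to k-1 (they are cast to int because np.bincount does not accept floats). Note that if the most common label occurs only once, half of its rows rounds down to zero and nothing gets reassigned.

```python
import numpy as np

def fill_missing_clusters(labels, k):
    """Hypothetical helper: return a copy of labels where each missing
    cluster id in 0..k-1 takes over half of the most common cluster's rows."""
    labels = np.asarray(labels).astype(int)  # astype returns a copy
    for c in range(k):
        if c in labels:
            continue  # cluster c already has members
        counts = np.bincount(labels, minlength=k)
        most_common = counts.argmax()
        rows = np.flatnonzero(labels == most_common)
        labels[rows[:len(rows) // 2]] = c  # first half of those rows
    return labels

# Last column of the array from the question
last_column = np.array([1., 0., 1., 1., 1., 0., 0., 1., 1., 1.])
new_labels = fill_missing_clusters(last_column, k=3)
print(new_labels)
```

You could then write the result back with distance[:, -1] = new_labels.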

I hope this helps!

Sridhar Murali