3

Let me explain what I'm trying to do. I have plot of an Image's points/pixels in the RGB space. What I am trying to do is find elongated clusters in this space. I'm fairly new to clustering techniques and maybe I'm not doing things correctly, I'm trying to cluster using MATLAB's inbuilt k-means clustering but it appears as if that is not the best approach in this case.

What I need to do is find "color clusters".

This is what I get after applying K-means on an image. enter image description here

This is how it should look like:

enter image description here

for an image like this:

enter image description here

Can someone tell me where I'm going wrong, and what I can to do improve my results?


Note: Sorry for the low-res images, these are the best I have.

Shai
  • 111,146
  • 38
  • 238
  • 371
ffledgling
  • 11,502
  • 8
  • 47
  • 69

3 Answers3

2

Are you trying to replicate the results of this paper? I would say just do what they did.

However, I will add since there are some issues with the current answers.

1) Yes, your clusters are not spherical- which is an assumption k-means makes. DBSCAN and MeanShift are two more common methods for handling such data, as they can handle non spherical data. However, your data appears to have one large central clump that spreads outwards in a few finite directions.

For DBSCAN, this means it will put everything into one cluster, or everything is its own cluster. As DBSCAN has the assumption of uniform density and requires that clusters be separated by some margin.

MeanShift will likely have difficulty because everything seems to be coming from one central lump - so that will be the area of highest density that the points will shift toward, and converge to one large cluster.

My advice would be to change color spaces. RGB has issues, and it the assumptions most algorithms make will probably not hold up well under it. What clustering algorithm you should be using will then likely change in the different feature space, but hopefully it will make the problem easier to handle.

Community
  • 1
  • 1
Raff.Edward
  • 6,404
  • 24
  • 34
  • That _is_ the paper I am trying to implement as part of another paper, but I'm having serious difficulty understanding what the author(s) are doing. – ffledgling Nov 16 '13 at 22:31
  • What you asked is not the same thing as what is done in that paper. If you have questions or parts you don't understand in the paper - that would be a different set of questions. Knowing background also helps - but the paper is pretty straight forward. While the exact parameter values used is not stated, they give a step by step outline in section 3. – Raff.Edward Nov 16 '13 at 22:37
  • I seem to be mis-interpreting the paper then, I thought section 3 was applied *after* having obtained the orthogonal clusters. – ffledgling Nov 16 '13 at 22:41
  • 3
    You should speak with your advisor, you need help learning how to read papers and understanding the perquisite knowledge. There is nothing about orthogonality in that paper, and no step could be described as finding any components orthogonal to each other. – Raff.Edward Nov 16 '13 at 22:46
  • 1. Correction about orthogonal clusters, I meant clusters approximately orthogonal to spherical shells. (Fig 7.) 2. It was my advisor who told me to use some form of K-means clustering to find the elongated clusters. :-/ – ffledgling Nov 16 '13 at 22:48
  • 1
    Seek a new advisor. I've answered the question that you did ask. If you don't understand wants going on in the paper - you need more than just getting a question or 2 answered. If your advisor thought k-means was going to replicate the paper's results - they either 1) didn't read it, 2) doesn't care, or 3) doesn't know what they are talking about. – Raff.Edward Nov 16 '13 at 22:53
  • Indeed. Thank you for all the info. – ffledgling Nov 16 '13 at 22:54
  • @Raff.Edward I think you are too hard on this guy's supervisor when you actually had no contact with him. It's harsh to speak of a person that way with so little knowledge about the details of his supervision. – Shai Nov 17 '13 at 06:37
  • @Shai Probably. I know only half the info (not even really). If I take his statements at face value - then there is a problem. If his version is not accurate, than that is a different problem. – Raff.Edward Nov 17 '13 at 08:52
  • @Raff.Edward **if** his version is accurate... That is a very big **if** when you know at most half the story. How would you feel if you were this guy's supervisor? I do believe he needs to have more thorough discussions with his professor, but I would give his professor at least some credit. – Shai Nov 17 '13 at 08:58
  • @Raff.Edward and Shai, you're both being a little harsh, let me clarify, Machine Learning and AI is not my advisor's field of specialization and nor is it mine. I have had discussions with my advisor, and he directed me as best as he could, If solving the problem I was facing were as easy as talking to my advisor, believe me I would've done so a long time ago. – ffledgling Nov 17 '13 at 17:59
1

Take a look at density-based clustering algorithms, such as DBSCAN and MeanShift. If you are doing this for segmentation, you might want to add pixel coordinates to your vectors.

Don Reba
  • 13,814
  • 3
  • 48
  • 61
1

k-means basically assumes clusters are approximately spherical. In your case they are definitely NOT. Try fit a Gaussian to each cluster with non-spherical covariance matrix. Basically, you will be following the same expectation-maximization (EM) steps as in k-means with the only exception that you will be modeling and fitting the covariance matrix as well.

Here's an outline for the algorithm

  1. init: assign each point at random to one of k clusters.
  2. For each cluster estimate mean and covariance
  3. For each point estimate its likelihood to belong to each cluster
    note that this likelihood is based not only on the distance to the center (mean) but also on the shape of the cluster as it is encoded by the covariance matrix
  4. repeat stages 2 and 3 until convergence or until exceeded pre-defined number of iterations
Shai
  • 111,146
  • 38
  • 238
  • 371