
I have been reading about Self Organizing Maps, and I understand the algorithm (I think); however, something still eludes me.

How do you interpret the trained network?

How would you then actually use it for, say, a classification task (once you have done the clustering with your training data)?

All of the material I can find (printed and digital) focuses on training the algorithm. I believe I may be missing something crucial.

Regards

Jack H

1 Answer


SOMs are mainly a dimensionality reduction algorithm, not a classification tool. They are used for dimensionality reduction just like PCA and similar methods: once trained, you can check which neuron is activated by your input and use that neuron's position as the reduced value. The only real difference is their ability to preserve the topology of the input in the output representation.

So what a SOM actually produces is a mapping from your input space X to the reduced space Y (the most common choice is a 2d lattice, making Y a 2-dimensional space). To perform actual classification you should transform your data through this mapping and then run some other classification model on top of it (SVM, neural network, decision tree, etc.).
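To make this concrete, here is a minimal from-scratch sketch in NumPy (toy data and grid sizes are made up for illustration, not taken from the question): a small SOM is trained on 5-dimensional points, and each point is then replaced by the 2d grid coordinates of its best-matching unit, which is exactly the reduced representation a downstream classifier would consume.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two 5-dimensional Gaussian blobs (hypothetical example data).
X = np.vstack([rng.normal(0.0, 0.3, (50, 5)),
               rng.normal(1.0, 0.3, (50, 5))])

# A minimal 2d SOM: a (rows x cols) lattice, each node holding a 5-d weight.
rows, cols, dim = 6, 6, X.shape[1]
W = rng.random((rows, cols, dim))
grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                            indexing="ij"), axis=-1)

def winner(x):
    """Best-matching unit: grid coordinates of the closest weight vector."""
    d = np.linalg.norm(W - x, axis=-1)
    return np.unravel_index(np.argmin(d), (rows, cols))

# Train: pull the winner and its lattice neighbours toward each sample,
# with a learning rate and neighbourhood radius that decay over time.
for t in range(2000):
    x = X[rng.integers(len(X))]
    lr = 0.5 * (1 - t / 2000)
    sigma = 3.0 * (1 - t / 2000) + 0.5
    bmu = np.array(winner(x))
    # Gaussian neighbourhood measured on the lattice, not in input space.
    h = np.exp(-np.sum((grid - bmu) ** 2, axis=-1) / (2 * sigma ** 2))
    W += lr * h[..., None] * (x - W)

# The learned mapping X -> Y: each 5-d point becomes a 2d grid coordinate.
Y = np.array([winner(x) for x in X])
print(Y.shape)  # (100, 2)
```

The rows of `Y` (plus the original labels) are what you would feed to the SVM / decision tree / etc. mentioned above.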

In other words, SOMs are used for finding another representation of the data: a representation that is easy for humans to analyze further (as it is mostly 2-dimensional and can be plotted), and very easy for any further classification model. This is a great method for visualizing highly dimensional data and analyzing "what is going on", how classes are grouped geometrically, etc. But they should not be confused with other neural models like artificial neural networks or even growing neural gas (which is a very similar concept, yet one that gives a direct data clustering), as these serve a different purpose.

Of course one can use SOMs directly for classification, but this is a modification of the original idea that requires a different data representation, and in general it does not work as well as using some other classifier on top of it.
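A sketch of that modification, for the curious: if each neuron is labeled by a majority vote over the training points that map to it, the trained SOM itself becomes a (crude) lookup-table classifier. The BMU coordinates and labels below are invented purely for illustration.

```python
from collections import Counter

# Hypothetical output of a trained SOM: for each training point, the
# (row, col) of its best-matching unit, plus that point's class label.
bmus   = [(0, 0), (0, 1), (0, 0), (4, 4), (4, 5), (4, 4), (0, 0)]
labels = ["yes", "yes", "yes", "no", "no", "no", "no"]

# Majority vote per neuron: each neuron inherits the most common label
# among the training points it won.
votes = {}
for pos, lab in zip(bmus, labels):
    votes.setdefault(pos, Counter())[lab] += 1
neuron_label = {pos: c.most_common(1)[0][0] for pos, c in votes.items()}

# A new point is classified by the label of its best-matching unit.
print(neuron_label[(0, 0)])  # 'yes'
print(neuron_label[(4, 4)])  # 'no'
```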

EDIT

There are at least a few ways of visualizing a trained SOM:

  • one can render the SOM's neurons as points in the input space, with edges connecting the topologically close ones (this is possible only if the input space has a small number of dimensions, like 2-3)
  • display data classes on the SOM's topology - if your data is labeled with some numbers {1,...,k}, we can bind k colors to them; for the binary case, let us consider blue and red. Next, for each data point we calculate its corresponding neuron in the SOM and add this label's color to that neuron. Once all data have been processed, we plot the SOM's neurons, each at its original position in the topology, with its color being some aggregate (e.g. the mean) of the colors assigned to it. This approach, if we use some simple topology like a 2d grid, gives us a nice low-dimensional representation of the data. In the following image, the subimages from the third one to the end are the results of such a visualization, where red means label `1` (the "yes" answer) and blue means label `2` (the "no" answer)
  • one can also visualize the inter-neuron distances by calculating how far apart each pair of connected neurons is and plotting it on the SOM's map (the second subimage in the above visualization)
  • one can cluster the neurons' positions with some clustering algorithm (like k-means) and visualize the cluster ids as colors (the first subimage)
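The inter-neuron distances from the second bullet are usually collected into what is called a U-matrix. A small sketch of computing one on a rectangular grid (random weights stand in here for a trained SOM's weights):

```python
import numpy as np

rng = np.random.default_rng(1)
rows, cols, dim = 5, 5, 3
W = rng.random((rows, cols, dim))  # stand-in for trained SOM weights

# U-matrix: for each neuron, the mean input-space distance to its
# 4-connected lattice neighbours.
U = np.zeros((rows, cols))
for i in range(rows):
    for j in range(cols):
        dists = []
        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols:
                dists.append(np.linalg.norm(W[i, j] - W[ni, nj]))
        U[i, j] = np.mean(dists)

# Low values = neurons close to their neighbours (inside a cluster);
# high values = large jumps in input space (cluster borders).
print(U.shape)  # (5, 5)
```

Plotting `U` as a heatmap gives exactly the kind of distance map described above.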

source: Wikipedia

lejlot
  • Thank you for the fantastic answer! The other thing I do not understand is how to visualize the model once trained. – Jack H Aug 15 '13 at 10:18
  • updated answer with some visualization techniques information – lejlot Aug 15 '13 at 10:39
  • @lejlot You say `But they should not be confused with other neural models like artificial neural networks or even growing neural gas `. But when I looked at the algorithms on neural gas and SOM they look so similar. I can't see what is the difference? Is neural gas also a clustering method like SOM and not a direct classification tool? Can you please explain the difference between neural gas and SOM – sam_rox Dec 04 '15 at 10:04
  • Sure, GNG is a modification of SOM idea, where you do not have fixed network, but instead you grow it to your data. Consequently GNG is more a local data clustering method, or more formally - vector quantization. Consequently you do not get planar representation (you often cannot plot GNG, as it has no natural 2d/3d structure). SOM is in fact more related to a PCA (and its great generalization - Principal Manifolds) than neural networks – lejlot Dec 04 '15 at 10:37
  • @lejlot Thanks for the reply.So Neural Gas are similar to SOM, but they do not have a solid lattice structure like SOM and the neurons in neural gas are spread like gas particles. GNG is also unsupervised learning method that is used for clustering than classification.In Neural Gas instead of the neighborhood function used in SOM they use a **distance ranking** right? Is that why you say that GNG is a local data clustering method? Isn't vector quantization used with SOM's also? LVQ is used for fine tuning SOM's so similarly can vector quantization be used in neural gas? – sam_rox Dec 08 '15 at 03:23
  • @lejlot `To perform actual classification you should transform your data through this mapping, and run some other, classificational model `. I read on SOM's and the book said that after clustering using SOM, for classification use a supervised learning scheme and one such scheme is Learning Vector Quantization (LVQ). So the input to LVQ would be SOM and output would be classified data. But since LVQ is supervised we need to give as input **labelled data**. How can SOM produce label data if it doesn't perform classification and only clustering? How can SOM be used with LVQ for classification – clarkson Dec 10 '15 at 05:43
  • How can we map original high dimensional space coordinates into topological coordinates in more precise way then locating closest SOM neuron, i.e. how to get fractional topological coordinates? – Alexey Tigarev Feb 03 '19 at 11:07
  • Wonderful answer! Thank you – Gihan Gamage Dec 28 '19 at 16:51