
I've been using MATLAB's toolbox for self-organizing maps, namely `newsom` and the related family of functions. I'm applying SOM clustering to a large set of documents, and I have used `plotsomhits(net, features)` to visualize how many patterns/documents are assigned to each neuron. However, I cannot find any function in the toolbox that returns those hits in a data structure instead of just visualizing them.

Now I know that I can find the hits myself, picking the neuron that maximizes the negative distance metric for each pattern, in a simple for loop:

 nweights = net.IW{1};         % weight matrix, one row per neuron
 mx = -Inf; winner = 1;
 for i = 1:size(nweights, 1)   % iterate over neurons (rows); length() would return the larger dimension
     distance = negdist(nweights(i, :), pattern);
     if distance > mx          % update index of winner
         mx = distance;
         winner = i;
     end
 end

However, it seems very odd to me that the SOM toolbox offers no function for this, given that a function for visualizing the same results exists.

Does anyone know about this? Also, is there a faster method to find the neuron each pattern is 'assigned' to than the one I am describing above?

VHarisop
  • What are the typical sizes of the various variables used? Any minimal sample data to work with? – Divakar Jan 27 '15 at 14:06
  • Feature dimensions are 8296x1, and I have a set of 500 documents (== 500 features). The SOM size is 10x10. – VHarisop Jan 27 '15 at 14:08
  • `negdist` seems to use some sort of distance calculations. On that I could suggest the matrix-multiplication based techniques, see [here](http://stackoverflow.com/a/23911671/3293881) and [here](http://stackoverflow.com/a/26994722/3293881). – Divakar Jan 27 '15 at 14:16
  • @Divakar thank you, those are very interesting methods. – VHarisop Jan 27 '15 at 14:30

1 Answer


To find the number of hits, you have to use the neural net (net) to get the outputs (y) from all of your inputs (X):

y = net(X);

Each column of y contains a single 1 in the row of the winning neuron, so the "hits" for each neuron can be found simply by summing along the rows:

numhits = sum(y,2);
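
To see why the row sum gives the hit counts, here is a minimal sketch that does not need a trained network. The matrix `y` below is a hand-built stand-in for the output of `net(X)`, and the winner indices are made up for illustration:

```matlab
% Toy stand-in for y = net(X): 4 neurons, 5 patterns.
% The winner indices below are hypothetical, chosen only for illustration.
toyWinners = [2 4 2 1 2];
S = 4;                        % number of neurons
Q = numel(toyWinners);        % number of patterns

% Build the one-hot output: column q has a 1 in row toyWinners(q).
y = zeros(S, Q);
y(sub2ind(size(y), toyWinners, 1:Q)) = 1;

% Summing each row counts the patterns won by that neuron.
numhits = sum(y, 2);          % -> [1; 3; 0; 1]
```

The counts always add up to the number of patterns, which is a cheap sanity check on real data.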

Regarding your question about "finding the neuron that is associated with each pattern": perhaps you are overthinking it? For a single pattern, it seems that you could simply do:

y = net(pattern);
neuronNumber = find(y);
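
The same idea should extend to all patterns at once. The sketch below again uses a hand-built one-hot matrix in place of `net(X)`, so the values are illustrative only. It uses `max` to stay toolbox-free; as far as I know the toolbox's `vec2ind(y)` performs the same conversion.

```matlab
% Toy stand-in for y = net(X): 4 neurons, 5 patterns, one 1 per column.
% The indices used to build it are hypothetical.
y = zeros(4, 5);
y(sub2ind(size(y), [2 4 2 1 2], 1:5)) = 1;

% The row index of the single 1 in each column is the winning neuron.
[~, winners] = max(y, [], 1);   % 1-by-Q row vector, here [2 4 2 1 2]

% Hit counts can then be recovered from the winner list as well.
numhits = accumarray(winners(:), 1, [size(y, 1) 1]);
```

Keeping `winners` around is handy if you later need to list which documents fell into a given neuron, e.g. `find(winners == k)`.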

I hope this makes sense. If I am incorrect here, please provide some clarification and I will adjust my answer.
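
If you would rather compute the winners directly from the weights without the loop, a matrix-multiplication approach along the lines suggested in the comments could look like the sketch below. It assumes the winner is the neuron with the smallest Euclidean distance to the pattern (which is what maximizing `negdist` amounts to), and it uses small random matrices as hypothetical stand-ins for `net.IW{1}` and the document features:

```matlab
% Hypothetical small stand-ins for the real data
% (in the question: 100-by-8296 weights, 8296-by-500 patterns).
rng(0);
W = rand(10, 30);    % S-by-R, plays the role of net.IW{1}
X = rand(30, 8);     % R-by-Q, one pattern per column

% Squared Euclidean distance via |w|^2 + |x|^2 - 2*w'*x, for all pairs at once.
D2 = bsxfun(@plus, sum(W.^2, 2), sum(X.^2, 1)) - 2 * (W * X);   % S-by-Q

% Smallest distance per column = winning neuron per pattern.
[~, winners] = min(D2, [], 1);

% Hit count per neuron.
numhits = accumarray(winners(:), 1, [size(W, 1) 1]);
```

Taking the minimum of the squared distance gives the same winner as the maximum of the negative distance, so the square root can be skipped.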

Brian Goodwin