
I was working in Webots, which is an environment used to model, program and simulate mobile robots. Basically, I have a small robot with a VGA camera; it looks for simple blue-coloured patterns on the white walls of a small Lego maze and moves accordingly.

The method I used here was:

  • Obtain images of the patterns from Webots and save them to a location on the PC.
  • Detect the blue pattern and form a square enclosing it, with at least 2 edges of the pattern being part of the boundary of the square.
  • Resize it to a 7x7 matrix (using the nearest-neighbour interpolation algorithm).

  • The input to the network is the red-channel intensity of each pixel of the 7x7 image (when I look at a blue pixel through a red filter it appears black). The intensity of each pixel is extracted, and the 7x7 matrix is then converted to a 1D vector, i.e. 1x49, which is my input to the neural network. (I chose this characteristic as my input because it is 'relatively' less difficult to access this information using C and Webots.) A sketch of this pipeline follows the list.
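
Roughly, the extraction looks like this (a sketch only; the file name and the blue/red thresholds are placeholders rather than values from my actual code, and imresize needs the Image Processing Toolbox):

    % Sketch of the feature extraction steps listed above.
    img = imread('pattern.png');          % image saved from Webots
    R = img(:,:,1);                       % red channel
    B = img(:,:,3);                       % blue channel

    % A blue pattern on a white wall has high blue and low red values.
    mask = (B > 150) & (R < 100);         % hypothetical thresholds

    % Square crop whose boundary touches the pattern's extremes.
    [rows, cols] = find(mask);
    r1 = min(rows);  c1 = min(cols);
    side = max(max(rows) - r1, max(cols) - c1);
    r2 = min(r1 + side, size(R, 1));      % clamp to the image border
    c2 = min(c1 + side, size(R, 2));
    crop = R(r1:r2, c1:c2);               % red channel: the pattern is dark

    % Downsample to 7x7 with nearest-neighbour interpolation and
    % flatten to the 1x49 input vector.
    small = imresize(crop, [7 7], 'nearest');
    x = double(small(:))' / 255;          % row vector, scaled to [0, 1]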

I used MATLAB for this offline training method, with a slow learning rate (0.06) to ensure parameter convergence, and tested it on large and small datasets (1189 and 346 samples respectively). On all the numerous occasions I have tried, the network fails to classify the pattern (it says the pattern belongs to all 4 classes!). There is nothing wrong with the program itself, as I tested it on the simpleclass_dataset in MATLAB and it works almost perfectly.
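
For reference, this is the kind of training loop I am running (a simplified sketch, not my exact code; the variable names X (N x 49 inputs) and T (N x 4 one-hot targets), the hidden-layer size, and the epoch count are all assumptions):

    eta = 0.06;                            % the learning rate mentioned above
    H = 10;                                % hypothetical hidden-layer size
    W1 = 0.1 * randn(49, H);  b1 = zeros(1, H);
    W2 = 0.1 * randn(H, 4);   b2 = zeros(1, 4);
    sig = @(z) 1 ./ (1 + exp(-z));         % logistic activation

    for epoch = 1:500
        for i = randperm(size(X, 1))       % shuffle the presentation order
            xi = X(i,:);  t = T(i,:);
            h = sig(xi * W1 + b1);         % forward pass
            y = sig(h * W2 + b2);
            d2 = (y - t) .* y .* (1 - y);  % output delta (squared error)
            d1 = (d2 * W2') .* h .* (1 - h);
            W2 = W2 - eta * h'  * d2;  b2 = b2 - eta * d2;
            W1 = W1 - eta * xi' * d1;  b1 = b1 - eta * d1;
        end
    end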

Is it possible that the neural network fails to learn the function because of really poor data? (By poor data I mean that the data points corresponding to one sample of one class are very close to a sample belonging to a different class, or something of that sort.) Or can the neural network fail because of very poor feature descriptors?
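
One rough way I could test the first concern is to compare within-class and between-class distances of the 1x49 feature vectors (a sketch; X and labels are assumed variable names for the feature matrix and the class ids):

    N = size(X, 1);
    within = [];  between = [];
    for i = 1:N-1
        for j = i+1:N
            d = norm(X(i,:) - X(j,:));     % Euclidean distance between samples
            if labels(i) == labels(j)
                within(end+1) = d;
            else
                between(end+1) = d;
            end
        end
    end
    fprintf('mean within-class distance:  %.3f\n', mean(within));
    fprintf('mean between-class distance: %.3f\n', mean(between));
    % If the two means are similar, the classes overlap in feature
    % space and no classifier will separate them reliably.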

Can anyone suggest a simpler method to extract features from the image? (I am now shifting to MATLAB, as I am only concerned with simulations in Webots and not with the real robot.) What sort of features can I choose? The patterns are very simple: an L, an inverted L, and their reflected versions are the 4 patterns.

gautam264
  • Hi, I did a Google search and found this: http://www.mathworks.co.uk/help/nnet/examples/character-recognition.html. I suppose following that tutorial to the letter would be a suitable starting point. – QED Aug 08 '14 at 22:16
  • Hi, also check the perceptron examples here to help answer your questions on noisy (bad) data and class separability: http://www.mathworks.co.uk/help/nnet/examples/index.html#perceptrons. Both are fundamental problems for machine learning in general. – QED Aug 08 '14 at 22:22
  • Thank you, QED. I have my own MATLAB code for backpropagation, and I am trying not to use the Neural Network Toolbox, as I have to implement this in another piece of software (Webots). Also, any tips on what features I can extract from the patterns to feed into the network as input? – gautam264 Aug 09 '14 at 15:28
  • Well, there are many hand-crafted features in use in vision, but the ANN folk usually work with raw pixels, and with only 7x7 patches there is not an awful lot you can do other than raw intensities or a local binary pattern of some sort. Perhaps there is an issue with normalization: http://www.mathworks.co.uk/help/nnet/ug/choose-neural-network-input-output-processing-functions.html (a minimal normalization sketch follows this thread). Also, maybe just play with binary classification of two patterns first, to reduce the complexity a bit. – QED Aug 09 '14 at 16:20
  • TBH, I usually work with SVMs, so I don't have too much experience with ANNs, but similar problems arise. Normalization can be a big issue; so can dimensionality (too small, too big), as well as the problems you mentioned. Some guides: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf, http://leon.bottou.org/publications/pdf/tricks-2012.pdf – QED Aug 09 '14 at 16:20
  • As you suggested, I used just 2 output classes and the training data corresponding to those 2 classes alone. I observed the following: if I present, say, 80 patterns belonging to class 1 followed by 60 patterns belonging to class 2, the network learns the second pattern and unlearns the first. While testing, no matter what I give as input, the network outputs the 2nd class (which figures, as it has unlearnt the first pattern). If I randomise the order and present the patterns to the network, the outputs are like (0.54, 0.45), which doesn't make sense either. – gautam264 Aug 09 '14 at 18:31
  • I think you need to try your patterns in another system; that way you can see whether it is your learning code or the patterns that is the problem. Try, say, libsvm, svmLight, or liblinear, just to see what happens. – QED Aug 09 '14 at 19:03
  • Hello. I reduced the input dimensionality by changing the feature descriptors, and the network works. Pixel intensity values probably weren't the way to go in the first place. Thank you, QED, for your suggestions and advice :) – gautam264 Aug 10 '14 at 18:34
  • Cool. Perhaps you could answer your own question with your solution, so that others can benefit from your findings. – QED Aug 11 '14 at 14:20
  • I cannot make a definitive and conclusive statement that a neural network can fail to learn a function. But in the experiment that I ran, I found that if the data points corresponding to different classes are very close to each other, then the network fails to learn the function satisfactorily. – gautam264 Aug 12 '14 at 14:47
  • Regarding the choice of feature descriptors, I learnt that you choose a particular set of features to describe a pattern depending on how different the various patterns available for classification are. Using raw pixel intensities as the input is a poor choice and should be a last resort (and even then it does not guarantee a successful result). – gautam264 Aug 12 '14 at 14:51
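
Regarding the normalization point raised in the thread above, a minimal per-feature min-max scaling looks like this (a sketch; X is the N x 49 training matrix, and the same xmin/xmax must be reused on the test data):

    xmin = min(X, [], 1);                  % per-feature minima
    xmax = max(X, [], 1);                  % per-feature maxima
    span = max(xmax - xmin, eps);          % avoid division by zero
    Xn = (X - repmat(xmin, size(X, 1), 1)) ./ repmat(span, size(X, 1), 1);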

1 Answer


Neural networks CAN fail to learn a function; this is most often caused by employing a network topology which is too simple to model the necessary function. A classic example of this case is attempting to learn an XOR function using a perceptron classifier, although it can happen even in multilayer neural nets, especially for complex tasks like image recognition. See my previous answer for a rough guide on how to select neural network parameters (ignore the convolution material if you want, although I would highly recommend looking into convolutional neural networks if you are still having problems).
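
To see the XOR point concretely, here is a quick MATLAB sketch (my own illustration, not from the linked answer) of a single-layer perceptron failing on XOR no matter how long it trains:

    X = [0 0; 0 1; 1 0; 1 1];
    t = [0; 1; 1; 0];                      % XOR truth table
    w = zeros(2, 1);  b = 0;
    for epoch = 1:1000
        for i = 1:4
            y = double(X(i,:) * w + b >= 0);   % hard-threshold unit
            err = t(i) - y;
            w = w + err * X(i,:)';             % perceptron update rule
            b = b + err;
        end
    end
    disp([t, double(X * w + b >= 0)])      % at least one row always disagrees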

It is a possibility that there is too little separability between the classes, although I doubt that this is the case given your current features. Is there a reason that your network needs to allow an image to belong to all four classes simultaneously? If not, then perhaps you could classify the input as the class with the highest activation, instead of all those with high activations.
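
Concretely, that "highest activation wins" rule is a one-liner (here y stands for your network's 1x4 output vector; the name is assumed):

    [~, predictedClass] = max(y);          % index of the most active output unit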

Hungry