
My program in Octave uses a neural network to recognize handwritten digits. The problem is that it does not recognize a digit correctly when the colors are inverted. For example:

Six grayscale image

Six grayscale image (inverted colors)

The images above contain the same digit with the same pattern, but with inverted colors.

I am already converting RGB to grayscale. How can I overcome this problem? Is there a better option than adding separate training examples with inverted colors?

James Z
HN Learner
    Does your training data have data with inverted colors? CNNs will not learn representations that are invariant to color. – rafaelvalle Jun 23 '17 at 08:01
  • No, I didn't include training examples for inverted colors. Yes, that is a solution, but I was wondering if there is something better to solve this problem! – HN Learner Jun 23 '17 at 08:06
  • One of the advantages of Neural Networks is that they can find the best "features" for the DATA AND TASK at hand. They can, for example, learn HOG descriptors. https://www.quora.com/How-do-I-use-deep-CNN-to-learn-HOG-descriptors Take a look at this IBM post about Deep Learning and feature engineering https://www.ibm.com/developerworks/community/blogs/jfp/entry/Feature_Engineering_For_Deep_Learning?lang=en – rafaelvalle Jun 23 '17 at 17:41
  • I have a similar issue. – Soerendip Jan 09 '18 at 11:52

2 Answers


Edge Extraction

If you extract the edges from your images, you'll see that the result is largely invariant to color inversion: both versions of your image look almost identical after the transformation.

Below I show how the image looks when you extract the edges using Laplacian edge detection, for both the "white on black" and "black on white" images:

[Laplacian edge images: white-on-black and black-on-white versions]

The idea is to train your network on the edges, to gain some invariance with regard to the variation you described.

Here are some MATLAB/Octave resources for edge extraction:

  • https://mathworks.com/discovery/edge-detection.html
  • https://octave.sourceforge.io/image/function/edge.html

I've done the edge extraction using Python and OpenCV with edges_image = cv2.Laplacian(original_image, cv2.CV_64F). I may post a MATLAB/Octave sample if I can fix my install :)
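The same idea can be sketched without OpenCV. Below is a minimal NumPy stand-in for cv2.Laplacian (a plain 5-point discrete Laplacian, applied to a made-up toy image) that demonstrates why this works: inverting the image only flips the sign of the Laplacian, so the edge magnitudes are identical for both polarities.

```python
import numpy as np

def laplacian(img):
    """Discrete 5-point Laplacian of a 2-D grayscale image (interior pixels only)."""
    img = img.astype(np.float64)
    return (img[:-2, 1:-1] + img[2:, 1:-1]
            + img[1:-1, :-2] + img[1:-1, 2:]
            - 4.0 * img[1:-1, 1:-1])

# toy "digit": a white stroke on a black background
img = np.zeros((8, 8), dtype=np.uint8)
img[2:6, 3:5] = 255
inverted = 255 - img                 # black stroke on white background

edges = np.abs(laplacian(img))
edges_inv = np.abs(laplacian(inverted))

# Laplacian(255 - I) == -Laplacian(I), so the magnitudes match exactly
assert np.array_equal(edges, edges_inv)
```

Training on such edge maps instead of raw pixels gives the network the invariance described above.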

Detect dominant color and invert if required

Another way would be to settle on one canonical version; let's say you've trained the network on the "black text on white background" variant.

Now when you input the image, first detect if the dominant color / background is black or white, then invert if required.
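As a sketch of that check (in NumPy rather than Octave, and assuming the background occupies most of the image so the mean intensity tracks it), one can compare the mean against the midpoint and invert when the background is bright. The function name normalize_polarity is made up for illustration:

```python
import numpy as np

def normalize_polarity(gray):
    """Return the image as white-on-black; invert it if the background is white.
    Assumes the background dominates the pixel count, so the mean tracks it."""
    gray = np.asarray(gray, dtype=np.uint8)
    if gray.mean() > 127:            # bright overall -> white background
        return 255 - gray
    return gray

# white digit on black stays as-is; the inverted copy gets flipped back
img = np.zeros((8, 8), dtype=np.uint8)
img[2:6, 3:5] = 255
assert np.array_equal(normalize_polarity(img), img)
assert np.array_equal(normalize_polarity(255 - img), img)
```

In Octave the same check is roughly `if mean(img(:)) > 127, img = 255 - img; end`.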

bakkal
  • Thanks for the answer, Sir. The second method, "Detect Dominant Color", came to my mind, but I was not sure how to invert colors using Octave/MATLAB. However, the first method of "Edge Extraction" seems more interesting. Can you please describe which of these two methods is more efficient and accurate? – HN Learner Jun 23 '17 at 09:22
  • @HNLearner I'd go with edge or any other feature extraction, as that's how a deep end-to-end solution would operate, and they give good results; e.g. if you look at a convolutional neural network, you'll see that its first stages try to learn features that look like edges. – bakkal Jun 23 '17 at 09:58
  • @HNLearner If you'd like, you can tell me your network configuration, e.g. the number of hidden layers and the number of neurons in each layer. It should be easy enough to test this out. Drop me an email if you wish. – bakkal Jun 23 '17 at 10:08
  • Thanks a lot for your help Sir. I am beginner in Machine Learning and just started Neural Networks some time ago. I am using only one hidden layer. – HN Learner Jun 23 '17 at 10:30
  • @HNLearner Note that edge detection algorithms are usually a combination of applying (convolving) a square Gaussian filter to blur the image and then some other filter, e.g. Laplacian or Sobel, to find the edges. https://www.youtube.com/watch?v=uihBwtPIBxM – rafaelvalle Jun 23 '17 at 17:51
  • Thanks @bakkal I will have a look :) – HN Learner Jun 23 '17 at 18:24

Feature Extraction

To generalise @bakkal's suggestion of using edges, one can extract many types of image features: edges, corners, blobs, ridges, etc. There is actually a page on mathworks with a few examples, including number recognition using HOG (histogram of oriented gradients) features.

Such techniques should work for more complex images too, because edges are not always the best features. Here is the result of extracting HOG features from both of your images using MATLAB's extractHOGFeatures:

[HOG feature visualisations for both images]

I believe you can use vlfeat for HOG features if you have Octave instead.
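To see why HOG-style features tolerate color inversion: inverting the image flips the sign of every gradient, but HOG bins *unsigned* orientations (folded into [0, 180) degrees), so the histograms come out identical. Below is a minimal NumPy sketch of a single-cell orientation histogram on a made-up toy image (not MATLAB's extractHOGFeatures, just an illustration of the principle):

```python
import numpy as np

def orientation_histogram(gray, nbins=9):
    """HOG-style unsigned gradient-orientation histogram (one cell, whole image).
    Orientations are folded into [0, 180) degrees and weighted by magnitude."""
    g = gray.astype(np.float64)
    gx = g[1:-1, 2:] - g[1:-1, :-2]              # central differences, x
    gy = g[2:, 1:-1] - g[:-2, 1:-1]              # central differences, y
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    hist, _ = np.histogram(ang, bins=nbins, range=(0, 180), weights=mag)
    return hist

# toy "digit" and its color-inverted copy
img = np.zeros((16, 16), dtype=np.uint8)
img[4:12, 6:10] = 255
h1 = orientation_histogram(img)
h2 = orientation_histogram(255 - img)

# inversion negates the gradients, but folding to [0, 180) cancels that out
assert np.allclose(h1, h2)
```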

Another important thing to keep in mind is that you want all images to have the same size. I have resized both of your images to be 500x500, but this is arbitrary.

The code to generate the image above:

close all; clear; clc;

% reading in
img1 = rgb2gray(imread('img1.png'));
img2 = rgb2gray(imread('img2.png'));

img_size = [500 500];

% all images should have the same size
img1_resized = imresize(img1, img_size);
img2_resized = imresize(img2, img_size);

% extracting features
[hog1, vis1] = extractHOGFeatures(img1_resized);
[hog2, vis2] = extractHOGFeatures(img2_resized);

% plotting
figure(1);
subplot(1, 2, 1);
plot(vis1);
subplot(1, 2, 2);
plot(vis2);

You do not have to be limited to HOG features. One can also quickly try SURF features:

[SURF features plotted on both images]

Again, the color inversion does not matter because the features match. But you can see that HOG features are probably a better choice here, because the 20 plotted points/blobs do not really represent the digit 6 that well. The code to get the above in MATLAB:

% extracting SURF features
points1 = detectSURFFeatures(img1_resized);
points2 = detectSURFFeatures(img2_resized);

% plotting SURF Features
figure(2);
subplot(1, 2, 1);
imshow(img1_resized);
hold on;
plot(points1.selectStrongest(20));
hold off;
subplot(1, 2, 2);
imshow(img2_resized);
hold on;
plot(points2.selectStrongest(20));
hold off;

To summarise, depending on the problem, you can choose different types of features. Most of the time, choosing raw pixel values is not good enough, as you saw from your own experience, unless you have a very large dataset covering all possible cases.

Vahe Tshitoyan