Recognize Non-Digits

Question

I've programmed a neural network for recognizing single digits which are pushed to my server. It worked pretty well until customers started to push "empty digits". At first I just iterated over them manually, checking for pixels that are NOT WHITE. Now it gets even more complicated, as "dirty" blanks are uploaded which have some noise in them. Additionally, some people started to push diagonal and horizontal lines or X's instead of writing 0 (zero).

I wonder how I am supposed to train a "pre"-neural network which classifies these "not digits". In particular, I struggle to find a way of training on the zeroes depicted by a noisy blank.

score 1 · Accepted Answer · edited May 23 '17 at 12:31

You're using a neural network, so what I'd suggest is to make the network output the probability of the input belonging to each class, e.g. it may output digit 5 with 75% certainty.

Once you have those probabilities, you can work on finding a "cutoff" value below which you'd treat the input as just noise or empty.

(I've linked above to a question about getting the classification probabilities out of a NN.)
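To illustrate the cutoff idea, here is a minimal sketch in plain Python. It assumes the network exposes ten raw output scores (one per digit); the function names and the cutoff of 0.6 are hypothetical and would need tuning on real uploads. Note that a network can still be confidently wrong on inputs unlike anything it was trained on, so the cutoff alone may not catch every non-digit.

```python
import math

def softmax(logits):
    """Turn raw network outputs into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_reject(logits, cutoff=0.6):
    """Return (digit, confidence), or (None, confidence) if below the cutoff.

    `cutoff` is an assumed placeholder value, not a recommended setting.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence < cutoff:
        return None, confidence  # treat as noise / empty
    return best, confidence
```

With ten equal scores every class gets probability 0.1, so the input is rejected; a single dominant score passes the cutoff and is returned as the digit.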

score 0 · Answer 2 · answered Dec 26 '15 at 20:39


You may create a training set of the bad inputs and train a network with one additional class for those bad examples.
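One possible way to build such a set, sketched under assumptions: digits keep labels 0-9, a hypothetical label 10 marks bad inputs, images are 28x28 grids of floats where 0.0 is white and 1.0 is full ink, and noisy blanks are synthesized by sprinkling random speckles. Real rejected uploads would be better training data; the synthetic ones are only a stand-in.

```python
import random

REJECT_CLASS = 10  # hypothetical label: digits use 0-9, bad inputs use 10

def noisy_blank(width=28, height=28, noise_fraction=0.02, rng=random):
    """Make a white image (0.0) with a few random dark speckles."""
    image = [[0.0] * width for _ in range(height)]
    for _ in range(int(width * height * noise_fraction)):
        y = rng.randrange(height)
        x = rng.randrange(width)
        image[y][x] = rng.uniform(0.3, 1.0)  # speckle intensity (assumed range)
    return image

def add_reject_examples(images, labels, count, seed=0):
    """Return new lists with `count` synthetic blanks appended as class 10."""
    rng = random.Random(seed)
    extra = [noisy_blank(rng=rng) for _ in range(count)]
    return images + extra, labels + [REJECT_CLASS] * count
```

The network then needs eleven outputs instead of ten, and lines or X's could be added to the same extra class in a similar way.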

answered Dec 26 '15 at 20:39

Amir


I thought about this, yet I cannot think of a solution for training a network to recognize noisy "empty" boxes. How do I get the data? – Nex Dec 27 '15 at 21:30
Well, I believe in such cases you should use heuristics. The problem space is so huge that you cannot have a class for every possible combination. One heuristic for the empty boxes could be to sum up the pixel values and ignore the input if the sum is below a threshold. The threshold could be determined through some empirical analysis on the "real" data. – Amir Dec 27 '15 at 21:39
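The pixel-sum heuristic from the comment above could look roughly like this. It is a sketch that assumes images are lists of rows of floats with 0.0 for white and 1.0 for full ink; `pick_threshold` is a hypothetical helper that takes the midpoint between the inkiest known blank and the lightest known digit, which only makes sense if the two groups do not overlap.

```python
def ink_amount(image):
    """Sum of all pixel values; higher means more ink on the page."""
    return sum(sum(row) for row in image)

def is_blank(image, threshold):
    """Treat the image as empty if its total ink is below the threshold."""
    return ink_amount(image) < threshold

def pick_threshold(blank_images, digit_images):
    """Midpoint between the inkiest blank and the lightest digit (assumed rule)."""
    max_blank = max(ink_amount(img) for img in blank_images)
    min_digit = min(ink_amount(img) for img in digit_images)
    return (max_blank + min_digit) / 2
```

Running `is_blank` before the digit network filters out noisy blanks cheaply, but it will not catch lines or X's, which carry as much ink as a real digit.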
