
I know this could be done by applying a threshold and other techniques in OpenCV, but in that case a different threshold value might be required for each image.

I tried an encoder-decoder neural network. I noticed that it worked for images in which the characters were large, i.e. each image had around 4 characters, but as I increased their number (and thus reduced the size of each character) the results became very bad.

I need a very accurate output so that each character can then be fed to the recognition model.

Actual Input image

Gray Scale as input image to the Neural Network

Target/Ground Truth(output image)

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import (Conv2D, Dropout, Lambda,
                                         MaxPooling2D, UpSampling2D)

    def encoder_decoder(input_shape):
        # Note: the four 2x2 poolings require both spatial dimensions
        # of input_shape to be divisible by 16, otherwise the upsampled
        # output is smaller than the input.
        return Sequential([
            # Scale pixel values from [0, 255] to [-1, 1]
            Lambda(lambda x: x / 127.5 - 1.0, input_shape=input_shape),
            Conv2D(8, (3, 3), activation='relu', padding='same'),
            Conv2D(8, (3, 3), activation='relu', padding='same'),
            MaxPooling2D((2, 2), strides=(2, 2)),
            Dropout(0.2),
            Conv2D(16, (3, 3), activation='relu', padding='same'),
            Conv2D(16, (3, 3), activation='relu', padding='same'),
            MaxPooling2D((2, 2), strides=(2, 2)),
            Dropout(0.2),
            Conv2D(32, (3, 3), activation='relu', padding='same'),
            Conv2D(32, (3, 3), activation='relu', padding='same'),
            MaxPooling2D((2, 2), strides=(2, 2)),
            Dropout(0.2),
            Conv2D(64, (3, 3), activation='relu', padding='same'),
            Conv2D(64, (3, 3), activation='relu', padding='same'),
            MaxPooling2D((2, 2), strides=(2, 2)),
            Dropout(0.2),
            Conv2D(128, (3, 3), activation='relu', padding='same'),
            Conv2D(128, (3, 3), activation='relu', padding='same'),
            UpSampling2D(size=(2, 2)),
            Dropout(0.2),
            Conv2D(64, (3, 3), activation='relu', padding='same'),
            Conv2D(64, (3, 3), activation='relu', padding='same'),
            UpSampling2D(size=(2, 2)),
            Dropout(0.2),
            Conv2D(32, (3, 3), activation='relu', padding='same'),
            Conv2D(32, (3, 3), activation='relu', padding='same'),
            UpSampling2D(size=(2, 2)),
            Dropout(0.2),
            Conv2D(16, (3, 3), activation='relu', padding='same'),
            Conv2D(16, (3, 3), activation='relu', padding='same'),
            UpSampling2D(size=(2, 2)),
            Dropout(0.2),
            Conv2D(8, (3, 3), activation='relu', padding='same'),
            Conv2D(8, (3, 3), activation='relu', padding='same'),
            Conv2D(1, (1, 1), activation='relu')
        ])

The background also contains lines (to be removed as well), which are not exactly the same in different images. They are not exactly parallel either, so FFT-based removal cannot be applied. The handwriting comes from different people.

I trained the neural network on a small set of images (70 to 100) to test until it overfits. The input images are first converted to grayscale, i.e. the whole process takes place in grayscale. The image size is 800x600; since I later need to draw contours and apply a classification algorithm, I cannot use a smaller output.
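A side note on the 800x600 size: the network above halves the spatial dimensions four times (2^4 = 16), and 600 is not divisible by 16, so the decoder would produce 592 rows instead of 600. A minimal sketch (my own helper, not from the question) that pads the input to the next multiple of 16:

```python
import numpy as np

def pad_to_multiple(img, m=16):
    """Pad a grayscale image so both dimensions are divisible by m.

    With four 2x2 max-pooling stages the size is halved four times,
    so inputs must be multiples of 2**4 = 16; 600 is not, which would
    make the upsampled output 592 pixels tall instead of 600.
    """
    h, w = img.shape[:2]
    ph = (-h) % m          # rows to add
    pw = (-w) % m          # columns to add
    return np.pad(img, ((0, ph), (0, pw)), mode="edge")

img = np.zeros((600, 800), dtype=np.uint8)
padded = pad_to_multiple(img)
# padded.shape == (608, 800); both dimensions divisible by 16
```

After prediction, the padding can simply be cropped off again.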

Earlier I trained on clean images containing fewer characters per image (ignore that the background is '0' here and '255' in the next example):

Input image(grey scale)

Obtained Output

Later, on this kind of image:

INPUT IMAGE

INPUT IMAGE to NN(grey scale)

OUTPUT OF NN

Is it possible to reconstruct such small size characters/digits?

How should I improve its performance? Should the improvement be in the architecture, the size of the dataset, the resolution, or something else?

And if there are other, better approaches, I would be glad to know.

zen_ksri
  • https://stackoverflow.com/questions/50792812/how-to-remove-watermark-background-in-image-python – Joe May 06 '20 at 11:16

1 Answer


You need data to train your network: pairs of before and after images, so the network learns what result you expect.

That's why you need to "apply a threshold and some other techniques using OpenCV".

One way is to use techniques for removing watermarks from images, as described in "Removing watermark out of an image using OpenCV".

Remove the background and train the network with the images.

Another approach is to create synthetic images to train the network on. One advantage here is that you have the resulting image beforehand.

  • Take an empty background
  • Use some font to draw on it (after, synthetic image)
  • Draw the same thing on a white background (before)

Train the network with these images. Another advantage of this approach is that you can train your network on different fonts, font sizes, and backgrounds.

Joe
  • Thank you so much @Joe. This will certainly save my effort and time(especially the alternative approach). Sorry if I did not make you understand what exactly I was asking. I have edited the question which was specific about the faults in the training part. – zen_ksri May 06 '20 at 17:57