
I am working on text recognition on tires. In order to use OCR, I first need a clear binary map.

I have processed the images, and the text appears with broken, discontinuous edges. I have tried standard erosion/dilation with circular-disc and line structuring elements in MATLAB, but it does not really help.
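For reference, a minimal MATLAB sketch of the kind of morphology I tried (the file name and element sizes here are just examples):

I = imread('tire.png');                        % placeholder file name
if size(I,3) == 3, I = rgb2gray(I); end        % ensure grayscale
bw = im2bw(I, graythresh(I));                  % global Otsu binarization
closedDisk = imclose(bw, strel('disk', 3));    % circular disc element
closedLine = imclose(bw, strel('line', 7, 0)); % horizontal line element
imshowpair(closedDisk, closedLine, 'montage'); % compare the two closings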

Pr1 - Any ideas on how to reconstruct these characters and fill the gaps between the strokes of the characters?

[images: original high-res, original low-res, Canny edge detection]

Pr2 - The images above are high resolution and under good illumination. However, if the illumination is poor and the resolution comparatively low, as in the image below, what would be viable options for processing?

[image: low-resolution example under poor illumination]

Solutions tried:

S1: This is the result of applying a median filter to the processed image shared by Spektre. To remove noise I applied a 5x5 median filter and subsequently a dilation with a line element (5,11). Even now the OCR (MATLAB R2014b) can recognize only some of the characters.
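In code, S1 was roughly the following (the file name is a placeholder, and I read the line element (5,11) as length and angle):

I = imread('spektre_result.png');         % placeholder for Spektre's processed image
if size(I,3) == 3, I = rgb2gray(I); end   % ensure grayscale
I  = medfilt2(I, [5 5]);                  % 5x5 median filter to remove noise
bw = im2bw(I, graythresh(I));             % binarize before dilation
bw = imdilate(bw, strel('line', 5, 11));  % line element (length 5, angle 11 deg)
results = ocr(bw);                        % OCR from the Computer Vision Toolbox (R2014b)
disp(results.Text);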

Anyway, thanks a lot for the suggestions so far. I will still wait to see if someone can suggest something different, perhaps thinking outside the box :).

[image: result of S1]

Results of my MATLAB implementation of the steps from Spektre's code below (without stroke dilation), with normalization using the corners in the order 1,2,3,4:

[image: normalization results with corner order 1,2,3,4]

and with thresholds tr0=400 and tr1=180 and corner order 1,3,2,4 for the normalization:

[image: normalization results with corner order 1,3,2,4]

Best Regards

Wajahat


2 Answers


I have played a bit with your input.

Normalization of lighting plus dynamic-range normalization helps to obtain much better results, but they are still far from what is needed. When I find the time (not sure when, maybe tomorrow) I would like to try sharpening the partial derivatives to boost the letters against the background and thresholding out small bumps before integrating back and recoloring to a mask image; I will then edit this answer (and comment/notify you).

normalized lighting

compute the average corner intensities and bilinearly rescale the image intensities to match the average color

[image: normalized lighting result]
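I do not use MATLAB myself, but the normalization should translate roughly to this untested sketch (mean2 is from the Image Processing Toolbox; img is a grayscale image as double, sz the corner square size):

function out = normalize_lighting(img, sz)
    % corner-based bilinear lighting normalization (untested sketch)
    [ys, xs] = size(img);
    c00 = mean2(img(1:sz,       1:sz      )); % top-left corner average
    c01 = mean2(img(1:sz,       xs-sz+1:xs)); % top-right
    c10 = mean2(img(ys-sz+1:ys, 1:sz      )); % bottom-left
    c11 = mean2(img(ys-sz+1:ys, xs-sz+1:xs)); % bottom-right
    cavg = (c00 + c01 + c10 + c11) / 4;
    [X, Y] = meshgrid((0:xs-1)/xs, (0:ys-1)/ys);
    c0  = c00 + (c01 - c00).*X;               % interpolate along x (top edge)
    c1  = c10 + (c11 - c10).*X;               % interpolate along x (bottom edge)
    c   = c0  + (c1  - c0 ).*Y;               % interpolate along y
    out = img .* (cavg ./ max(c, eps));       % rescale to the average color
end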

if you need something more sophisticated, see:

edge detection

sum of the absolute partial derivatives of the intensity i along x and y:

  • e(x,y) = |di(x,y)/dx| + |di(x,y)/dy|

and then thresholded with threshold = 13

[image: edge detection result]

[notes]

To eliminate most of the noise I applied smooth filtering before the edge detection.
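Again untested, but in MATLAB the whole edge-detect step could look like this (the threshold 13 matches my intensity scale <0,765> and will likely need retuning):

I = double(imread('tire_gray.png'));   % placeholder file name, grayscale image
for k = 1:5
    I = conv2(I, ones(3)/9, 'same');   % simple 3x3 box smoothing, applied 5 times
end
[gx, gy] = gradient(I);                % partial derivatives along x and y
e = abs(gx) + abs(gy);                 % e(x,y) = |di/dx| + |di/dy|
imshow(e > 13);                        % threshold = 13 (tuned by hand)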

[edit1] After some analysis I found that your image has edges too poor for the sharpening/integration approach.

Here is an example intensity graph of the first derivative along x, taken on the middle line of the image:

[image: intensity graph showing poor edges]

As you can see, the black areas are fine, but the white-ish ones are almost indistinguishable from the background noise. So your only hope is to use the min/max filtering that @Daniel's answer suggests, and to put more weight on the black edge regions (the white ones are not reliable).

[image: min/max filter result]

The min/max filter emphasizes the black (blue mask) and white (red mask) regions. If both areas were reliable, you could just fill the space between them, but that is not an option in your case. Instead I would enlarge the areas (weighting the blue mask more) and OCR the result with an OCR engine customized for such 3-color input.

You could also take 2 images with different light positions and a fixed camera and combine them to cover the recognizable black area from all sides.

[edit2] C++ source code for the last method

//---------------------------------------------------------------------------
typedef union { int dd; short int dw[2]; byte db[4]; } color;
picture pic0,pic1,pic2; // pic0 source image,pic1 normalized+min/max,pic2 enlarge filter
//---------------------------------------------------------------------------
void filter()
    {
    int sz=16;          // [pixels] square size for corner avg color computation (c00..c11)
    int fs0=5;          // blue [pixels] font thickness
    int fs1=2;          // red  [pixels] font thickness
    int tr0=320;        // blue min threshold
    int tr1=125;        // red  max threshold

    int x,y,c,cavg,cmin,cmax;
    pic1=pic0;          // copy source image
    pic1.rgb2i();       // convert to grayscale intensity

    for (x=0;x<5;x++) pic1.ui_smooth(); // smooth 5x to suppress noise
    cavg=pic1.ui_normalize(sz);         // normalize lighting (corner square size sz)

    // min max filter
    cmin=pic1.p[0][0].dd; cmax=cmin;
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
        if (cmin>c) cmin=c;
        if (cmax<c) cmax=c;
        }
    // threshold min/max
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
             if (cmax-c<tr1) c=0x00FF0000; // red
        else if (c-cmin<tr0) c=0x000000FF; // blue
        else                 c=0x00000000; // black
        pic1.p[y][x].dd=c;
        }
    pic1.rgb_smooth();  // remove single dots

    // recolor image
    pic2=pic1; pic2.clear(0);
    pic2.bmp->Canvas->Pen  ->Color=clWhite;
    pic2.bmp->Canvas->Brush->Color=clWhite;
    for (y=0;y<pic1.ys;y++)
     for (x=0;x<pic1.xs;x++)
        {
        c=pic1.p[y][x].dd;
        if (c==0x00FF0000)
            {
            pic2.bmp->Canvas->Pen  ->Color=clRed;
            pic2.bmp->Canvas->Brush->Color=clRed;
            pic2.bmp->Canvas->Ellipse(x-fs1,y-fs1,x+fs1,y+fs1); // red
            }
        if (c==0x000000FF)
            {
            pic2.bmp->Canvas->Pen  ->Color=clBlue;
            pic2.bmp->Canvas->Brush->Color=clBlue;
            pic2.bmp->Canvas->Ellipse(x-fs0,y-fs0,x+fs0,y+fs0); // blue
            }
        }
    }
//---------------------------------------------------------------------------
int  picture::ui_normalize(int sz=32)
    {
    if (xs<sz) return 0;
    if (ys<sz) return 0;
    int x,y,c,c0,c1,c00,c01,c10,c11,cavg;

    // compute average intensity in corners
    for (c00=0,y=0    ;y<sz;y++) for (x=0    ;x<sz;x++) c00+=p[y][x].dd; c00/=sz*sz; // top-left
    for (c01=0,y=0    ;y<sz;y++) for (x=xs-sz;x<xs;x++) c01+=p[y][x].dd; c01/=sz*sz; // top-right
    for (c10=0,y=ys-sz;y<ys;y++) for (x=0    ;x<sz;x++) c10+=p[y][x].dd; c10/=sz*sz; // bottom-left
    for (c11=0,y=ys-sz;y<ys;y++) for (x=xs-sz;x<xs;x++) c11+=p[y][x].dd; c11/=sz*sz; // bottom-right
    cavg=(c00+c01+c10+c11)/4;

    // normalize lighting conditions
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        // avg color = bilinear interpolation of corners colors
        c0=c00+(((c01-c00)*x)/xs);
        c1=c10+(((c11-c10)*x)/xs);
        c =c0 +(((c1 -c0 )*y)/ys);
        // scale to avg color
        if (c) p[y][x].dd=(p[y][x].dd*cavg)/c;
        }
    // compute min max intensities
    c0=p[0][0].dd; c1=c0; // init min/max from the first pixel (0 would break the min)
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
        {
        c=p[y][x].dd;
        if (c0>c) c0=c;
        if (c1<c) c1=c;
        }
    // maximize dynamic range <0,765>
    if (c1>c0) // avoid division by zero on flat images
     for (y=0;y<ys;y++)
      for (x=0;x<xs;x++)
       p[y][x].dd=((p[y][x].dd-c0)*765)/(c1-c0); // store the stretched value back
    return cavg;
    }
//---------------------------------------------------------------------------
void picture::rgb_smooth()
    {
    color   *q0,*q1;
    int     x,y,i;
    color   c0,c1,c2;
    if ((xs<2)||(ys<2)) return;
    for (y=0;y<ys-1;y++)
        {
        q0=p[y  ];
        q1=p[y+1];
        for (x=0;x<xs-1;x++)
            {
            c0=q0[x];
            c1=q0[x+1];
            c2=q1[x];
            // weighted average (2*current + right + down) / 4
            for (i=0;i<4;i++) q0[x].db[i]=WORD((WORD(c0.db[i])+WORD(c0.db[i])+WORD(c1.db[i])+WORD(c2.db[i]))>>2);
            }
        }
    }
//---------------------------------------------------------------------------

I use my own picture class for images, so some of its members are:

  • xs,ys size of image in pixels
  • p[y][x].dd is pixel at (x,y) position as 32 bit integer type
  • clear(color) - clears entire image
  • resize(xs,ys) - resizes image to new resolution
  • bmp - VCL encapsulated GDI Bitmap with Canvas access

I added the source for just the 2 relevant member functions (no need to copy the whole class here).
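For the MATLAB side (which, again, I do not use myself), the thresholding and recoloring part of filter() translates roughly to this untested sketch; I is the smoothed, lighting-normalized grayscale image as double, and the thresholds assume the same intensity scale as the C++ code:

tr0 = 320; tr1 = 125;                      % blue min / red max thresholds
fs0 = 5;   fs1 = 2;                        % blue / red font thickness
cmin = min(I(:)); cmax = max(I(:));
red  = (cmax - I) < tr1;                   % bright regions
blue = (I - cmin) < tr0 & ~red;            % dark regions (red wins, as in the C++)
red  = imdilate(red,  strel('disk', fs1)); % enlarge the strokes
blue = imdilate(blue, strel('disk', fs0));
out = zeros([size(I) 3]);                  % red/blue masks on a black background
out(:,:,1) = red;
out(:,:,3) = blue & ~red;
imshow(out);

The single-dot removal done by rgb_smooth could be approximated by running bwareaopen on both masks before the dilation.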

[edit3] LQ image

The best setting I found (code is the same):

int sz=32;          // [pixels] square size for corner avg color computation (c00..c11)
int fs0=2;          // blue [pixels] font thickness
int fs1=2;          // red  [pixels] font thickness
int tr0=52;         // blue min threshold
int tr1=0;          // red  max threshold

[image: low-quality example result]

Due to the lighting conditions the red area is unusable (turned off with tr1=0).

Spektre
  • Thanks a lot for the response. – Wajahat Jul 22 '15 at 14:43
  • Thanks a lot for the response. I have such results for high-resolution images, but it is still not sufficient for OCR. In order to use OCR, I need to know of any way of only filling the gaps in the stroke width of each character without adjacent characters merging into each other. I have applied dilation on your processed image with a circular disk of size 5 pixels or a line element, but I am still unable to get a binary map good enough for OCR. It does recognize characters but most of them are wrong. Best regards – Wajahat Jul 22 '15 at 15:01
  • Hi Spektre, any idea on how to suppress the shadows? – Wajahat Aug 06 '15 at 10:27
  • @Wajahat I do not think you want that because the shadows are the only recognizable feature in your image I can see... As mentioned in last edit the best thing would be to have 2 or 4 light positions (corners) and take image of the same tire (fixed camera) for each light position separately. Then extract the shadows and combine all images together (merging shadows) that should outline the characters more reliably – Spektre Aug 08 '15 at 11:26
  • Thanks for the response. I am using more than 2 light directions, but if the direction is more slanting (highly oblique angles) then the shadows become very significant introducing artifacts in the edge map. I believe some kind of shadow removal algorithm may help. – Wajahat Aug 09 '15 at 15:45
  • @Wajahat I would use just single light direction per image ... and selecting the edge of shadow that is in contact with character. then after the join you will obtain the character edges not the shadow itself – Spektre Aug 09 '15 at 15:50
  • Yes, I take one light direction per image. I am trying with min/max filtering but could not get results similar to yours above. Can you please share your piece of code for normalization and min/max filtering/thresholding? – Wajahat Aug 10 '15 at 17:11
  • @Wajahat added source in C++ (sorry I do not use Matlab) – Spektre Aug 11 '15 at 06:26
  • Thanks a lot for the code. I have implemented your steps in Matlab and have now got results similar to yours but with different thresholds than 320 and 125 (in my case they are tr0=400 and tr1=180). Can you please tell me how did you select the thresholds? Furthermore, the normalization seems very sensitive. Even if I change the ordering of corners from 1,2,3,4 to 1,3,2,4, the results with same thresholds are slightly different. Please check the images below. Looking forward to your feedback :) – Wajahat Aug 11 '15 at 17:28
  • @Wajahat 1. I choose tresholds manually (keyboard+mouse wheel) and stop on nicest result. If you need to automate this then you need to add adaptive tresholding which is a bit more complicated. The value of tresholds can be different because of: different color scale in image (different RGB to Grayscale conversion) and different smooth level etc . 2. You should not change the order of corners inside normalization it would invalidate the bilinear interpolation equations !!! what do you mean by `normalization seems very sensitive` ? it should just equalize the average color on the whole image – Spektre Aug 12 '15 at 06:35
  • Sorry for the vague explanation. By sensitive I meant that if the order of the corners is changed, it changes the normalization. But you answered it by saying that it will invalidate the bilinear interpolation equations. So I got my answer. Thanks – Wajahat Aug 12 '15 at 07:54
  • The image used so far was a high resolution one. Do you have ideas for processing low resolution images such as the one I added below? Can you please check your implementation of normalization/min-max filtering on this image to see if any combination of thresholds produce good results? – Wajahat Aug 12 '15 at 15:46
  • @Wajahat Do not add answers with additional info... edit your question instead and add the info there ... (I add the tag [edit1,2,3,4...] to mark changes in the text...) I do not have time for this right now ... when I do I will comment – Spektre Aug 12 '15 at 21:47
  • Thanks. I have made the corrections to the question above. – Wajahat Aug 13 '15 at 08:54
  • Yes. Anyway, thanks a lot for your time and efforts. – Wajahat Aug 14 '15 at 08:56

You could first apply a max-filter (assign to each pixel in a new image the maximum value from a neighborhood around the same pixel in the original image), then a min-filter (assign the minimum from the neighborhood in the max-image). Especially if you shape the neighborhood a bit wider than it is high (say, 2 or 3 pixels to the right/left, 1 pixel top/bottom), you should be able to recover some of your characters (your image appears to mainly show gaps in the horizontal direction).
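In MATLAB this could look like the following sketch (the 3x7 neighborhood is just an example; tune it to the gaps in your strokes):

nhood = ones(3, 7);               % 1 px top/bottom, 3 px left/right of center
maxImg = imdilate(I, nhood);      % max filter (grayscale dilation), I = grayscale input
result = imerode(maxImg, nhood);  % min filter applied to the max image

Note that this max-then-min sequence with the same neighborhood is exactly a grayscale closing (imclose(I, nhood)), which connects back to the erosion/dilation terminology from the question.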

Optimal neighborhood size and shape depend on your specific problem, so you'll have to experiment some. You might find this operation glues characters together - you'll possibly have to detect the blobs and split them if they're too wide compared to the other blobs.

edit: Also, binarization settings are absolutely key. Try several different binarization algorithms (Otsu, Sauvola, ...) to see which one (and which parameters) works best for you.
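For example, a sketch comparing global Otsu with a crude local-mean threshold (Sauvola is not built into MATLAB; the 25x25 window and offset 5 below are just hypothetical starting values):

bwOtsu = im2bw(I, graythresh(I));                               % global Otsu
localMean = imfilter(double(I), ones(25)/(25*25), 'replicate'); % 25x25 local mean
bwLocal = double(I) > localMean - 5;  % local threshold (flip the comparison for dark-on-bright text)
imshowpair(bwOtsu, bwLocal, 'montage');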

Daniel
  • Hi Daniel , Thanks a lot for the suggestion. But how is the max/min filtering different than standard erosion and dilation? – Wajahat Jul 21 '15 at 13:45
  • I think it's just different names for the same filter. The lingo in my company seems to prefer max/min (shorter...?). – Daniel Jul 29 '15 at 06:06