
I am not able to get good text-detection accuracy from Tesseract. Please check the code and images below.

    Mat imgInput = CvInvoke.Imread(@"D:\workspace\raw2\IMG_20200625_194541.jpg", ImreadModes.AnyColor);

    int kernel_size = 11;

    // Dilation
    Mat imgDilatedEdges = new Mat();
    CvInvoke.Dilate(
        imgInput,
        imgDilatedEdges,
        CvInvoke.GetStructuringElement(
            ElementShape.Rectangle,
            new Size(kernel_size, kernel_size),
            new Point(1, 1)),
        new Point(1, 1),
        1,
        BorderType.Default,
        new MCvScalar(0));

    // Median blur
    Mat imgBlur = new Mat();
    CvInvoke.MedianBlur(imgDilatedEdges, imgBlur, kernel_size);

    // Absolute difference against the blurred image (background estimate)
    Mat imgAbsDiff = new Mat();
    CvInvoke.AbsDiff(imgInput, imgBlur, imgAbsDiff);

    // Normalize to the full 0-255 range
    Mat imgNorm = imgAbsDiff;
    CvInvoke.Normalize(imgAbsDiff, imgNorm, 0, 255, NormType.MinMax, DepthType.Default);

    // Truncation threshold: values above 230 are clamped to 230
    // (the maxValue argument is ignored for ThresholdType.Trunc)
    Mat imgThreshhold = new Mat();
    double thresholdval = CvInvoke.Threshold(imgAbsDiff, imgThreshhold, 230, 0, ThresholdType.Trunc);

    // Normalize again and save
    CvInvoke.Normalize(imgThreshhold, imgThreshhold, 0, 255, NormType.MinMax, DepthType.Default);
    imgThreshhold.Save(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");

    // Contrast correction: CLAHE on the first channel of Lab
    Mat lab = new Mat();
    CvInvoke.CvtColor(imgThreshhold, lab, ColorConversion.Bgr2Lab);
    VectorOfMat colorChannelB = new VectorOfMat();
    CvInvoke.Split(lab, colorChannelB);
    CvInvoke.CLAHE(colorChannelB[0], 3.0, new Size(12, 12), colorChannelB[0]);

    // Merge the channels back (note: this CLAHE result is never used below;
    // the image loaded next is the threshold image saved above)
    Mat clahe = new Mat();
    CvInvoke.Merge(colorChannelB, clahe);

    Image<Bgr, byte> output = new Image<Bgr, byte>(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");
    Bitmap bmp = output.ToBitmap();

    // set the image to 300 dpi, since Tesseract prefers that
    bmp.SetResolution(300, 300);
    bmp.Save(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");

I am not getting the expected accuracy. Please check how the image is converted.

Source image: [image]

Converted image: [image]

[two additional sample images]

I have posted a few images above that you can refer to. For the first image I am getting garbage data; for the last two images I am getting partial data. Converting the image to grayscale and playing with the threshold gives better output.

I want to understand: if the threshold is the key part, how will I be able to get a dynamic threshold value for each new image? This is going to run as a service, so the user will simply pass an image and get the result. My app should be intelligent enough to process and understand the image.
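One standard way to get a per-image threshold automatically is Otsu's method, which picks the cut that maximizes the between-class variance of the grayscale histogram. EmguCV exposes it as the `ThresholdType.Otsu` flag (OpenCV: `cv2.THRESH_OTSU`), so no Python is actually needed in the C# pipeline; the sketch below is only a pure-Python illustration of the idea, not production code:

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for a flat iterable of 0-255 grayscale values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    sum_bg = 0.0          # running intensity sum of the background class
    weight_bg = 0         # running pixel count of the background class
    best_t, best_var = 0, -1.0
    for t in range(256):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # between-class variance; maximizing it separates the two modes
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

In EmguCV the equivalent one-liner would be along the lines of `CvInvoke.Threshold(gray, dst, 0, 255, ThresholdType.Binary | ThresholdType.Otsu)`, where the 0 passed as the threshold is ignored and the computed value is returned.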

Do I have to adjust the contrast and threshold more accurately? If yes, how will I do that? Or is the image itself faulty, i.e. is noise causing the problem?
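On the contrast question: the CLAHE call in the code above is a tiled, clip-limited variant of plain histogram equalization. Just to illustrate the remapping idea, here is a simplified global equalization sketch (this is not the CLAHE algorithm itself, and it uses the basic CDF form without the usual `cdf_min` correction):

```python
def equalize_hist(pixels):
    """Global histogram equalization: stretch intensities so the
    cumulative distribution of the output is roughly linear."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)

    cdf = 0
    lut = [0] * 256           # lookup table: old intensity -> new intensity
    for i in range(256):
        cdf += hist[i]
        lut[i] = round(255 * cdf / total)
    return [lut[p] for p in pixels]
```

A low-contrast run like `[100, 101, 102, 103]` gets spread across the full range, which is exactly what helps a fixed threshold later.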

Please let me know what I am doing wrong in the algorithm, or anything that will help me understand the issue. For anyone who knows: what are the ideal image-preprocessing steps for OCR?

I am using C#, EmguCV, and Tesseract. Any suggestion will be highly appreciated.

manthan
  • For dynamic threshold values, you can try `cv2.adaptiveThreshold` instead of simple binary threshold. – ZdaR Jul 28 '20 at 05:00
  • When I crop some part of the 1st image it works fine, but I couldn't understand why it doesn't work on the whole image. The 2nd and 3rd images are also working with about 80% accuracy. – Yunus Temurlenk Jul 28 '20 at 12:17
  • @Yunus Temurlenk have you used my code? Please suggest which algorithm you have used. – manthan Jul 28 '20 at 14:10
  • @ZdaR I have used `cv2.adaptiveThreshold` but it is not giving the proper output. I want to know how to determine the block size and constant in `cv2.adaptiveThreshold` dynamically, since the user is not going to play with values all the time to get a result. – manthan Jul 28 '20 at 15:15
  • Actually you shouldn't play around with the block size much, just resize all images to say 1000 pixel width, preserving the aspect ratio and a single block size should work for all of them. Obviously you need to tune the block size, but once all the images are normalized(resized), you can easily arrive at a single block value which fits for most of the cases. – ZdaR Jul 29 '20 at 04:30
  • Also you can take some learning from this [thread](https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy) – ZdaR Jul 29 '20 at 04:32
  • @ZdaR can you please tell me whether any changes are required in my algorithm? – manthan Jul 29 '20 at 06:07
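The adaptive threshold ZdaR suggests computes a separate threshold per pixel from its local neighborhood, which is why one block size can work across images once they are resized to a common width. A rough pure-Python sketch of the mean variant (what `cv2.adaptiveThreshold` with `ADAPTIVE_THRESH_MEAN_C` does; the real implementation uses a fast box filter rather than this naive loop):

```python
def adaptive_mean_threshold(img, block=11, c=2):
    """img: 2D list of 0-255 grayscale values. A pixel becomes 255 when it
    exceeds the mean of its block x block neighborhood minus constant c."""
    h, w = len(img), len(img[0])
    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc, n = 0, 0
            # accumulate the in-bounds neighborhood around (x, y)
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx]
                        n += 1
            out[y][x] = 255 if img[y][x] > acc / n - c else 0
    return out
```

Because each pixel is compared only against its own surroundings, uneven lighting across the page stops mattering, which is the main failure mode of a single global threshold on phone-camera photos.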

0 Answers