I am trying to detect whether a G4-compressed TIFF image will produce good OCR output. Currently, dithered TIFFs yield poor OCR results, so before I send a TIFF to the OCR engine, I would like to determine whether the image is dithered. Any dithering in these TIFFs was performed by Ghostscript.

Is there an algorithm to determine if an image is dithered?

Sid
Britt
  • An image being dithered is not a discrete property; you most likely need to speak of a *level of certainty that an image is dithered*. Ironically, I would expect an OCR engine to provide that value. – bohdan_trotsenko Aug 17 '15 at 19:23
  • I don't understand your comment. I have many TIFF G4 files that were produced by Ghostscript. Those TIFF G4 files are the input to PrimeOCR. If they are dithered, I get poor results. Therefore, I would like to detect, before the OCR engine runs, whether a TIFF G4 is dithered. – Britt Aug 17 '15 at 19:44
  • One thing that dithering (and speckles or noise) does to a TIFF G4 is cause poor compression. To get a decent level of certainty, check the compression ratio: a well-compressed TIFF G4 is usually between 10:1 and 20:1, so a possible test is whether the compression ratio is below 10:1. – BitBank Aug 18 '15 at 10:13
  • Interesting Q/A: http://stackoverflow.com/questions/8960462/how-can-i-measure-image-noise – Harald K Aug 20 '15 at 08:09
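
The compression-ratio check suggested in BitBank's comment could be sketched roughly as follows. This is only an illustration under assumptions: the function names `g4_compression_ratio` and `looks_dithered` are hypothetical, the uncompressed size is taken to be the raw 1-bit-per-pixel bitmap with rows padded to whole bytes, and the compressed size is whatever byte count you measure (for a single-page file, the file size is a reasonable approximation, though it includes a little TIFF header overhead). The 10:1 threshold comes from the comment and probably needs tuning on real scans.

```python
def g4_compression_ratio(width, height, compressed_bytes):
    """Return raw-bitmap size divided by compressed size for a 1-bit image.

    width, height: image dimensions in pixels.
    compressed_bytes: size of the G4-compressed data in bytes
    (assumption: the caller measures this, e.g. from the file size).
    """
    if width <= 0 or height <= 0 or compressed_bytes <= 0:
        raise ValueError("width, height and compressed_bytes must be positive")
    # One bit per pixel, each row padded up to a whole number of bytes.
    row_bytes = (width + 7) // 8
    raw_bytes = row_bytes * height
    return raw_bytes / compressed_bytes


def looks_dithered(width, height, compressed_bytes, threshold=10.0):
    """Heuristic only: flag the image when the ratio falls below threshold.

    A low ratio can also be caused by speckles or noise, so this gives a
    level of suspicion rather than a definite answer.
    """
    return g4_compression_ratio(width, height, compressed_bytes) < threshold


if __name__ == "__main__":
    # Hypothetical example: a 2550 x 3300 pixel page (letter size at 300 dpi).
    for size in (60_000, 200_000):
        ratio = g4_compression_ratio(2550, 3300, size)
        print(size, round(ratio, 2), looks_dithered(2550, 3300, size))
```

Pages flagged this way could be routed to a cleanup step or manual review instead of going straight to the OCR engine.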
