21

I need to increase the DPI of my image before running OCR on it with OpenCV. The problems are:

  1. I do not know the dpi of my image right now
  2. I do not know how to increase the dpi of an image

I searched on Google, and almost every answer suggests using cv2.resize:

import cv2

image = cv2.imread("source.png")
resized_image = cv2.resize(image, (100, 50))  # I need to change it to 300 DPI

resize only changes the pixel dimensions of the image; it does not increase the DPI. I tried it and then checked the result in Photoshop: the DPI had not changed.

How can I do this with OpenCV?

I need to change the DPI to 300. Why do I need to know the current DPI? Because if it is already above 300, I do not need to convert the image.

I am using Python.

Nic3500
yozawiratama
  • Possible duplicate of [Change dpi of an image in OpenCV](https://stackoverflow.com/questions/10860969/change-dpi-of-an-image-in-opencv) – Sraw May 24 '18 at 08:20

2 Answers

7

The dpi is just a number in the JPEG/TIFF/PNG header. It is entirely irrelevant to the world and his dog until you print the image and then it determines how large the print will be given the image's dimensions in pixels.

During image processing, it is irrelevant. The only thing of any interest is the number of pixels you have. That is the ultimate determinant of image quality, or information content - however you want to describe it.

I don't believe you can set it with OpenCV. You can certainly set it with ImageMagick like this in the Terminal:

mogrify -set density 300 *.png           # v6 ImageMagick
magick mogrify -set density 300 *.png    # v7 ImageMagick

You can check it with:

identify -format "Density: %x x %y" SomeImage.jpg    # v6 ImageMagick
magick identify -format ... as above                 # v7 ImageMagick

You can do similar things with exiftool in Terminal - note that exiftool is MUCH smaller and easier to maintain than ImageMagick because it is "just" a (very capable) single Perl script:

Extract image resolution from EXIF IFD1 information:

exiftool -IFD1:XResolution -IFD1:YResolution image.jpg

Extract all tags with names containing the word "Resolution" from an image:

exiftool '-*resolution*' image.jpg

Set X/Y Resolution (density) on image.jpg:

exiftool -xresolution=300 -yresolution=300 image.jpg

Here is a little demonstration of what I mean at the beginning of my answer...

Use ImageMagick to create an image 1024x768 with no dpi information:

convert -size 1024x768 xc:black image.jpg

Now examine it:

identify -verbose image.jpg

Image: image.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Mime type: image/jpeg
  Class: PseudoClass
  Geometry: 1024x768+0+0
  Units: Undefined
  Colorspace: Gray
  Type: Bilevel
  ...
  ...

Now change the dpi and set the dpi units and examine it again:

mogrify -set density 300 -units pixelsperinch image.jpg   # Change dpi

identify -verbose image.jpg                               # Examine

Image: image.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Mime type: image/jpeg
  Class: PseudoClass
  Geometry: 1024x768+0+0            <--- Number of pixels is unchanged
  Resolution: 300x300               <---
  Print size: 3.41333x2.56          <--- Print size is now known
  Units: PixelsPerInch              <---
  Colorspace: Gray
  Type: Bilevel
  ...
  ...

And now you can see that suddenly we know how big a print will come out and that the number of pixels has not changed.
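The print size in that output is just the pixel dimensions divided by the density. A tiny Python sketch of the arithmetic (the function name `print_size_inches` is only for illustration):

```python
def print_size_inches(width_px, height_px, dpi):
    """Return (width, height) in inches for an image printed at `dpi`."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return width_px / dpi, height_px / dpi


# Same pixels as the demonstration above, tagged at 300 dpi.
width_in, height_in = print_size_inches(1024, 768, 300)
print(f"{width_in:.5f} x {height_in:.2f} in")  # 3.41333 x 2.56 in

# Re-tagging the same pixels at 72 dpi only changes the physical size.
print(print_size_inches(1024, 768, 72))
```

The pixel count never appears on the left-hand side: changing the dpi changes the inches, not the image data.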

Mark Setchell
  • Can I use OpenCV and ImageMagick in the same code? I mean, I load and process the image with OpenCV, and after that change the DPI using ImageMagick. – yozawiratama May 24 '18 at 08:25
  • Sure, you could just use `os.system()` in Python with the `mogrify -set density 300 XYZ.png` command after you have finished processing and saving your images. There may be other libraries in Python - I am just checking - maybe `exiv2`, which would be more natural to use. – Mark Setchell May 24 '18 at 08:26
  • I have not used the Python bindings, but maybe `exiv2` https://python3-exiv2.readthedocs.io/en/latest/tutorial.html?highlight=resolution – Mark Setchell May 24 '18 at 08:28
  • 7
    `It is entirely irrelevant to the world and his dog until you print the image` - Are you kidding ? Try to look at same image in front of PC monitor and cinema screen. Hope that you'll find the differences. DPI is important in _every_ graphical device, including screens. – Agnius Vasiliauskas May 24 '18 at 08:35
  • Thanks for your help, but I need to process it without going through an image file, so that after `cv2.imread("image.png")` I can change the resolution in code or in memory without saving. `mogrify -set density 300 XYZ.png` has to be run on an image file, not on OpenCV image data. – yozawiratama May 24 '18 at 08:35
  • @AgniusVasiliauskas Ok, I should maybe have said *output* rather than *print*. By the same token, maybe you could try looking at a 16px by 8px image with the density set to 72dpi and the density set to 600dpi and you will equally notice there is no more information as a result of changing the dpi. – Mark Setchell May 24 '18 at 08:40
  • **OpenCV** doesn't store the dpi when it loads an image - it is not interesting. How/where do you think OpenCV is passing the dpi to OCR? – Mark Setchell May 24 '18 at 08:43
  • @MarkSetchell Agree that passing same image to different DPI devices wins no additional information. But ... looking at 16px by 8px image in front of cinema screen will reveal pixelation effect more strongly, because cinema screen's pixels are bigger in terms of inches (related with inverse measure of DPI - "inches per dot") – Agnius Vasiliauskas May 24 '18 at 09:00
  • @AgniusVasiliauskas Likewise agreed :-) – Mark Setchell May 24 '18 at 09:01
  • Please show your code - specifically how/where you pass an image from OpenCV to whatever OCR code you are using. Do you call `imencode()` or something that makes your OpenCV `Mat` into an image for your OCR? If so, maybe you can encode it into `NetPBM/PPM/PNM` format which doesn't include a dpi and your OCR may be happier with it because it can't check ;-) – Mark Setchell May 24 '18 at 09:09
  • You need to know the DPI to convert the pixels into real life measurements (inches, cms). – Sahil Sep 10 '20 at 05:10
3

Even though this is an old post I just wanted to say that Tesseract has been tested and found to operate better when the height of the characters is around 30 pixels. Please check the following link:

  • 4
    The article you link to explicitly says *"The OCR error rate was most strongly correlated to the height of a capital letter in pixels, regardless of dpi or point size"*, and also *" `xres` and `yres` values in the PIX structure passed to Tesseract have virtually no impact"*... thereby confirming that the DPI does not matter, just that the height of the characters should be around 30 pixels - **regardless of the dpi**. – Mark Setchell Aug 20 '19 at 12:07
  • I am using CV2 to preprocess images for OCR, and I too need to somehow convert the effective DPI of an image. If I have a piece of paper with a capital letter that is 1/4" tall, and I scan it on a scanner at 200 DPI, I will get 50 pixels for my capital letter. I need to change the image so that the letter is at most 30 pixels tall. So when we say we want to change the DPI of the image, we are really saying we want to cut the number of pixels by a percentage so that our OCR is better. So how do we do that with CV2? – Doug Bower Apr 10 '22 at 07:19
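
A rough sketch of the rescaling described in that last comment, assuming you already know (or have measured) the capital-letter height in pixels. The 30-pixel target is the figure quoted in the answer above; the helper names are hypothetical, and choosing `INTER_AREA` for shrinking and `INTER_CUBIC` for enlarging is a common OpenCV convention rather than something the thread prescribes.

```python
def cap_height_px(cap_height_inches, scan_dpi):
    """Pixel height of a capital letter of known physical height at a scan dpi."""
    return cap_height_inches * scan_dpi


def ocr_scale_factor(current_cap_px, target_cap_px=30.0):
    """Factor by which to resize so capitals end up `target_cap_px` tall."""
    if current_cap_px <= 0:
        raise ValueError("current_cap_px must be positive")
    return target_cap_px / current_cap_px


def scaled_size(width_px, height_px, scale):
    """New (width, height) in whole pixels, never smaller than 1x1."""
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


def resize_for_ocr(image, current_cap_px, target_cap_px=30.0):
    """Resize an OpenCV image (NumPy array) so capitals are about target_cap_px tall."""
    import cv2  # imported lazily so the helpers above work without OpenCV

    scale = ocr_scale_factor(current_cap_px, target_cap_px)
    height, width = image.shape[:2]  # OpenCV arrays are (rows, cols, ...)
    interpolation = cv2.INTER_AREA if scale < 1 else cv2.INTER_CUBIC
    # cv2.resize takes the target size as (width, height).
    return cv2.resize(image, scaled_size(width, height, scale),
                      interpolation=interpolation)


# Example from the comment: a 1/4" capital scanned at 200 DPI.
current = cap_height_px(0.25, 200)   # 50.0 pixels
scale = ocr_scale_factor(current)    # 0.6
print(current, scale, scaled_size(1700, 2200, scale))
```

With an image loaded by `cv2.imread`, the call would be `resize_for_ocr(image, current_cap_px=50)`. This changes the number of pixels, which is what the OCR engine sees; the dpi tag in the file header plays no part in it.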