
I have a script that processes dozens of image files (using Pillow). Recently, I've noticed that my script fails with TIF (CMYK/16) format. So I've created a test case.

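import os
from PIL import Image

img_dir = "images"  # placeholder: folder that contains the test files

images = [
    "cmyk-8.tif",
    "cmyk-16.tif",
    "rgb-8.tif",
    "rgb-16.tif",
]

for img_name in images:
    path = os.path.join(img_dir, img_name)
    try:
        img = Image.open(path)
    except OSError as e:
        # Pillow raises an OSError subclass when it cannot identify the file
        print(e)
    else:
        print("success: " + img_name)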

This produces the following output:

success: cmyk-8.tif
cannot identify image file '...\\cmyk-16.tif'
success: rgb-8.tif
success: rgb-16.tif

So the problem is definitely with the TIF (CMYK/16) format.
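
One way to double-check which file is which, independent of Pillow, is to read the photometric interpretation and bit depth straight from the TIFF header. The following is only a sketch under assumptions: it handles classic (non-BigTIFF) files, inspects just the first image directory, and `describe_tiff` is a hypothetical helper name, not part of any library.

```python
import struct

# Tag numbers and photometric codes as defined in the TIFF 6.0 specification.
TAG_BITS_PER_SAMPLE = 258
TAG_PHOTOMETRIC = 262
PHOTOMETRIC_NAMES = {0: "WhiteIsZero", 1: "BlackIsZero", 2: "RGB", 3: "Palette", 5: "CMYK"}
TYPE_SHORT = 3  # 16-bit unsigned field type


def describe_tiff(data):
    """Return (photometric name, bits per sample) for classic TIFF bytes.

    Hypothetical helper: only the first image directory is inspected.
    """
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack_from(order + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    (entry_count,) = struct.unpack_from(order + "H", data, ifd_offset)
    found = {}
    for i in range(entry_count):
        pos = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        tag, field_type, count = struct.unpack_from(order + "HHI", data, pos)
        if tag not in (TAG_BITS_PER_SAMPLE, TAG_PHOTOMETRIC) or field_type != TYPE_SHORT:
            continue
        if count * 2 <= 4:
            value_pos = pos + 8  # values fit inside the entry itself
        else:
            (value_pos,) = struct.unpack_from(order + "I", data, pos + 8)
        found[tag] = struct.unpack_from(order + "%dH" % count, data, value_pos)

    code = found.get(TAG_PHOTOMETRIC, (None,))[0]
    return PHOTOMETRIC_NAMES.get(code, "unknown"), found.get(TAG_BITS_PER_SAMPLE, ())
```

For a file on disk, the bytes could be passed in with `describe_tiff(open(path, "rb").read())`; a CMYK/16 file would be expected to report "CMYK" with four 16-bit samples.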

How can I open this specific format, or first convert it to a format Pillow can open (that is, RGB/8, RGB/16, or CMYK/8) and then open it?


In another Q&A, GDAL was suggested to solve the problem. I tried it (installing GDAL, associating it with Python, and making it work with the current script), but eventually gave up because it was too problematic. So I've decided to focus on gdal_translate alone.

I've installed "gdal-203-1911-x64-core.msi" from GISInternals, and tried to do the conversion:

"C:\Program Files\GDAL\gdal_translate.exe" -scale -ot byte -of JPEG "C:\Users\%username%\Documents\GitHub\dump\python\tif-cmyk-16\images\cmyk-16.tif" "cmyk-16.jpg"

but it didn't work. The conversion came out wrong:

[Image: output comparison]

I'm not familiar with GDAL, so I must be doing something wrong. How do I get it to do the conversion correctly?

Also, this is the cmd output:

ERROR 1: Can't load requested DLL: C:\Program Files\GDAL\gdalplugins\ogr_MSSQLSpatial.dll
126: The specified module could not be found.

ERROR 1: Can't load requested DLL: C:\Program Files\GDAL\gdalplugins\ogr_MSSQLSpatial.dll
126: The specified module could not be found.

Input file size is 200, 200
0...10...20...30...40...50...60...70...80...90...100 - done.
Press any key to continue . . .

It seems something is missing, and I don't know whether the incorrect conversion is related to this.

Related scripts and output files can be found here.

akinuri
  • Have you tried filing a bug report on Pillow? Looking at the source and documentation, it does not look like this format is supported. So... libtiff? – Kamiccolo Jun 08 '18 at 12:56
    @Kamiccolo I think it's a WIP. Martijn posted a link to this [page](https://github.com/python-pillow/Pillow/issues/1888) in his [answer](https://stackoverflow.com/a/48950889/2202732). So it seems like it's documented. – akinuri Jun 08 '18 at 13:18

1 Answer

Seems like the other images are working, so I will focus on the 16 bit cmyk tif conversion to 8 bit rgb jpeg. I guess this approach will apply for other conversions too with a few tweaks.

Here are a few ways to convert it. The first two use GDAL since you suggested that; the last approach uses libtiff, which I think is slightly more appropriate for your use case.

GDAL from command line

From pure command line I got it working with two commands and an intermediate tif.

First convert the 16 bit tif to an 8 bit tif

gdal_translate -scale -ot byte -of GTIFF cmyk-16.tif cmyk-out.tif -co PHOTOMETRIC=CMYK

The file cmyk-out.tif is now in 8 bit. It can now be converted into jpeg with the following command

gdal_translate -of JPEG cmyk-out.tif cmyk-out.jpg

Hence you can just create a batch script chaining the two commands (and maybe deleting the intermediate tif)
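
If a Python wrapper is preferred over a batch file, the same chaining can be sketched with `subprocess`. This is only an illustration: `build_commands` and `convert` are hypothetical helper names, the intermediate file name is reused from above, and it assumes `gdal_translate` is on the PATH (otherwise pass the full path as `exe`).

```python
import os
import subprocess


def build_commands(src, dst_jpeg, tmp_tif="cmyk-out.tif", exe="gdal_translate"):
    """Return the two gdal_translate invocations from the answer as argument lists."""
    to_8bit = [exe, "-scale", "-ot", "byte", "-of", "GTIFF",
               "-co", "PHOTOMETRIC=CMYK", src, tmp_tif]
    to_jpeg = [exe, "-of", "JPEG", tmp_tif, dst_jpeg]
    return [to_8bit, to_jpeg]


def convert(src, dst_jpeg, tmp_tif="cmyk-out.tif", exe="gdal_translate"):
    """Run both steps in order and delete the intermediate TIFF afterwards."""
    try:
        for cmd in build_commands(src, dst_jpeg, tmp_tif, exe):
            # check=True raises CalledProcessError if a step exits non-zero
            subprocess.run(cmd, check=True)
    finally:
        if os.path.exists(tmp_tif):
            os.remove(tmp_tif)
```

Calling `convert("cmyk-16.tif", "cmyk-out.jpg")` would then run both steps and clean up the temporary file even if the second step fails.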

GDAL (and numpy and PIL) from python

If the problem seems to be that the image can not be opened by PIL, you could use GDAL for the opening, numpy for the conversion to 8 bit and PIL for the conversion to RGB.

from osgeo import gdal
import numpy as np
from PIL import Image

def reduceDepth(image, display_min, display_max):
    # Linearly map [display_min, display_max] onto the 8-bit range 0..255.
    image = image - display_min
    image = np.floor_divide(image, (display_max - display_min + 1) / 256)
    image = image.astype(np.uint8)
    return image

raster = gdal.Open("cmyk-16.tif")
c = raster.GetRasterBand(1).ReadAsArray()
m = raster.GetRasterBand(2).ReadAsArray()
y = raster.GetRasterBand(3).ReadAsArray()
k = raster.GetRasterBand(4).ReadAsArray()

cmyk = np.dstack((c, m, y, k))
cmyk8 = reduceDepth(cmyk, 0, 65535)  # 16-bit samples span 0..65535

# Without an explicit mode, a 4-channel uint8 array is treated as RGBA.
# This assumes GDAL returned the raw C, M, Y, K bands.
im = Image.fromarray(cmyk8, mode="CMYK")
im = im.convert('RGB')
im.save("cmyk-out.jpeg")
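
The depth reduction is just a linear rescale: each 16-bit sample in 0..65535 is floor-divided by 65536 / 256 = 256 so that it lands in 0..255. A tiny pure-Python check of that arithmetic (`to_8bit` is a hypothetical name used only for this illustration):

```python
def to_8bit(value16):
    """Map a 16-bit sample (0..65535) onto the 8-bit range (0..255)."""
    return value16 * 256 // 65536


for v in (0, 255, 256, 32768, 65535):
    print(v, "->", to_8bit(v))
# 0 -> 0
# 255 -> 0
# 256 -> 1
# 32768 -> 128
# 65535 -> 255
```

If the divisor has the wrong sign or magnitude, the results fall outside 0..255 and wrap around when cast to uint8, which shows up as badly distorted colors.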

Using libtiff instead of GDAL

You can also use libtiff to open the tif instead of gdal. Then the above script will look like this

import numpy as np
from PIL import Image
from libtiff import TIFF

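# Named "data" rather than "input" to avoid shadowing the built-in.
data = TIFF.open("cmyk-16.tif").read_image()

def reduceDepth(image, display_min, display_max):
    # Linearly map [display_min, display_max] onto the 8-bit range 0..255.
    image = image - display_min
    image = np.floor_divide(image, (display_max - display_min + 1) / 256)
    image = image.astype(np.uint8)
    return image

v8 = reduceDepth(data, 0, 65535)  # 16-bit samples span 0..65535

# Without an explicit mode, a 4-channel uint8 array is treated as RGBA.
# This assumes read_image() returned shape (height, width, 4) in C, M, Y, K order.
im = Image.fromarray(v8, mode="CMYK")
im = im.convert('RGB')
im.save("cmyk-out.jpeg")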
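
Note that `convert('RGB')` applies a fixed formula and does not consult an ICC profile, so colors can differ from a profile-based conversion (for that, Pillow's `ImageCms` module is the usual route). To show what a profile-less conversion amounts to, here is a sketch of a common naive formula; it is an assumption for illustration and not necessarily the exact formula Pillow uses, and `naive_cmyk_to_rgb` is a hypothetical name.

```python
def naive_cmyk_to_rgb(c, m, y, k):
    """Convert 8-bit CMYK ink amounts (0 = no ink, 255 = full ink) to 8-bit RGB.

    Each channel is the light left after the colored ink and the black ink
    have each absorbed their share: (255 - ink) * (255 - k) / 255.
    """
    r = (255 - c) * (255 - k) // 255
    g = (255 - m) * (255 - k) // 255
    b = (255 - y) * (255 - k) // 255
    return r, g, b


print(naive_cmyk_to_rgb(0, 0, 0, 0))      # (255, 255, 255): no ink is white
print(naive_cmyk_to_rgb(255, 0, 0, 0))    # (0, 255, 255): full cyan
print(naive_cmyk_to_rgb(0, 0, 0, 255))    # (0, 0, 0): full black
```

A formula like this knows nothing about the inks or paper the file was prepared for, which is the information an ICC profile supplies.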
the_cheff
  • I wouldn't want to introduce more external libraries, because things get complicated then. So I'll stick with the command line. The first conversion command you provided converts it to CMYK/8, so it's not RGB (this is not a problem). Also, the input and output images are [not the same](https://github.com/akinuri/dump/tree/master/python/tif-cmyk-16/gdal/no-python/GISInternals). I guess that's because it does the conversion without a color profile. In Pillow, I'm using color profiles (.icc, icm) for the correct conversion. Is it possible to use these with gdal_translate? – akinuri Jun 09 '18 at 12:07
  • OK, that is my mistake. I just looked at gdalinfo, which tells me that the bands are R,G,B,A and not C,M,Y,K, so I expected it to be in RGB. I will update the answer. With GDAL, I guess you can use -co SOURCE_ICC_PROFILE=[YourProfile] – the_cheff Jun 09 '18 at 12:19
  • However approach 2 and 3 is using pillow to do the conversion, so they might work for you right? I am quite sure that GDAL depends on numpy, so it might not be that you need to install more external stuff. – the_cheff Jun 09 '18 at 12:37
  • I've tried gdal_translate [with a profile](https://github.com/akinuri/dump/tree/master/python/tif-cmyk-16/gdal/no-python/GISInternals), but couldn't make it work. Also tested libtiff, and it produced even weirder [output](https://github.com/akinuri/dump/tree/master/python/tif-cmyk-16/libtiff). In the previous question, someone suggested pgmagick. I [tested](https://github.com/akinuri/dump/tree/master/python/tif-cmyk-16/pgmagick-image-open-test) it yesterday, and it was able to open the problematic tif file. I need further tests, but if all fails, I'll go with it. – akinuri Jun 09 '18 at 13:04
  • Ok, i hope that you solve your problem. Just out of curiosity, can you show me how the output jpeg should look if everything was ok? [This is the jpeg i get from approach 1](https://ibb.co/mJF5L8) (Based on the folders in you Git repo it does not look like we get the same?) – the_cheff Jun 10 '18 at 04:41
  • By the way, I just looked at the source ICC profile, and I can see that it is actually embedded in your TIFF file's metadata, so if you are using a GDAL version newer than 1.11, you should already be using it out of the box – the_cheff Jun 10 '18 at 04:55
  • Indeed we get different outputs. Another weird thing is that when I download the output image you shared on ibb.co, it looks different on my PC. See the [difference](https://ibb.co/b8aHYT). Also, just to be on the same page, I'll update the repo tomorrow with the actual tif file that I need to process, and show you the difference between conversions. Working on some random image (Cornell Box, in this case) might be misleading. I don't have access to the actual file atm, so I'm gonna have to do it tomorrow. – akinuri Jun 10 '18 at 13:04
  • Updated repo again. Here's an example [conversion](https://github.com/akinuri/dump/tree/master/python/cmyk-to-rgb) (with and without profile) in Pillow. There's a significant difference. And here's the gdal [conversion](https://github.com/akinuri/dump/tree/master/python/tif-cmyk-16/gdal/no-python/GISInternals) of the problematic tif file. I've noticed that I can't rely on Windows Photo Viewer to view the image colors correctly. If you open those three "delta" images in Photoshop and compare their histograms, you can see the difference. GDAL's output is warmer. – akinuri Jun 11 '18 at 08:13
  • While the difference in these images (GDAL outputs) is somewhat unnoticeable, I can't assume it'll always be like that. It might differ from image to image. I gave libtiff another go with the actual image and still its [output](https://github.com/akinuri/dump/tree/master/python/tif-cmyk-16/libtiff) is weird. – akinuri Jun 11 '18 at 08:22