I'm having trouble converting a single page pdf (CMYK) to a jpg (RGB). When I use the code below, the colors in the jpg image are garish. I've tried reading through the Wand docs, but haven't found anything to simply replicate the original image. The official ImageMagick docs themselves are still rather opaque to me. For my situation, it is necessary to do this within a python script.

Below is the relevant code snippet.

from wand.image import Image

# HOME and outFileName are defined elsewhere in the script
with Image(filename=HOME + outFileName + ".pdf", resolution=90) as original:
    original.format = "jpeg"
    original.crop(width=500, height=500, gravity="center")
    original.save(filename=HOME + outFileName + ".jpg")

How can I accurately convert from CMYK to RGB?

UPDATE: Here are links to a sample pdf and its converted output.

Original PDF

Converted to JPG

Christopher Perry
  • Sorry, I don't know Wand. And even doing this directly in ImageMagick looks a bit tricky, as [this question](http://stackoverflow.com/q/18243340/4014959) shows. – PM 2Ring Oct 13 '15 at 11:44
  • I don't have many CMYK PDFs lying around to test on, but can you try the following at the command line: `convert someCMYK.pdf a.jpg`, then `convert someCMYK.pdf -colorspace sRGB b.jpg`, then `convert someCMYK.pdf -negate c.jpg`, and see if any of [abc].jpg look good to you? – Mark Setchell Oct 13 '15 at 12:19
  • Can you post what "garish" you're experiencing vs expecting? Using the same code posted with test PDF from [ocp.de](http://www.ocp.de/support/test-charts) yields correct CMYK to RGB conversion. – emcconville Oct 13 '15 at 15:42
  • I updated the post to include samples. Unfortunately, the code posted above didn't produce anything different from the sample conversion, other than one command inverting the colors. – Christopher Perry Oct 16 '15 at 04:16

2 Answers

This script converts the image to RGB and saves it in place if it detects that the image is in CMYK mode:

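from PIL import Image

# path_to_image is the path of the file to check
image = Image.open(path_to_image)
if image.mode == 'CMYK':
    image = image.convert('RGB')
    image.save(path_to_image)  # overwrite the original file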
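
For context on why a plain mode conversion can still look off: without an ICC profile, CMYK to RGB is usually approximated with simple arithmetic rather than real color management. Here is a minimal sketch of that kind of approximation, assuming 0-255 channel values where 255 means full ink. The function name is hypothetical and this is not claimed to be the exact code any library runs.

```python
def naive_cmyk_to_rgb(c, m, y, k):
    """Profile-free approximation; channels are 0-255, 255 = full ink."""
    # Each RGB channel is what remains after its ink and black are subtracted.
    r = round((255 - c) * (255 - k) / 255)
    g = round((255 - m) * (255 - k) / 255)
    b = round((255 - y) * (255 - k) / 255)
    return r, g, b


print(naive_cmyk_to_rgb(0, 0, 0, 0))      # no ink at all
print(naive_cmyk_to_rgb(0, 255, 255, 0))  # full magenta and yellow
```

If the stored values are inverted (0 meaning full ink), the same arithmetic produces badly wrong colors, which may match the "garish" output described in the question.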
Temak

I finally solved this problem. A CMYK-mode JPG image embedded in a PDF must be inverted.

PIL does not support inverting a CMYK-mode image, so I solved it using numpy.

The full source is at the link below. https://github.com/Gaia3D/pdfImageExtractor/blob/master/extrectImage.py

See lines 166-170.

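import numpy as np
from PIL import Image

# img is an already opened CMYK-mode PIL image
imgData = np.frombuffer(img.tobytes(), dtype='B')
# Invert every channel value: 255 - value
invData = np.full(imgData.shape, 255, dtype='B')
invData -= imgData
img = Image.frombytes(img.mode, img.size, invData.tobytes())
img.save(outFileName + ".jpg")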
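
The numpy step above amounts to computing 255 - value for every byte. As a rough illustration, here is a dependency-free sketch of the same inversion on raw bytes, such as those returned by img.tobytes(). The helper name is hypothetical and is not part of the linked source.

```python
def invert_channels(data: bytes) -> bytes:
    """Return a copy of data with every byte replaced by 255 - byte."""
    return bytes(255 - value for value in data)


# One CMYK pixel: C=0, M=255, Y=255, K=0
pixel = bytes([0, 255, 255, 0])
print(list(invert_channels(pixel)))
```

Applying the function twice gives back the original bytes, which is a quick sanity check that nothing is clipped.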
jwpfox