I found something odd while converting a PDF to JPEG, and I'd like to find out whether it is a small bug. In the converted JPG below, the background is entirely black. The image is here: www.shdowin.com/public/02.jpg

However, in the source PDF the background is plain white. The image is here: www.shdowin.com/public/normal.jpg

I thought this might be the fault of my PDF file, but when I use Acrobat.pdf2image in a .NET environment, the converted JPG is rendered correctly.

Here is my code:

from wand.image import Image
from wand.color import Color
import os, os.path, sys

def pdf2jpg(source_file, target_file, dest_width, dest_height):
    RESOLUTION    = 300
    ret = True
    try:
        with Image(filename=source_file, resolution=(RESOLUTION,RESOLUTION)) as img:
            img.background_color = Color('white')
            img_width = img.width
            ratio     = float(dest_width) / img_width  # float division, also under Python 2
            img.resize(dest_width, int(ratio * img.height))
            img.format = 'jpeg'
            img.save(filename = target_file)
    except Exception as e:
        ret = False

    return ret

if __name__ == "__main__":
    source_file = "./02.pdf"
    target_file = "./02.jpg"

    ret = pdf2jpg(source_file, target_file, 1895, 1080)

Any suggestions for the issue?

I have uploaded the pdf to the url: 02.pdf

You can try...

cendy
  • Which versions of ImageMagick and Ghostscript do you have installed? Ghostscript does all the work in this case. Can you check whether Ghostscript rasterizes your PDF with or without errors? Version 9.10 works OK with your file. – user2846289 Dec 08 '13 at 09:54
  • Thanks VadimR. I tested what you suggested several times and it still does not render correctly in this case. My ImageMagick version is 6.8.7-4 (32-bit), and I tested Ghostscript 9.10 and 9.07; neither works. Could you tell me which ImageMagick version you use? That would be helpful. Thanks! – cendy Dec 16 '13 at 02:57
  • IM's version is `6.7.7-10 2013-09-10 Q16`; I'm currently on Ubuntu 13.10. When you say "both do not work", do you mean that `gs 02.pdf` and `convert 02.pdf 02.jpg` give you an incorrect picture? – user2846289 Dec 16 '13 at 07:42
  • Hi VadimR, thanks again. By "both do not work" I mean that Ghostscript 9.10 on Windows and gs 9.07 on Ubuntu 13.04 both fail with this Python code. I tried the "convert" and "gs" commands you showed me (gs 9.07) and they work fine, but they are not what I need, since I am not familiar with the Ghostscript language. Do you think the problem is in wand? I need to convert hundreds of PDFs every day, and only a few of them come out like this. Could you please try this Python code? The wand version is 0.3.5. – cendy Dec 16 '13 at 12:47
  • If IM and GS themselves both produce a correct image on your system, then either the code or the library (wand) gives the incorrect result. Sorry, I don't know and don't use Python. I came to your question because of the `PDF` tag, and I'm interested in unusual PDFs :). Maybe someone will help. But you can call any system command from Python, including the command that converts PDF to JPG, so there is no need for wand if it's not working. I'm assuming this code is a fragment of your larger Python application. – user2846289 Dec 16 '13 at 16:38
  • Hi VadimR, I just found that the problem is the resize. If I only convert from PDF, it works well. However, if I resize first and then save the JPG file, the background comes out black. There are so many resize filters in IM that it's a headache to check which one would work correctly with my PDFs. – cendy Dec 18 '13 at 09:26

3 Answers

For others who still have this problem: after a couple of hours of googling and experimenting, I fixed it thanks to this answer, https://stackoverflow.com/a/40494320/2686243, by using these two lines:

img.background_color = Color("white")
img.alpha_channel = 'remove'

Tested with Wand version 0.4.4.
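
For context, here is a minimal sketch of how those two lines might fit into the conversion function from the question. The names `pdf_to_jpg` and `scaled_height` are made up for this illustration, and the Wand imports sit inside the function only so that the size helper can be tried without Wand installed. It assumes a Wand version that accepts `alpha_channel = 'remove'`, such as the 0.4.4 mentioned above.

```python
def scaled_height(src_width, src_height, dest_width):
    """Return the height that keeps the aspect ratio at dest_width (hypothetical helper)."""
    return int(round(src_height * float(dest_width) / src_width))


def pdf_to_jpg(source_file, target_file, dest_width, resolution=300):
    """Rasterize a PDF, flatten transparency onto white, resize, and save as JPEG."""
    # Imported here so scaled_height() can be used without Wand installed.
    from wand.color import Color
    from wand.image import Image

    with Image(filename=source_file, resolution=resolution) as img:
        # Flatten transparent areas onto white before anything else touches the pixels.
        img.background_color = Color("white")
        img.alpha_channel = "remove"
        img.resize(dest_width, scaled_height(img.width, img.height, dest_width))
        img.format = "jpeg"
        img.save(filename=target_file)
```

With the file names from the question, the call would look like `pdf_to_jpg("./02.pdf", "./02.jpg", 1895)`.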

Martin

I found the answer myself. The cause is the alpha channel. This PDF has a transparent background (which I noticed after converting it to PNG), and when resizing, ImageMagick picks the resize filter it considers best, so the background comes out black.

After a lot of experiments, I found that adding "img.alpha_channel=False" inside the "with" statement (before img.save()) makes it work properly.

Thanks to VadimR for the advice; it was helpful.
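
One plausible way to picture why transparency can end up black is sketched below in plain Python, without Wand. It assumes, purely for illustration, that a fully transparent pixel is stored as black with alpha 0. That is a common convention, but it has not been verified for this particular PDF. The function names are hypothetical.

```python
def drop_alpha(pixel):
    """Discard the alpha value and keep the stored RGB as-is."""
    return pixel[:3]


def flatten_over(pixel, background=(255, 255, 255)):
    """Composite an RGBA pixel (0-255 per channel) over an opaque background."""
    r, g, b, a = pixel
    alpha = a / 255.0
    return tuple(
        int(round(alpha * c + (1.0 - alpha) * bg))
        for c, bg in zip((r, g, b), background)
    )


# Assumption: a fully transparent pixel stored as black with alpha 0.
transparent = (0, 0, 0, 0)
print(drop_alpha(transparent))    # the stored black shows through
print(flatten_over(transparent))  # the white background shows through
```

JPEG has no alpha channel, so something has to happen to transparent pixels on the way out. Compositing them over white, rather than keeping their stored color, is what gives the expected white page.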

cendy

An easy solution is to change the order of the commands: set the format to JPEG first, and then resize.

        img.format = 'jpeg'
        img.resize(dest_width, int(ratio * img.height))

It is also easy to open the PDF at exactly the desired size by choosing the resolution tuple accordingly, because the resolution can be a float.
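
A rough sketch of that calculation, assuming the page width in PDF points (1/72 inch) is already known from some other source. The function name `resolution_for_width` is made up for this example.

```python
POINTS_PER_INCH = 72.0  # PDF user space unit: 1 point = 1/72 inch


def resolution_for_width(page_width_pt, dest_width_px):
    """Return the DPI at which a page of page_width_pt points is dest_width_px pixels wide."""
    return dest_width_px * POINTS_PER_INCH / page_width_pt


# Example: a US Letter page is 612 pt (8.5 in) wide.
dpi = resolution_for_width(612, 1700)
print(dpi)
```

The result could then be passed as `resolution=(dpi, dpi)` when opening the file, so that no separate resize step is needed. The rasterizer may still round the final pixel size by a pixel or so.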

hynekcer