2

I'm trying to convert some pdf files to jpg through Wand in Python:

from wand.image import Image as Img
from wand.color import Color

    def importPdf(self):
        filename, _ = QtWidgets.QFileDialog.getOpenFileName(self, "Open File",
                                                            QtCore.QDir.currentPath())
        print(filename)
        if not filename:
            print('error')
            return
        with Img(filename=filename,format='jpeg', resolution=300) as image:
            image.compression_quality = 99
            image.save(filename='file.jpeg')
            self.open_picture()

My problem is that the result is a black image. The conversion works fine with PNG, but then I cannot run OCR (via Tesseract) on the PNG. I think the cause is some kind of transparent layer, but I have not found a way to remove it, even though I tried several things, such as setting

image.alpha_channel = False  # also tried True
image.background_color = Color('White')

before saving the file. I'm using ImageMagick 6.9, because version 7 fails with Wand.

fmw42
Sylvain Page
    JPG does not support transparency. So if you have it in your input file, it will be removed when saving to JPG. If you must have JPG output, then you need to flatten your result against some color before saving to JPG. background_color is a setting. It has no operator to apply it. You must add the background color and then use the equivalent of flatten to apply the background color – fmw42 Sep 29 '17 at 16:47

2 Answers

4

I had the same problem and fixed it; see my answer here: https://stackoverflow.com/a/46612049/2686243

Adding

image.background_color = Color("white")
image.alpha_channel = 'remove'

solved the issue.
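For context, here is a minimal sketch of how those two lines might fit into a complete conversion. The helper name `pdf_page_to_jpeg` and its parameters are hypothetical, and reading a single page with the `[0]` suffix is an assumption made to keep the example small; the original code opens the whole file. It also assumes an ImageMagick build recent enough to support the 'remove' alpha option.

```python
def pdf_page_to_jpeg(pdf_path, jpeg_path, resolution=300, page=0):
    """Render one PDF page to a JPEG flattened onto white (hypothetical helper)."""
    # Imported here so this module still loads when Wand is not installed.
    from wand.color import Color
    from wand.image import Image

    # "file.pdf[0]" is ImageMagick's syntax for reading a single page.
    source = "{}[{}]".format(pdf_path, page)
    with Image(filename=source, resolution=resolution) as image:
        # Set the background first, then remove the alpha channel so the
        # transparent areas are composited onto that background.
        image.background_color = Color("white")
        image.alpha_channel = "remove"
        image.format = "jpeg"
        image.compression_quality = 99
        image.save(filename=jpeg_path)
```

Calling `pdf_page_to_jpeg("input.pdf", "file.jpeg")` should then write a JPEG with a white background instead of a black one, which is usually what Tesseract needs.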

Martin
0

Because I could not find an equivalent of -flatten in the Wand API, I ended up calling ImageMagick's convert.exe through os.system. It does the job.

import os

cmd = "convert -units PixelsPerInch -density 300 -background white -flatten " + filename + " converted_pdf.jpg"
print(cmd)
os.system(cmd)
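A possible variation, sketched here rather than taken from the answer: passing the arguments as a list to `subprocess.run` avoids building a shell string, so a filename containing spaces does not break the command. The function names `build_convert_command` and `convert_pdf` are hypothetical; the flags and their order mirror the command above, and `convert` is assumed to be ImageMagick's executable on the PATH.

```python
import subprocess


def build_convert_command(pdf_path, jpeg_path="converted_pdf.jpg", density=300):
    """Return the ImageMagick argument list (hypothetical helper)."""
    return [
        "convert",
        "-units", "PixelsPerInch",
        "-density", str(density),
        "-background", "white",
        "-flatten",
        pdf_path,
        jpeg_path,
    ]


def convert_pdf(pdf_path, jpeg_path="converted_pdf.jpg"):
    """Run the conversion; raises CalledProcessError if convert fails."""
    command = build_convert_command(pdf_path, jpeg_path)
    print(command)
    # No shell is involved, so each list item reaches convert as one argument.
    subprocess.run(command, check=True)
```

With `check=True`, a failed conversion raises an exception instead of passing silently, which `os.system` would only report through its return code.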
Sylvain Page