3

I'm trying to turn PDF files with one or many pages into images for each page. This is very much like the question found here. In fact, I'm trying to use the code from @Idan Yacobi in that post to accomplish this. His code looks like this:

import ghostscript

def pdf2jpeg(pdf_input_path, jpeg_output_path):
    args = ["pdf2jpeg", # actual value doesn't matter
            "-dNOPAUSE",
            "-sDEVICE=jpeg",
            "-r144",
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]
    ghostscript.Ghostscript(*args)

When I run the code I get the following output from python: ##### 238647312 c_void_p(238647312L)

When I look at the folder where the new .jpg image is supposed to be created, there is a file there with the new name. However, when I attempt to open the file, the image preview says "Windows Photo Viewer can't open this picture because the picture is being edited in another program."

It seems that for some reason Ghostscript opened the file and wrote to it, but didn't close it after it was done. Is there any way I can force that to happen? Or, am I missing something else?

I already tried changing the last line above to the code below to explicitly close ghostscript after it was done.

GS = ghostscript.Ghostscript(*args)
GS.exit()
Jed
  • 1,823
  • 4
  • 20
  • 52
  • if `GS = ghostscript.Ghostscript(*args)` throws an exception your code never actually gets to `GS.exit()` (unless you use the `finally`) –  Dec 21 '18 at 14:16

5 Answers5

6

I was having the same problem where the image files were kept open but when I looked into the ghostscript init.py file (found in the following directory: PythonDirectory\Lib\site-packages\ghostscript__init__.py), the exit method has a line commented.

The gs.exit(self._instance) line is commented by default but when you uncomment the line, the image files are being closed.

def exit(self):
    global __instance__
    if self._initialized:
        print '#####', self._instance.value, __instance__
        if __instance__:
            gs.exit(self._instance) # uncomment this line
            self._instance = None
        self._initialized = False
2

I was having this same problem while batching a large number of pdfs, and I believe I've isolated the problem to an issue with the python bindings for Ghostscript, in that like you said, the image file is not properly closed. To bypass this, I had to go to using an os system call. so given your example, the function and call would be replaced with:

os.system("gs -dNOPAUSE -sDEVICE=jpeg -r144 -sOutputFile=" + jpeg_output_path + ' ' + pdf_input_path)

You may need to change "gs" to "gswin32c" or "gswin64c" depending on your operating system. This may not be the most elegant solution, but it fixed the problem on my end.

TheCharles
  • 21
  • 3
0

My work around was actually just to install an image printer and have Python print the PDF using the image printer instead, thus creating the desired jpeg image. Here's the code I used:

import win32api
def pdf_to_jpg(pdf_path):
    """
    Turn pdf into jpg image(s) using jpg printer
    :param pdf_path:  Path of the PDF file to be converted
    """

    # print pdf to jpg using jpg printer
    tempprinter = "ImagePrinter Pro"
    printer = '"%s"' % tempprinter
    win32api.ShellExecute(0, "printto", pdf_path, printer, ".", 0)
Jed
  • 1,823
  • 4
  • 20
  • 52
0

I was having the same problem when running into a password protected PDF - ghostscript would crash and not close the PDF preventing me from deleting the PDF.

Kishan's solution was already applied for me and therefore it wouldn't help my problem.

I fixed it by importing GhostscriptError and instantiating an empty Ghostscript before a try/finally block like so:

from ghostscript import GhostscriptError
from ghostscript import Ghostscript

...
# in my decryptPDF function
GS = Ghostscript()
try:
    GS = Ghostscript(*args)
finally:
    GS.exit()

...
# in my function that runs decryptPDF function
try:
    if PDFencrypted(append_file_path):
        decryptPDF(append_file_path)
except GhostscriptError:
    remove(append_file_path)
    # more code to log and handle the skipped file
    ... 
0

For those that stumble upon this with the same problem. I looked through the python ghostscript init file and discovered the ghostscript.cleanup() function/def.

Therefore, I was able to solve the problem by adding this simple one-liner to the end of my script [or the end of the loop].

ghostscript.cleanup()

Hope it helps someone else because it frustrated me for quite a while.

Lime
  • 1