I send reports to a system that requires them to be in a readable PDF format. I tried the free libraries and applications I could find, and the only thing that worked was Adobe's Acrobat family.

I wrote a quick Python script that uses win32api to print a PDF to my printer with the default registered application (Acrobat Reader 9) and then kills the task on completion, since Acrobat likes to leave its window open when called from the command line.

I compiled it into an executable and pass the values in on the command line (for example, `printer.exe %OUTFILE% %PRINTER%`). The executable is then called from a batch file:

import os,sys,win32api,win32print,time

# Command Line Arguments.  
pdf = sys.argv[1]
tempprinter = sys.argv[2]

# Get Current Default Printer.  
currentprinter = win32print.GetDefaultPrinter()
# Set Default printer to printer passed through command line.  
win32print.SetDefaultPrinter(tempprinter)
# Print PDF using default application, AcroRd32.exe
win32api.ShellExecute(0, "print", pdf, None, ".", 0)
# Reset Default Printer to saved value
win32print.SetDefaultPrinter(currentprinter)
# Timer for application close
time.sleep(2)
# Kill application and exit script
os.system("taskkill /im AcroRd32.exe /f")

This seemed to work well for a large volume (~2000 reports in a 3-4 hour period), but some print jobs drop off, and I'm not sure whether the script is getting overwhelmed or whether I should look into multithreading or something else.

The fact that it handles most of such a large volume leads me to believe that the issue is not with the script, but I'm not sure whether it's an issue with the host system, Adobe Reader, or something else.

Any suggestions or opinions would be greatly appreciated.

dwtorres
  • Is `win32api.ShellExecute()` synchronous? I.e. does it wait until Acrobat Reader has finished printing? – Aaron Digulla Mar 13 '12 at 16:34
  • I don't believe so, according to the documentation; that is why I set the timer. I can always extend the delay. I thought about spawning processes so that I could tell when the application finishes. The problem is that I don't know when the print job actually prints on the printer. It varies with network traffic and with how long the print command takes to process. – dwtorres Mar 13 '12 at 21:29
  • I'm a bit confused by your first sentence; what exactly doesn't work with the "free libraries and applications"? Is the printed PDF unreadable or doesn't it print at all or do they fail to create correct PDF? – Aaron Digulla Mar 14 '12 at 08:32
  • Another way of creating pdf from Python is converting HTML to PDF by using Pisa (www.xhtml2pdf.com/) – Don Mar 14 '12 at 08:35
  • Aaron, it's the header information and how the PRN/PCL file looks when opened. The free libraries all produced encoded output, at least in the short time I had to research them. Acrobat's print files were not encoded, so regex functions could be written to extract specific data and grab the reports' demographic data. I would much rather use Ghostscript and gsprint, but the print job file was not in a format the regexes could work with. – dwtorres Mar 15 '12 at 16:15
  • Is Adobe or some other software mandatory to print a PDF file using win32api.ShellExecute(0, "print", pdf, None, ".", 0)? – Jisson May 29 '20 at 10:38

1 Answer

Based on your feedback (win32api.ShellExecute() is probably not synchronous), your problem is the timeout: If your computer or the print queue is busy, the kill command can arrive too early.

If your script runs concurrently (i.e. you print all documents at once instead of one after the other), the kill command could even kill the wrong process (i.e. an acrobat process started by another invocation of the script).
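One way to avoid killing the wrong process is to keep a handle on the process you started and stop only that one, instead of killing by image name. The following is a minimal sketch of that idea using `subprocess` from the standard library; the helper name `run_then_terminate` is made up for illustration. It assumes the viewer is launched directly as a command rather than through `ShellExecute`, and it does not cover the case where the viewer hands the document to an already running instance and exits immediately.

```python
import subprocess
import sys


def run_then_terminate(command, wait_seconds):
    """Start `command`, give it `wait_seconds` to exit, then stop only that process.

    Returns the exit code of the process that was started.
    """
    proc = subprocess.Popen(command)
    try:
        # Returns early if the process exits on its own.
        proc.wait(timeout=wait_seconds)
    except subprocess.TimeoutExpired:
        # Only the process we launched is stopped; other instances are untouched.
        proc.terminate()
        proc.wait()
    return proc.returncode
```

For example, `run_then_terminate([sys.executable, "-c", "import time; time.sleep(60)"], 2)` starts a stand-in process and stops it after two seconds. The actual command line for Acrobat Reader is left out here because the install path and switches depend on the version.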

So what you need is better synchronization. There are a couple of things you can try:

  1. Convert this into a server script which starts Acrobat once, then sends many print commands to the same process and terminates afterwards.

  2. Use a global lock to make sure that only a single instance of the script is ever running. I suggest creating a folder somewhere; creating a folder is an atomic operation on every file system. If the folder exists, the script is active somewhere.
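The folder-based lock from option 2 could look roughly like this. It is a sketch, not a hardened implementation: the helper name `directory_lock`, the timeout, and the polling interval are all arbitrary choices, and a lock folder left behind by a crashed script has to be removed by hand.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def directory_lock(path, timeout=60.0, poll=0.5):
    """Hold an exclusive lock by creating the directory `path`.

    os.mkdir either creates the directory or raises FileExistsError,
    so only one process can succeed at a time.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.mkdir(path)
            break
        except FileExistsError:
            # Another instance holds the lock; retry until the deadline.
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire lock {path!r}")
            time.sleep(poll)
    try:
        yield
    finally:
        # Release the lock so the next instance can proceed.
        os.rmdir(path)
```

The printing part of the script would then run inside `with directory_lock(lock_path):`, where `lock_path` is a location that every invocation agrees on, for example a folder name inside the system temp directory.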

On top of that, you need to know when the job is finished. Use win32print.EnumJobs() for this.
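A polling loop around `win32print.EnumJobs()` might be sketched as follows. It waits until a job whose document name contains a given string has shown up in the queue and then disappeared again. Two things are assumptions here: that the viewer names the spooled job after the file, and that leaving the Windows queue is a good enough signal for "printed". The helper names are made up, and a job that appears and vanishes between two polls would be missed and reported as a timeout.

```python
import time


def list_job_documents(printer_name):
    """Return the document names currently queued on `printer_name` (Windows, pywin32)."""
    import win32print  # imported here so the module also loads on non-Windows hosts

    handle = win32print.OpenPrinter(printer_name)
    try:
        # Level 1 returns one dict per job; "pDocument" holds the document name.
        jobs = win32print.EnumJobs(handle, 0, 999, 1)
        return [job["pDocument"] or "" for job in jobs]
    finally:
        win32print.ClosePrinter(handle)


def wait_for_job(list_documents, match, timeout=120.0, poll=1.0, sleep=time.sleep):
    """Wait until a job containing `match` has entered and then left the queue.

    `list_documents` is any callable returning the queued document names.
    Returns True when the job was seen and is gone, False on timeout.
    """
    deadline = time.monotonic() + timeout
    seen = False
    while time.monotonic() < deadline:
        present = any(match in name for name in list_documents())
        if present:
            seen = True
        elif seen:
            return True
        sleep(poll)
    return False
```

In the script from the question, this could be called as `wait_for_job(lambda: list_job_documents(tempprinter), os.path.basename(pdf))` in place of the fixed `time.sleep(2)`, before the viewer is closed.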

If that fails, another solution could be to install a Linux server somewhere. You can run a Python server on this box which accepts print jobs that you send with the help of a small Python script on your client machine. The server can then print the PDFs for you in the background.

This approach allows you to add any kind of monitoring you like (sending a mail if something fails, or a status mail after all jobs have finished).
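As a rough illustration of the server idea, the sketch below accepts a PDF via HTTP POST, writes it to a temporary file, and hands it to a print function. Everything specific in it is an assumption: the use of HTTP at all, the helper names, and printing through the CUPS `lp` command to the default printer. It has no authentication or queuing, so treat it as a starting point only.

```python
import os
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer


def print_with_lp(path):
    """Submit the file to the default CUPS printer (assumes `lp` is installed)."""
    subprocess.run(["lp", path], check=True)


def make_handler(print_file):
    """Build a request handler that passes each uploaded file to `print_file`."""

    class PrintHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            data = self.rfile.read(length)
            # Save the upload so the print command can read it from disk.
            with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
                tmp.write(data)
            try:
                print_file(tmp.name)
                self.send_response(200)
            except Exception:
                # This is the place to hook in monitoring, e.g. sending a mail.
                self.send_response(500)
            finally:
                os.unlink(tmp.name)
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return PrintHandler


def serve(host, port, print_file=print_with_lp):
    """Run the print server until interrupted."""
    HTTPServer((host, port), make_handler(print_file)).serve_forever()
```

On the Linux box, `serve("0.0.0.0", 8631)` would start the server (the port number is arbitrary). The client script then only needs to POST the PDF bytes to that address, for example with `urllib.request`.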

Aaron Digulla