156

I'm working on a project that takes some images from the user and then creates a PDF file containing all of them.

Is there a way or a tool to do this in Python? For example, to create a PDF file (or EPS, PS) from image 1 + image 2 + image 3?

Stephen T.
  • 47
    When in doubt, prefix whatever you are searching for by `py` ;-) – mjv Feb 12 '10 at 15:17
  • 8
    Another SO search trick: `[language or tag] some_keyword` as in `[python] PDF` or `[python] PDF image` – mjv Feb 12 '10 at 15:19
  • For those coming here using matplotlib: http://stackoverflow.com/questions/17788685/python-saving-multiple-figures-into-one-pdf-file – David Parks Sep 26 '16 at 19:05

14 Answers

181

Here is my experience after following the hints on this page.

  1. pyPDF can't embed images into files; it can only split and merge existing PDFs. (Source: Ctrl+F through its documentation page.) That is useful, but not if your images are not already embedded in a PDF.

  2. pyPDF2 doesn't seem to have any extra documentation on top of pyPDF.

  3. ReportLab is very extensive. (Userguide) However, with a bit of Ctrl+F and grepping through its source, I got this:

    • First, download the Windows installer and source
    • Then try this on Python command line:

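      from reportlab.pdfgen import canvas
      from reportlab.lib.units import cm

      c = canvas.Canvas('ex.pdf')
      # Draw ar.jpg with its lower-left corner at the page origin, sized 10 cm x 10 cm
      c.drawImage('ar.jpg', 0, 0, 10*cm, 10*cm)
      c.showPage()  # finish the current page
      c.save()      # write ex.pdf to disk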
      

All I needed was to get a bunch of images into a PDF, so that I could check how they look and print them. The above is sufficient to achieve that goal.

ReportLab is great, but would benefit from including helloworlds like the above prominently in its documentation.
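
For more than one image, the same calls can be repeated once per page. The following is only a sketch and is not part of the original answer: fit_within and images_to_pdf are hypothetical helper names, and it assumes an A4 page with each image scaled (up or down) to fill the page while keeping its aspect ratio.

```python
def fit_within(img_w, img_h, box_w, box_h):
    """Scale (img_w, img_h) to fit inside (box_w, box_h), keeping the aspect ratio."""
    scale = min(box_w / img_w, box_h / img_h)
    return img_w * scale, img_h * scale


def images_to_pdf(image_paths, pdf_path):
    """Write one image per page, centred on an A4 page."""
    # Imported here so fit_within stays usable without ReportLab installed.
    from reportlab.lib.pagesizes import A4
    from reportlab.lib.utils import ImageReader
    from reportlab.pdfgen import canvas

    page_w, page_h = A4  # page size in points
    c = canvas.Canvas(pdf_path, pagesize=A4)
    for path in image_paths:
        img = ImageReader(path)
        w, h = fit_within(*img.getSize(), page_w, page_h)
        # drawImage takes the lower-left corner, then width and height in points.
        c.drawImage(img, (page_w - w) / 2, (page_h - h) / 2, w, h)
        c.showPage()  # finish this page; the next drawImage starts a new one
    c.save()
```

Calling images_to_pdf(['a.jpg', 'b.jpg'], 'out.pdf') would then produce a two-page PDF, assuming both files exist and are in a format ReportLab can read.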

kynan
Evgeni Sergeev
  • 18
    I must say reportlab is the best for PDF generation that I have tried, definitely the most complete. However, it's also a bit more complicated. http://www.blog.pythonlibrary.org/2010/03/08/a-simple-step-by-step-reportlab-tutorial/ http://www.blog.pythonlibrary.org/2010/09/21/reportlab-tables-creating-tables-in-pdfs-with-python/ – Jose Salvatierra Jul 24 '13 at 16:27
  • 1
    This was exactly what i was looking for – Maarten May 20 '16 at 09:46
  • @JoseSalvatierra Thanks Jose...this is really easy. Thanks for the blog link. – Arindam Roychowdhury Feb 06 '19 at 11:49
  • PyPDF2 documentation: https://pypdf2.readthedocs.io/en/latest/ - you might want to update your answer :-) – Martin Thoma Dec 20 '22 at 17:59
  • If you are looking for a way to combine the power of Python and LaTeX for automated PDF generation, please see https://intuitivetutorial.com/2020/11/08/automated-pdf-creation/ – Sajil C K May 19 '23 at 03:44
42

I suggest Pdfkit. (installation guide)

It creates PDFs from HTML. I chose it to create PDFs in two steps from my Python Pyramid stack:

  1. Render the HTML server-side with Mako templates, using the style and markup you want for your PDF document
  2. Call pdfkit.from_string(...), passing the rendered HTML as a parameter

This way you get a PDF document with styling and image support.

You can install it as follows:

  • using pip

    pip install pdfkit

  • You will also need to install wkhtmltopdf (on Ubuntu).
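
The two steps can be sketched roughly as follows. This is an illustration, not the author's code: it uses plain string formatting instead of Mako, build_html and images_to_pdf are hypothetical names, and the enable-local-file-access option is an assumption that recent wkhtmltopdf builds need it before they will read image files from disk.

```python
from html import escape
from pathlib import Path


def build_html(image_paths):
    """Return a minimal HTML page with one <img> per path (step 1)."""
    tags = "\n".join(
        '<p><img src="{}" style="max-width:100%"></p>'.format(
            escape(Path(p).resolve().as_uri(), quote=True)
        )
        for p in image_paths
    )
    return "<html><body>\n{}\n</body></html>".format(tags)


def images_to_pdf(image_paths, pdf_path):
    """Render the HTML to a PDF file with pdfkit (step 2)."""
    import pdfkit  # needs the wkhtmltopdf binary to be installed

    # Assumption: local files must be allowed explicitly on newer wkhtmltopdf.
    options = {"enable-local-file-access": None}
    pdfkit.from_string(build_html(image_paths), pdf_path, options=options)
```

Keeping the HTML generation separate from the PDF call means the markup can be inspected or tested without wkhtmltopdf installed.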
Gonzalo Garcia
eton_ceb
  • 1
    From a unit test perspective this methodology seems superior as it allows for unit testing of the html (what is expected to be seen). Not sure if that is within the scope of the OP, but it is a good note. – Chris Collett Jan 08 '21 at 21:06
41

I suggest pyPdf. It works really well. I also wrote a blog post about it a while ago; you can find it here.

drevicko
Geo
15

fpdf works well for me. Much simpler than ReportLab and really free. Works with UTF-8.
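
A minimal sketch of putting one image on each page with the Python fpdf package might look like this. The helper names and the 10 mm margin are assumptions made for illustration; only FPDF, add_page, image and output come from the library.

```python
A4_WIDTH_MM = 210  # width of an A4 page in portrait orientation


def content_width_mm(margin_mm):
    """Width left for the image after equal left and right margins."""
    return A4_WIDTH_MM - 2 * margin_mm


def images_to_pdf(image_paths, pdf_path, margin_mm=10):
    """Place each image on its own A4 page, scaled to the content width."""
    from fpdf import FPDF  # pip install fpdf

    pdf = FPDF(orientation="P", unit="mm", format="A4")
    for path in image_paths:
        pdf.add_page()
        # With only w given, FPDF derives the height from the aspect ratio.
        # Very tall images may therefore run past the bottom of the page.
        pdf.image(path, x=margin_mm, y=margin_mm, w=content_width_mm(margin_mm))
    pdf.output(pdf_path)
```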

mfs
  • 3
    Link/Descrip.: http://www.fpdf.org/ FPDF is a PHP class which allows to generate PDF files with pure PHP, that is to say without using the PDFlib library. F from FPDF stands for Free: you may use it for any kind of usage and modify it to suit your needs. FPDF has other advantages: high level functions. Here is a list of its main features: Choice of measure unit, page format and margins, Page header and footer management, Automatic page break, Automatic line break and text justify, Image support (JPEG, PNG and GIF), Colors, Links, TrueType, Type1 and encoding support, Page compression – AnneTheAgile Oct 30 '14 at 16:05
  • 15
    Not very relevant considering the question was about Python, not PHP – KingRadical Jan 28 '15 at 00:20
  • 4
    why all this downvoting ? fpdf is available also for python. pip install fpdf works – user1981924 Mar 25 '18 at 21:46
  • 5
    fpdf might have started with PHP, but [there is a Python](https://pyfpdf.readthedocs.io/en/latest/Tutorial/index.html) port which works really well. So I think this is a very relevant answer which deserves more upvotes than downvotes. (I am not sure about the situation when this answer was initially posted) – Anubis Feb 06 '19 at 20:45
14

You can try this (Python for PDF Generation), or you can try PyQt, which supports printing to PDF.

Python for PDF Generation

The Portable Document Format (PDF) lets you create documents that look exactly the same on every platform. Sometimes a PDF document needs to be generated dynamically, however, and that can be quite a challenge. Fortunately, there are libraries that can help. This article examines one of those for Python.

Read more at http://www.devshed.com/c/a/Python/Python-for-PDF-Generation/#whoCFCPh3TAks368.99

gruszczy
11

Here is a solution that works with common scientific Python packages. matplotlib has a PDF backend that saves figures to PDF. You can create a figure with subplots, where each subplot is one of your images. You have full freedom to adjust the figure: add titles, change positions, etc. Once your figure is done, save it to PDF. Each call to savefig creates another page of the PDF.

Example below plots 2 images side-by-side, on page 1 and page 2.

import os

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages

files = ["Column0_Line16.jpg", "Column0_Line47.jpg"]

def plotImage(f):
    folder = "C:/temp/"
    # plt.imread stands in for scipy.misc.imread, which newer SciPy releases
    # no longer provide (reading JPEG this way requires Pillow)
    im = plt.imread(os.path.join(folder, f)).astype(np.float32) / 255
    plt.imshow(im)
    a = plt.gca()
    a.get_xaxis().set_visible(False)  # We don't need axis ticks
    a.get_yaxis().set_visible(False)

pp = PdfPages("c:/temp/page1.pdf")
plt.subplot(121)
plotImage(files[0])
plt.subplot(122)
plotImage(files[1])
pp.savefig(plt.gcf())  # This generates page 1
pp.savefig(plt.gcf())  # This generates page 2 (same figure again)
pp.close()
Anton Schwaighofer
10

rinohtype supports embedding PDF, PNG and JPEG images (natively) and other bitmap formats (when Pillow is installed).

(Full disclosure: I am the author of rinohtype)

Brecht Machiels
  • 2
    Hey! Correct me if I'm wrong, but it seems that it's quite a powerful tool and, unlike many, many others listed here, is not a Python wrapper for an ancient php/ruby/perl/pyqt4/other crap library. – Mikaelblomkvistsson Feb 12 '19 at 08:10
  • Do you have some examples of things that have been generated with this? Is it general-purpose or really only made for technical manuals? Can it be used for business reporting, for example? – wordsforthewise Jun 12 '21 at 01:10
  • @wordsforthewise For now just the rinohtype PDF manuals linked from the sidebar at http://www.mos6581.org/rinohtype/master/. It's mainly aimed at flowing a stream of content onto pages (versus placing text/images at specific coordinates). I guess business reporting falls under that use case? – Brecht Machiels Jun 24 '21 at 15:53
8

fpdf is available for Python too, and is widely used; see PyPI / pip search. It may have been renamed from pyfpdf to fpdf. From its feature list: PNG, GIF and JPG support (including transparency and alpha channel).

mirek
  • 2
    Your answer is unclear, but there's certainly a PyFPDF project: https://pyfpdf.readthedocs.io/en/latest/ – Wojciech Kaczmarek Jul 26 '16 at 14:34
  • All the confusion in the naming is a really a pity. This answer and the comment by @WojciechKaczmarek really deserve more upvotes and attention. PyFPDF is a python port of an often used PDF library originally written in PHP. – Ideogram May 07 '19 at 16:30
8

If you are familiar with LaTeX, you might want to consider pylatex.

One of the advantages of pylatex is that it is easy to control the image quality. The images in your PDF will be of the same quality as the original images. When using reportlab, I found that the images were automatically compressed and the image quality reduced.

The disadvantage of pylatex is that, since it is based on LaTeX, it can be hard to place images exactly where you want on the page. However, I have found that using the position argument in the Figure class, and sometimes Subfigure, gives good enough results.

Example code for creating a pdf with a single image:

from pylatex import Document, Figure

doc = Document(documentclass="article")
with doc.create(Figure(position='p')) as fig:
    fig.add_image('Lenna.png')

doc.generate_pdf('test', compiler='latexmk', compiler_args=["-pdf", "-pdflatex=pdflatex"], clean_tex=True)

In addition to installing pylatex (pip install pylatex), you need to install LaTeX. On Ubuntu and other Debian-based systems you can run sudo apt-get install texlive-full. If you are using Windows, I would recommend MiKTeX.

larsjr
7

I believe that matplotlib has the ability to serialize graphics, text and other objects to a pdf document.

Andrea
  • 1
    Yes, you can. [This SO answer](http://stackoverflow.com/a/12939210/420867) has some good links on how to do it. – drevicko Feb 14 '14 at 11:48
7

I have done this quite a bit in PyQt and it works very well. Qt has extensive support for images, fonts, styles, etc and all of those can be written out to pdf documents.

Allen
  • 2
    Wow, Qt looks amazing. They say they support 15 platforms, inc. Windows, Mac OS X, Linux, Android, iOS, Windows RT plus these Real-Time Operating Systems: INTEGRITY, QNX, VxWorks http://www.qt.io/qt-framework/ . And, since I'm a python fan, I like "PyQt combines all the advantages of Qt and Python. A programmer has all the power of Qt, but is able to exploit it with the simplicity of Python. " http://www.riverbankcomputing.co.uk/software/pyqt/intro – AnneTheAgile Oct 30 '14 at 16:28
  • @AnneTheAgile I can't help myself to comment with "*Qt IS amazing*". – Oak_3260548 Aug 13 '20 at 12:49
7

I use rst2pdf to create a pdf file, since I am more familiar with RST than with HTML. It supports embedding almost any kind of raster or vector images.

It requires reportlab, but I found reportlab is not so straightforward to use (at least for me).
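
As a rough sketch of this approach (not taken from the answer), an RST document with one image directive per file can be generated and passed to the rst2pdf command-line tool. build_rst and images_to_pdf are hypothetical names, and the 100% width is an arbitrary choice.

```python
import subprocess


def build_rst(image_paths, width="100%"):
    """Return RST source with one image directive per path."""
    blocks = [
        ".. image:: {}\n   :width: {}\n".format(path, width)
        for path in image_paths
    ]
    return "\n".join(blocks)


def images_to_pdf(image_paths, rst_path, pdf_path):
    """Write the RST file, then convert it with the rst2pdf command."""
    with open(rst_path, "w", encoding="utf-8") as fh:
        fh.write(build_rst(image_paths))
    # Assumes the rst2pdf executable is on PATH (pip install rst2pdf).
    subprocess.run(["rst2pdf", rst_path, "-o", pdf_path], check=True)
```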

maazza
ismailsunni
5

You can actually try xhtml2pdf http://flask.pocoo.org/snippets/68/

zeroc00l
3

It depends on what format your image files are in, but for a project here at work I used the tiff2pdf tool in LibTIFF from RemoteSensing.org. I simply used subprocess to call tiff2pdf.exe with the appropriate arguments to read the kind of TIFF I had and output the kind of PDF I wanted. If your images aren't TIFFs, you could probably convert them to TIFF using PIL, or find a tool more specific to your image type (or more generic, if the images will be diverse), like ReportLab mentioned above.
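
A minimal sketch of that workflow is shown below. The function names are hypothetical, the executable is assumed to be on PATH as tiff2pdf, and only the -o output option is used; the arguments actually needed for a particular kind of TIFF are not given in this answer.

```python
import subprocess


def tiff2pdf_command(tiff_path, pdf_path, exe="tiff2pdf"):
    """Build the argument list: tiff2pdf -o <output.pdf> <input.tif>."""
    return [exe, "-o", pdf_path, tiff_path]


def to_tiff(image_path, tiff_path):
    """Convert any image Pillow can read into a TIFF file."""
    from PIL import Image  # Pillow, the maintained fork of PIL

    with Image.open(image_path) as im:
        im.save(tiff_path, format="TIFF")


def tiff_to_pdf(tiff_path, pdf_path, exe="tiff2pdf"):
    """Run tiff2pdf and raise CalledProcessError if it fails."""
    subprocess.run(tiff2pdf_command(tiff_path, pdf_path, exe), check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.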

Tofystedeth