68

I have two PDF or PostScript files (I can work with either one). What I want to do is merge each page on top of the other, so that page 1 of document A is combined with page 1 of document B to produce page 1 of the output document. This isn't something I necessarily need to do programmatically, although that would be helpful.

Any ideas?

JohnnyLambada
  • 12,700
  • 11
  • 57
  • 61
  • See a [similar question](http://stackoverflow.com/questions/310441/create-two-pdfs-from-one-ps-file) here on Stackoverflow and why this is difficult. – Christian Lindig Feb 02 '09 at 19:39
  • 1
    that's not really related to this issue. OP says they can work directly with PDFs. It's not really difficult anyway. – danio Feb 04 '09 at 11:28

11 Answers

99

You can do this with PDF files using the command-line tool pdftk, with either the stamp or the background operation.

For example:

$ pdftk file1.pdf background file2.pdf output combinedfile.pdf

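The background operation uses only the first page of file2.pdf and applies it to every page of file1.pdf. If the second file has multiple pages that should be paired one-to-one with the pages of the first, use the multibackground operation instead.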
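For the "programmatically" part of the question, the pdftk call can be wrapped in a small script. The following is only a sketch: the helper name `pdftk_overlay_cmd` and its `per_page` parameter are made up for illustration, and it assumes pdftk is installed and on the PATH.

```python
import subprocess  # only needed if you actually run the command


def pdftk_overlay_cmd(top_pdf, bottom_pdf, out_pdf, per_page=True):
    """Build the pdftk argument list (hypothetical helper, not part of pdftk).

    per_page=True  -> multibackground: page N of bottom_pdf goes under page N
    per_page=False -> background: page 1 of bottom_pdf goes under every page
    """
    operation = "multibackground" if per_page else "background"
    return ["pdftk", top_pdf, operation, bottom_pdf, "output", out_pdf]


# To run it (assumes pdftk is installed):
# subprocess.run(pdftk_overlay_cmd("file1.pdf", "file2.pdf", "combinedfile.pdf"), check=True)
```

Passing the arguments as a list avoids shell quoting issues with unusual file names.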

bmb
  • 6,058
  • 2
  • 37
  • 58
  • 7
    Thanks, the background option worked for me :) Just to clarify, file1.pdf is placed above file2.pdf. Thanks! – AkiRoss Dec 29 '10 at 17:17
  • 1
    pdftk on HP-UX Itanium 11.31 ia64 fails to run. $(hostname):>pdftk PclConvertedToPDF.PDF stamp sourcePDFShifted.PDF output FinalPackList.pdf [HP ARIES32]: Core file for 32-bit PA-RISC application [HP ARIES32]: /usr/local/bin/pdftk saved to /core.pdftk Memory fault(coredump) Any idea of fixing this ? – MoG May 19 '15 at 17:45
  • 2
    The figures I'm trying to superimpose have a white background instead of a clear background... is there a way to ask this command to treat white as transparent? Thanks. – Mark Aug 06 '15 at 14:29
  • 1
    However new Ubuntu versions do not include pdftk, so you have to "pollute" your repositories with ppa's. – ilias iliadis Jul 23 '19 at 14:15
  • Note, you can place file2.pdf OVER file1.pdf by using the "stamp" command instead of the "background" command. – whiskeychief Jul 25 '19 at 21:37
  • 1
    @Mark you probably don't need this after 7 years, but for others finding this answer: PDFs do not have "backgrounds", only objects. Often a white rectangle is used as a "background", and when files are superimposed this way the rectangle from the second PDF covers all or some of the objects from the first PDF. Editing the combined file in something like LibreOffice Draw, one can fairly easily remove that unwanted object. – Davide Jul 04 '22 at 20:49
  • @Davide Any way to do that programmatically? I used Inkscape earlier to delete the object, but it is crashing with a double free error, and anyway the manual way is too bothersome. I just want to stamp page numbers. – Anonymous Guy Aug 27 '22 at 15:22
  • @AnonymousGuy to do it programmatically, you'd need to know something about those "pretend" background objects. Since each PDF file does its own thing depending on its creator (software and person), I never investigated how to do that, but I think PDFTK would have an option for it; just jump into the rabbit hole of its documentation and good luck! Please post here what you find (even if it's just "nothing", but hopefully it'll be the solution) – Davide Aug 28 '22 at 21:25
22

I had success solving this problem (PDF only and Python) by using pyPdf, specifically the mergePage operation.

From the docs:

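from pyPdf import PdfFileReader  # legacy pyPdf API

input1 = PdfFileReader(open("document1.pdf", "rb"))  # placeholder input file
# add page 4 from input1, but first add a watermark from another pdf:
page4 = input1.getPage(3)
watermark = PdfFileReader(open("watermark.pdf", "rb"))
page4.mergePage(watermark.getPage(0))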

Should be enough to get the idea.
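For the page-by-page merge the question asks about, here is a rough sketch using pypdf, the maintained successor of pyPdf, where the method is called `merge_page`. The helper names `merge_pairwise` and `overlay_files` are made up for illustration, and it assumes pypdf is installed (`pip install pypdf`); treat it as a starting point, not a tested recipe for every PDF.

```python
def merge_pairwise(base_pages, overlay_pages):
    """Draw overlay page i on top of base page i (modifies base pages in place).

    Stops at the shorter of the two sequences.
    """
    merged = []
    for base, overlay in zip(base_pages, overlay_pages):
        base.merge_page(overlay)  # overlay content is drawn over the base page
        merged.append(base)
    return merged


def overlay_files(path_a, path_b, out_path):
    """Combine page N of path_a with page N of path_b into out_path."""
    from pypdf import PdfReader, PdfWriter  # third-party, imported lazily

    reader_a = PdfReader(path_a)
    reader_b = PdfReader(path_b)
    writer = PdfWriter()
    # Copy A's pages into the writer first, then merge B's pages onto the copies.
    base_pages = [writer.add_page(page) for page in reader_a.pages]
    merge_pairwise(base_pages, reader_b.pages)
    with open(out_path, "wb") as handle:
        writer.write(handle)
```

Calling `overlay_files("a.pdf", "b.pdf", "out.pdf")` would put B's pages on top of A's; swap the arguments to reverse the stacking order.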

pi.
  • 21,112
  • 8
  • 38
  • 59
  • watermark.mergePage(page4) if you want the watermark behind the text. – Ale Aug 02 '12 at 15:32
  • 1
    This was how I started - I was not amused by the lengths PyPDF2 went to to merge pages. PDF page contents can be an array: make sure it is for the page you want rendered first, append all content streams of the page you want rendered after/on top of that. Handling "the boxes" is another interesting can of worms… – greybeard Jan 25 '18 at 21:39
8

Many of the answers here seem extremely out of date. This is especially so for the accepted answer, which uses pdftk. Pdftk is old (as of this edit, the latest release is from 2013), has terrible error handling, requires old versions of libraries, and is no longer packaged for Ubuntu. Nobody in 2022 should be using pdftk for anything if they can possibly avoid it.

Qpdf is a nice open-source program that does many of the same things as pdftk, is still maintained (as of this edit, the latest commit is from 4 days ago), and can do this task (using either --overlay or --underlay, depending on whether the second file should go on top of or beneath the first):

qpdf a.pdf --overlay b.pdf -- c.pdf
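If you want to drive this from a script, a thin wrapper is enough. This is a sketch only: `qpdf_layer_cmd` and its `on_top` parameter are hypothetical names, and it assumes qpdf is installed and on the PATH.

```python
import subprocess  # only needed if you actually run the command


def qpdf_layer_cmd(base_pdf, layer_pdf, out_pdf, on_top=True):
    """Build the qpdf argument list (hypothetical helper, not part of qpdf).

    on_top=True  -> --overlay: layer_pdf is drawn over base_pdf
    on_top=False -> --underlay: layer_pdf is drawn beneath base_pdf
    """
    flag = "--overlay" if on_top else "--underlay"
    return ["qpdf", base_pdf, flag, layer_pdf, "--", out_pdf]


# To run it (assumes qpdf is installed):
# subprocess.run(qpdf_layer_cmd("a.pdf", "b.pdf", "c.pdf"), check=True)
```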

[Edited to add] As a further example, here's a bit of Ruby code using the Prawn library that shows one way this could be used.

The idea is to generate numbered labels for printing onto some hypothetical label sheet (represented by boxes.pdf — actual dimensions are made up, but could be adjusted to fit an existing template if desired), and to have a version that shows the label dimensions visually, for on-screen review (boxednumbers.pdf), while another version (numbers.pdf) would have just the content to print onto an actual sheet of pre-cut labels.

(Note: Presumably in a real-life situation, boxes.pdf might not be generated, and instead be a downloaded template from a label manufacturer — and the qpdf command-line (as well as the dimensioning constants) would be adjusted accordingly, with the outer Prawn::Document.generate block being removed, paper size possibly being modified, etc.)

#!/usr/bin/env ruby

require 'prawn'
require 'prawn/measurement_extensions'

# various constants to define the size and shape of things:
range = 1..90
columns = 5
col_width = 35.mm
box_height = 15.mm
margin = 2.mm
radius = 2.mm

# this is calculated for now, but it could be set manually:
font_size = box_height - 2*margin

# one document for showing the boxes (could use an existing PDF):
Prawn::Document.generate('boxes.pdf', page_size: 'A4') do |boxes|
  # another that places numbers to be printed on top:
  Prawn::Document.generate('numbers.pdf', page_size: 'A4') do |numbers|
    range.each_with_index do |n, i|
      # x and y for columnar output:
      x = (i % columns) * col_width
      y = boxes.bounds.top - (i / columns) * box_height
      # sizes of the boxes inside the columns:
      width = col_width - margin
      height = box_height - margin
      # draw round rectangles in the boxes PDF:
      boxes.stroke do
        boxes.rounded_rectangle [x, y], width, height, radius
      end
      # draw numbers in the numbers PDF:
      numbers.bounding_box [x,y], width: width, height: height do
        numbers.text n.to_s, align: :center, valign: :center, size: font_size
      end
    end
  end
end

# and then ask qpdf to merge them, with the boxes as an underlay:
%x(qpdf --underlay boxes.pdf -- numbers.pdf boxednumbers.pdf)
lindes
  • 9,854
  • 3
  • 33
  • 45
user18021191
  • 81
  • 1
  • 1
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jan 25 '22 at 09:37
  • 1
    Hey OP here -- happy to review this and perhaps make it the accepted answer. But as the other comments say, it'd be good to give more context / detail. The questions I have are (1) You're saying `pdftk` is old and no longer supported, can you link to evidence of this? (2) Can you link to `qpdf` on GitHub? (3) (BONUS!) Can you link to an `a.pdf`, `b.pdf` and resulting `c.pdf`? I know no one has done (3), but for me that would seal the deal for accepted answer. Thank you and welcome to Stack Overflow!! – JohnnyLambada Jan 27 '22 at 17:50
  • It seems likely that the user who posted this answer has since left the site... but I agree this answer could be improved in the ways you discuss. I'm planning an edit that will hopefully meet your requirements (though I'll submit code for generating a, b, and c, rather than linking to them -- this is a coding-oriented site, after all. ;) – lindes Feb 02 '23 at 16:40
4

2022 onwards

PDFtk is still a strong contender for many users, with simple cross-platform command lines for many tasks.

PDFtk Server does not require Adobe Acrobat or Reader, and it runs on Windows, Mac OS X and Linux.

You will see several pdftk commands in the other answers. One drawback is that it needs to be licensed for distribution:

A commercial license is required to distribute PDFtk with your commercial product.

Another cross-platform solution, updated daily, is Ghostscript (though Windows binaries are often bi-annual), and it too requires a commercial license.

Its primary strength is the ability to merge nominated pages from both PostScript and PDF, along with other formats not handled by PDFtk. However, it does not profess to be the optimum PDF merge tool. Again, there are many well-documented ways in Stack Overflow answers to use that grandfather application for different combinational tasks, but it may require more complex approaches than a quick one-liner, and many users will combine its PDL or PS handling with PDFtk (above) for the final overlay.

So what is updated frequently, is available for commercial use, and does the task in a single call? The answer is QPDF (licensed under the Apache License, Version 2.0), which has both overlay and underlay options, along with too many other abilities to list, but is PDF only.

At its simplest, one method is to write a new page from two others, let's say just page 1, without changing either source. The downside is that with so many options, including passwords, the syntax can become complex; the good news is that it may be overhauled in the next version, 11. Also beware that it does not like file names with spaces.

First generate a new file, starting from an empty input, with page 1 from file1 (easy enough)

qpdf --empty --pages "file1.pdf" 1 -- "output.pdf"

Now overlay that with the first page of file2 (this is a bit more complicated)

qpdf output.pdf --overlay "file2.pdf" --from=1 -- --replace-input

The commands can be combined into one simpler line; for this example let's use page 2 of file1

qpdf --empty --pages "file1.pdf" 2 --  --overlay "file2.pdf" --from=1 -- "output2.pdf"

You will also need to be aware that mixing page sizes will work, though perhaps not as expected: the second page is centred on the first and reduced to fit.
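The combined one-line call can be generated from a script when the page numbers vary. The sketch below simply reproduces the argument order of the combined command; `qpdf_single_page_cmd` is a hypothetical helper name, and it assumes qpdf is installed.

```python
import subprocess  # only needed if you actually run the command


def qpdf_single_page_cmd(base_pdf, base_page, layer_pdf, layer_page, out_pdf):
    """Build a qpdf call that overlays one page of layer_pdf on one page of base_pdf."""
    return [
        "qpdf", "--empty",
        "--pages", base_pdf, str(base_page), "--",          # pick the base page
        "--overlay", layer_pdf, f"--from={layer_page}", "--",  # pick the overlay page
        out_pdf,
    ]


# To run it (assumes qpdf is installed):
# subprocess.run(qpdf_single_page_cmd("file1.pdf", 2, "file2.pdf", 1, "output2.pdf"), check=True)
```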


K J
  • 8,045
  • 3
  • 14
  • 36
  • Save my life @k-j . Additional information: Order of files are important. Based on 1st document scaling is applied. Example 1: If pdfs have same width and 1st pdf has bigger height, 2nd pdf is centred horizontally and vertically. Example 2: If 1st pdf is smaller (content wise) 2nd pdf will scale to mach 1st content. Hope this will help somebody. – fearis Nov 17 '22 at 19:29
2

PDFBox for Java provides an Overlay class that can merge PDFs this way. See this answer: Watermarking with PDFBox

Both PyPDF2 and PDFBox have been unreliable in my experience, but perhaps this is helpful for someone.

Community
  • 1
  • 1
Lenar Hoyt
  • 5,971
  • 6
  • 49
  • 59
2

If you're dealing only with PostScript, chances are the only 'page breaks' are the 'showpage' operators.
In that case you can simply grab the PostScript data from the beginning of file one up to the first instance of 'showpage', do the same with the other file, then concatenate these two chunks, keeping a single 'showpage' at the end, to create your new page.

If the 2 files are only one page, then you may be able to simply join the 2 files.
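A naive sketch of that idea follows. The function names are made up for illustration, and it assumes 'showpage' appears as a plain operator (not inside a string, comment, or procedure definition) and that the first page leaves the graphics state in a condition the second page can draw in; real-world PostScript often breaks those assumptions.

```python
import re


def first_page_body(ps_text):
    """Return everything before the first 'showpage' operator."""
    match = re.search(r"\bshowpage\b", ps_text)
    if match is None:
        raise ValueError("no showpage operator found")
    return ps_text[:match.start()]


def stack_first_pages(ps_a, ps_b):
    """Draw page 1 of ps_a, then page 1 of ps_b on top, and emit one page."""
    return first_page_body(ps_a) + "\n" + first_page_body(ps_b) + "\nshowpage\n"
```

The second file's header comments end up in the middle of the output, which PostScript interpreters generally treat as ordinary comments.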

Michael Galos
  • 1,065
  • 3
  • 13
  • 27
1

Aspose.Pdf.Kit with the PdfFileStamp class can do this, too. It works correctly most of the time.

Community
  • 1
  • 1
Uwe Keim
  • 39,551
  • 56
  • 175
  • 291
1

I used the Mac OS tool PDFClerk Pro. I imported the PDF pages, then merged them with the option "Merge Pages (Stacked)." It really impressed me.

J. B. Rainsberger
  • 1,193
  • 9
  • 23
0

For OS X there is PDF letterhead. It doesn't do anything other than overlay PDFs. https://itunes.apple.com/us/app/pdf-letterhead/id976548033?mt=12

mipmip
  • 1,072
  • 1
  • 10
  • 24
0

You could convert both PDFs into images and overlay one on top of the other, like layers.

Any suitable graphics library could be used for this.

The watermark suggestion above has great potential too, as long as you don't run into issues with your language or graphics/PDF library of choice.

Jas Panesar
  • 6,597
  • 3
  • 36
  • 47
  • 3
    It's definitely a possible workaround, but you'd lose the scalable quality of any vector graphics in the files. A process that maintains the higher-level contents of the image model is generally to be preferred. – luser droog Apr 01 '14 at 07:46
  • Definitely possible. If you're looking to render to print anyways, the first time the merge is done it can be done in high enough quality as a last resort. There should also be a way to take the vector elements on each page and instead merge them onto one page. – Jas Panesar Apr 01 '14 at 15:31
-1

VeryPDF PDF Editor has a PDF Overlay function; see this web page:

http://www.verypdf.com/wordpress/201304/how-to-overlay-pdf-to-another-pdf-35885.html

David
  • 1