NodeJS: Merge two PDF files into one using the buffer obtained by reading them

Question

I am using the fill-pdf npm module to fill template PDFs. It creates a new file, which is read from disk and returned as a buffer to a callback. I do the same operation for two files, and I want to combine the two buffers into a single PDF file that I can send back to the client. I tried different methods of buffer concatenation. The buffers can be concatenated using Buffer.concat, like this:

var newBuffer = Buffer.concat([result_pdf.output, result_pdf_new.output]);

The size of the new buffer is the sum of the sizes of the input buffers. But when newBuffer is sent to the client as the response, the client shows only the file listed last in the array.

res.type("application/pdf");
return res.send(newBuffer);

Any ideas?
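Some background on why plain concatenation shows only one document: a PDF reader normally starts from the end of the file, where a `startxref` entry points at the cross-reference table, so with two complete files glued together it typically follows only the last file's structure. The sketch below is illustrative only. The two byte strings are made-up stand-ins, not valid PDFs, and `countMarker` is a hypothetical helper. It shows that the concatenated buffer still contains two separate `%PDF-` headers and two `%%EOF` trailers rather than one merged document.

```javascript
// Count how many times a marker string occurs in a Buffer.
function countMarker(buffer, marker) {
  const needle = Buffer.from(marker, 'latin1');
  let count = 0;
  let pos = buffer.indexOf(needle);
  while (pos !== -1) {
    count += 1;
    pos = buffer.indexOf(needle, pos + needle.length);
  }
  return count;
}

// Stand-ins for two PDF files: only the framing markers are realistic.
const fakePdfA = Buffer.from('%PDF-1.4\n(document A)\nstartxref\n0\n%%EOF\n', 'latin1');
const fakePdfB = Buffer.from('%PDF-1.4\n(document B)\nstartxref\n0\n%%EOF\n', 'latin1');

const joined = Buffer.concat([fakePdfA, fakePdfB]);

console.log(joined.length === fakePdfA.length + fakePdfB.length); // true
console.log(countMarker(joined, '%PDF-')); // 2
console.log(countMarker(joined, '%%EOF')); // 2
```

A real merge therefore has to parse both documents and write one new file with a single page tree and cross-reference table, which is what a PDF library does.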

Possible duplicate of [Merging PDFs in Node](https://stackoverflow.com/questions/41101395/merging-pdfs-in-node) — rouan, Aug 08 '17 at 12:11

Pankaj Shinde · Answer 1 · 2020-08-05T14:49:49.597

As mentioned by @MechaCode, the creator has ended support for HummusJS.

So I would like to give you 2 solutions.

Using node-pdftk npm module

The following sample code uses the node-pdftk npm module to combine two PDF buffers.

const fs = require('fs');
const pdftk = require('node-pdftk');

const pdfBuffer1 = fs.readFileSync('./pdf1.pdf');
const pdfBuffer2 = fs.readFileSync('./pdf2.pdf');

pdftk
    .input([pdfBuffer1, pdfBuffer2])
    .output()
    .then(buf => {
        // buf is a Buffer holding the merged PDF
        fs.writeFile('merged.pdf', buf, err => {
            if (err) return console.error('could not write merged.pdf:', err);
            console.log('wrote the file successfully');
        });
    })
    .catch(err => {
        console.error('merge failed:', err);
    });

The node-pdftk npm module requires the PDFtk library to be installed on the system. Some of you may find this overhead tedious, so here is another solution using the pdf-lib library.

Using pdf-lib npm module

const fs = require('fs');
const { PDFDocument } = require('pdf-lib');

const pdfBuffer1 = fs.readFileSync('./pdf1.pdf');
const pdfBuffer2 = fs.readFileSync('./pdf2.pdf');

const pdfsToMerge = [pdfBuffer1, pdfBuffer2];

// In CommonJS, await is only valid inside an async function, so wrap the merge
async function mergePdfs() {
    const mergedPdf = await PDFDocument.create();
    for (const pdfBytes of pdfsToMerge) {
        const pdf = await PDFDocument.load(pdfBytes);
        const copiedPages = await mergedPdf.copyPages(pdf, pdf.getPageIndices());
        copiedPages.forEach((page) => {
            mergedPdf.addPage(page);
        });
    }

    const buf = await mergedPdf.save();        // Uint8Array

    fs.writeFile('merged.pdf', buf, (err) => {
        if (err) return console.error('could not write merged.pdf:', err);
        console.log('wrote the file successfully');
    });
}

mergePdfs().catch((err) => console.error('merge failed:', err));

Personally I prefer to use pdf-lib npm module.
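Since the question asks for a response body rather than a file on disk, one detail may matter: `mergedPdf.save()` resolves to a Uint8Array, not a Buffer. As far as I know, Express 4's `res.send` only treats real Buffers as binary, so wrapping the bytes first is the safer route. A minimal sketch using only Node built-ins follows. `toBuffer` is a hypothetical helper name, and the sample bytes stand in for the saved PDF.

```javascript
// Wrap a Uint8Array in a Buffer without copying the underlying bytes.
function toBuffer(bytes) {
  return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}

// Stand-in for the Uint8Array returned by mergedPdf.save()
const saved = new Uint8Array([0x25, 0x50, 0x44, 0x46]); // the bytes of "%PDF"
const body = toBuffer(saved);

console.log(Buffer.isBuffer(saved));  // false
console.log(Buffer.isBuffer(body));   // true
console.log(body.toString('latin1')); // %PDF

// In an Express handler the result could then be sent as:
//   res.type('application/pdf');
//   res.send(body);
```

Because the Buffer shares memory with the original array, this avoids a second copy of a potentially large document.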

Arguably this is a better example for using PDF-LIB https://github.com/Hopding/pdf-lib#copy-pages — Richard Oliver Bray, Feb 15 '21 at 17:27

score 32 · Accepted Answer · answered Apr 27 '18 at 13:32

HummusJS supports combining PDFs using its appendPDFPagesFromPDF method

Example using streams to work with buffers:

const hummus = require('hummus');
const memoryStreams = require('memory-streams');

/**
 * Concatenate two PDFs in Buffers
 * @param {Buffer} firstBuffer
 * @param {Buffer} secondBuffer
 * @returns {Buffer} - a Buffer containing the concatenated PDFs
 */
const combinePDFBuffers = (firstBuffer, secondBuffer) => {
    var outStream = new memoryStreams.WritableStream();

    try {
        var firstPDFStream = new hummus.PDFRStreamForBuffer(firstBuffer);
        var secondPDFStream = new hummus.PDFRStreamForBuffer(secondBuffer);

        var pdfWriter = hummus.createWriterToModify(firstPDFStream, new hummus.PDFStreamForResponse(outStream));
        pdfWriter.appendPDFPagesFromPDF(secondPDFStream);
        pdfWriter.end();
        var newBuffer = outStream.toBuffer();
        outStream.end();

        return newBuffer;
    }
    catch(e){
        outStream.end();
        throw new Error('Error during PDF combination: ' + e.message);
    }
};

combinePDFBuffers(PDFBuffer1, PDFBuffer2);

Zach Esposito


Usage please, what is the type of PDFBuffer1, PDFBuffer2 – M.Abulsoud Apr 11 '19 at 10:08
@M.Abulsoud They are both [Buffer](https://nodejs.org/api/buffer.html#buffer_buffer)s filled with binary PDF data. In my case I created the buffers using the [page.pdf()](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagepdfoptions) method of [Puppeteer](https://github.com/GoogleChrome/puppeteer) – Zach Esposito Apr 14 '19 at 19:34
Here is an example of merging pdfs from the author: https://github.com/galkahana/HummusJS/blob/master/tests/MergePDFPages.js – Govind Rai Apr 19 '19 at 17:03
I'm trying to use this in a React component, but it seems impossible. Am I missing something? – Gioce90 Jun 30 '19 at 21:47
@ZachEsposito Puppeteer's page.pdf() returns a Buffer – bentael Oct 08 '19 at 16:15
I'm getting files from firebase cloud storage like that `const [files] = await storage.bucket(bucket).getFiles(options);` Can I iterate through this array to merge these array items? What are they? Buffers? – Madcap Jan 31 '20 at 23:27

Hugh Secker-Walker · Answer 3 · 2019-07-16T16:43:09.270

Here's what we use in our Express server to merge a list of PDF blobs.

const { PDFRStreamForBuffer, createWriterToModify, PDFStreamForResponse } = require('hummus');
const { WritableStream } = require('memory-streams');

// Merge the pages of the pdfBlobs (Javascript buffers) into a single PDF blob
const mergePdfs = pdfBlobs => {
  if (pdfBlobs.length === 0) throw new Error('mergePdfs called with empty list of PDF blobs');
  // This optimization is not necessary, but it avoids the churn down below
  if (pdfBlobs.length === 1) return pdfBlobs[0];

  // Adapted from: https://stackoverflow.com/questions/36766234/nodejs-merge-two-pdf-files-into-one-using-the-buffer-obtained-by-reading-them?answertab=active#tab-top
  // Hummus is useful, but with poor interfaces -- E.g. createWriterToModify shouldn't require any PDF stream
  // And Hummus has many Issues: https://github.com/galkahana/HummusJS/issues
  const [firstPdfRStream, ...restPdfRStreams] = pdfBlobs.map(pdfBlob => new PDFRStreamForBuffer(pdfBlob));
  const outStream = new WritableStream();
  const pdfWriter = createWriterToModify(firstPdfRStream, new PDFStreamForResponse(outStream));
  restPdfRStreams.forEach(pdfRStream => pdfWriter.appendPDFPagesFromPDF(pdfRStream));
  pdfWriter.end();
  outStream.end();
  return outStream.toBuffer();
};

module.exports = exports = {
  mergePdfs,
};

Works like a charm!, if you need to merge base64 representation of pdfs into a single one (without the usage of files) then you need to pass the pdfBlobs as this `Buffer.from(base64String, 'base64')` — Rodrigo García, Feb 19 '22 at 01:51
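The base64 case from the comment above can be sketched with Node built-ins only. `decodePdfs` is a hypothetical helper name, and the sample strings are short placeholders rather than real PDFs. The resulting array of Buffers is what would be passed to a merge function such as `mergePdfs`.

```javascript
// Decode a list of base64-encoded PDFs into Buffers.
function decodePdfs(base64Strings) {
  return base64Strings.map(s => Buffer.from(s, 'base64'));
}

// Placeholder payloads: base64 of "%PDF-a" and "%PDF-b" (not real PDFs).
const encoded = [
  Buffer.from('%PDF-a', 'latin1').toString('base64'),
  Buffer.from('%PDF-b', 'latin1').toString('base64'),
];

const pdfBlobs = decodePdfs(encoded);

console.log(pdfBlobs.every(b => Buffer.isBuffer(b))); // true
console.log(pdfBlobs[0].toString('latin1'));          // %PDF-a

// pdfBlobs could now be handed to mergePdfs(pdfBlobs).
```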
