89

I want to merge many PDF files into one using PDFBox and this is what I've done:

PDDocument document = new PDDocument();
for (String pdfFile: pdfFiles) {
    PDDocument part = PDDocument.load(pdfFile);
    List<PDPage> list = part.getDocumentCatalog().getAllPages();
    for (PDPage page: list) {
        document.addPage(page);
    }
    part.close();
}
document.save("merged.pdf");
document.close();

Where pdfFiles is an ArrayList<String> containing all the PDF files.

When I'm running the above, I'm always getting:

org.apache.pdfbox.exceptions.COSVisitorException: Bad file descriptor

Am I doing something wrong? Is there any other way of doing it?

Lipis
  • 21,388
  • 20
  • 94
  • 121
  • 1
    Somebody pointed out iText [http://java-x.blogspot.com/2006/11/merge-pdf-files-with-itext.html] and then deleted the answer. It worked and thanks for that. – Lipis Aug 27 '10 at 15:20
  • The [link](http://java-x.blogspot.de/2006/11/merge-pdf-files-with-itext.html) might help someone looking out for an answer. – java_learner Sep 27 '13 at 13:15

7 Answers7

157

Why not use the PDFMergerUtility of pdfbox?

PDFMergerUtility ut = new PDFMergerUtility();
ut.addSource(...);
ut.addSource(...);
ut.addSource(...);
ut.setDestinationFileName(...);
ut.mergeDocuments();
cherouvim
  • 31,725
  • 15
  • 104
  • 153
30

A quick Google search returned this bug: "Bad file descriptor while saving a document w. imported PDFs".

It looks like you need to keep the PDFs to be merged open, until after you have saved and closed the combined PDF.

Trinimon
  • 13,839
  • 9
  • 44
  • 60
Michael Lloyd Lee mlk
  • 14,561
  • 3
  • 44
  • 81
  • 5
    Even though the post was two years old, this solved the problem. You have to keep them open! – Lipis Aug 27 '10 at 15:28
16

This is a ready to use code, merging four pdf files with itext.jar from http://central.maven.org/maven2/com/itextpdf/itextpdf/5.5.0/itextpdf-5.5.0.jar, more on http://tutorialspointexamples.com/

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;

/**
 * This class is used to merge two or more 
 * existing pdf file using iText jar.
 */
public class PDFMerger {

static void mergePdfFiles(List<InputStream> inputPdfList,
        OutputStream outputStream) throws Exception{
    //Create document and pdfReader objects.
    Document document = new Document();
    List<PdfReader> readers = 
            new ArrayList<PdfReader>();
    int totalPages = 0;

    //Create pdf Iterator object using inputPdfList.
    Iterator<InputStream> pdfIterator = 
            inputPdfList.iterator();

    // Create reader list for the input pdf files.
    while (pdfIterator.hasNext()) {
            InputStream pdf = pdfIterator.next();
            PdfReader pdfReader = new PdfReader(pdf);
            readers.add(pdfReader);
            totalPages = totalPages + pdfReader.getNumberOfPages();
    }

    // Create writer for the outputStream
    PdfWriter writer = PdfWriter.getInstance(document, outputStream);

    //Open document.
    document.open();

    //Contain the pdf data.
    PdfContentByte pageContentByte = writer.getDirectContent();

    PdfImportedPage pdfImportedPage;
    int currentPdfReaderPage = 1;
    Iterator<PdfReader> iteratorPDFReader = readers.iterator();

    // Iterate and process the reader list.
    while (iteratorPDFReader.hasNext()) {
            PdfReader pdfReader = iteratorPDFReader.next();
            //Create page and add content.
            while (currentPdfReaderPage <= pdfReader.getNumberOfPages()) {
                  document.newPage();
                  pdfImportedPage = writer.getImportedPage(
                          pdfReader,currentPdfReaderPage);
                  pageContentByte.addTemplate(pdfImportedPage, 0, 0);
                  currentPdfReaderPage++;
            }
            currentPdfReaderPage = 1;
    }

    //Close document and outputStream.
    outputStream.flush();
    document.close();
    outputStream.close();

    System.out.println("Pdf files merged successfully.");
}

public static void main(String args[]){
    try {
        //Prepare input pdf file list as list of input stream.
        List<InputStream> inputPdfList = new ArrayList<InputStream>();
        inputPdfList.add(new FileInputStream("..\\pdf\\pdf_1.pdf"));
        inputPdfList.add(new FileInputStream("..\\pdf\\pdf_2.pdf"));
        inputPdfList.add(new FileInputStream("..\\pdf\\pdf_3.pdf"));
        inputPdfList.add(new FileInputStream("..\\pdf\\pdf_4.pdf"));


        //Prepare output stream for merged pdf file.
        OutputStream outputStream = 
                new FileOutputStream("..\\pdf\\MergeFile_1234.pdf");

        //call method to merge pdf files.
        mergePdfFiles(inputPdfList, outputStream);     
    } catch (Exception e) {
        e.printStackTrace();
    }
    }
}
benito
  • 655
  • 6
  • 12
  • 5
    The question clearly is tagged [tag:pdfbox]. You present a solution for [tag:itext]. Thus, your answer is off-topic. (That been said, your iText solution also is a bad one which iText developers usually recommend against as it drops interactive features and ignores rotation and page sizes.) – mkl Jan 09 '17 at 21:59
  • 2
    Then the title should be "How to merge two PDF files into one in Java with PdfBox" – Lluis Martinez Dec 11 '17 at 11:35
  • 3
    In addition iText comes with a nasty license. – Daniel Bo Jan 16 '18 at 15:54
  • This example is named IncorrectExample because this is not how you wou typically solve the problem of rotating pages, nor of merging documents. The correct way is NOT to use Document / PdfWriter, but to use PdfStamper, PdfCopy or PdfSmartCopy. However: in the question mentioned above, the circumstances are very particular and using this example in those circumstances is justified: https://developers.itextpdf.com/examples/merging-pdf-documents-itext5/how-not-merge-documents – victorpacheco3107 Mar 16 '18 at 16:23
  • @victorpacheco3107 what do you feel are the downsides to using `Document`/`PdfWriter` for merging? Even though I need to just make a new PDF copy of input PDF bytes (to solve the input bytes coming from a "corrupted" PDF), this sort of merging solution seemed to work for my purpose; but I will next try to refactor to your suggestion to use `PdfStamper`/`PdfCopy`/`PdfSmartCopy` instead, to try to be more correct as you say. I'm just curious what you think are the downsides to not doing that "correct" way; thanks – cellepo Jun 08 '22 at 00:58
  • Example using `PdfCopy` (I'm also curious why it doesn't need to open the PdfCopy): https://knpcode.com/java-programs/merging-pdfs-java-using-openpdf/ – cellepo Jun 09 '22 at 22:23
8

Multiple pdf merged method using org.apache.pdfbox:

public void mergePDFFiles(List<File> files,
                          String mergedFileName) {
    try {
        PDFMergerUtility pdfmerger = new PDFMergerUtility();
        for (File file : files) {
            PDDocument document = PDDocument.load(file);
            pdfmerger.setDestinationFileName(mergedFileName);
            pdfmerger.addSource(file);
            pdfmerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
            document.close();
        }
    } catch (IOException e) {
        logger.error("Error to merge files. Error: " + e.getMessage());
    }
}

From main program, call mergePDFFiles method using list of files and target file name.

        String mergedFileName = "Merged.pdf";
        mergePDFFiles(files, mergedFileName);

After calling mergePDFFiles, load merged file

        File mergedFile = new File(mergedFileName);
arifng
  • 726
  • 1
  • 12
  • 22
4
package article14;

import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.PDFMergerUtility;

public class Pdf
{
    public static void main(String args[])
    {
        new Pdf().createNew();
        new Pdf().combine();
        }

    public void combine()
    {
        try
        {
        PDFMergerUtility mergePdf = new PDFMergerUtility();
        String folder ="pdf";
        File _folder = new File(folder);
        File[] filesInFolder;
        filesInFolder = _folder.listFiles();
        for (File string : filesInFolder)
        {
            mergePdf.addSource(string);    
        }
    mergePdf.setDestinationFileName("Combined.pdf");
    mergePdf.mergeDocuments();
        }
        catch(Exception e)
        {

        }  
    }

public void createNew()
{
    PDDocument document = null;
    try
    {
        String filename="test.pdf";
        document=new PDDocument();
        PDPage blankPage = new PDPage();
        document.addPage( blankPage );
        document.save( filename );
    }
    catch(Exception e)
    {

    }
}

}
Sabapathy
  • 1,613
  • 3
  • 18
  • 30
2

If you want to combine two files where one overlays the other (example: document A is a template and document B has the text you want to put on the template), this works:

after creating "doc", you want to write your template (templateFile) on top of that -

   PDDocument watermarkDoc = PDDocument.load(getServletContext()
                .getRealPath(templateFile));
   Overlay overlay = new Overlay();

   overlay.overlay(watermarkDoc, doc);
Dave W
  • 41
  • 1
2

Using iText (existing PDF in bytes)

    public static byte[] mergePDF(List<byte[]> pdfFilesAsByteArray) throws DocumentException, IOException {

    ByteArrayOutputStream outStream = new ByteArrayOutputStream();
    Document document = null;
    PdfCopy writer = null;

    for (byte[] pdfByteArray : pdfFilesAsByteArray) {

        try {
            PdfReader reader = new PdfReader(pdfByteArray);
            int numberOfPages = reader.getNumberOfPages();

            if (document == null) {
                document = new Document(reader.getPageSizeWithRotation(1));
                writer = new PdfCopy(document, outStream); // new
                document.open();
            }
            PdfImportedPage page;
            for (int i = 0; i < numberOfPages;) {
                ++i;
                page = writer.getImportedPage(reader, i);
                writer.addPage(page);
            }
        }

        catch (Exception e) {
            e.printStackTrace();
        }

    }

    document.close();
    outStream.close();
    return outStream.toByteArray();

}
  • The OP quite clearly said *"I want to merge many PDF files into one using PDFBox"*. iText is not PDFBox. – mkl May 18 '18 at 21:13