
I receive two PDFs, each as a byte array. So now I have 2 arrays, a[] and b[]. I concatenate them and save them to c[]. When I convert c[] to a PDF, only the 2nd file shows up. When I check the length of c[], it is len(a[]) + len(b[]).

I found other questions about this for different programming languages. They say that I can't just concatenate the arrays like this and that I need to use a PDF authoring library. Since I receive byte arrays to begin with, is there anything else that could work in my situation?

sudhansh_
  • They are right, just as concatenating the bytes of two images will not give a larger image. But PDF libraries can read a byte array, and the two resulting PDF documents can then be merged. Examples are usually easy to find. – Joop Eggen Dec 08 '20 at 16:49
  • You can use OpenPDF (https://github.com/LibrePDF/OpenPDF/wiki/Tutorial) to load both documents from their byte arrays and then merge the individual pages. If you can use third-party libraries, you can also use PDFBox, as in this question: https://stackoverflow.com/questions/37589590/merge-pdf-files-using-pdfbox – Akin Okegbile Dec 08 '20 at 16:51
  • We're not keeping the process real time anymore; we're changing it to a batch job, so that takes care of things. Thank you both for your time. I was going through OpenPDF when my lead realized we weren't required to do it in real time. – sudhansh_ Dec 08 '20 at 18:25

2 Answers

2

You can't just concatenate the byte arrays.
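One way to see why: a PDF is read from the end. A reader looks for the last `startxref` keyword near the end of the file and follows it to the cross-reference table, and the byte offsets stored in each document are relative to that document's own start. In a concatenated array the last `startxref` belongs to the second document, which would be consistent with only the second file showing up. Below is a minimal sketch using only the JDK. The two byte arrays are hypothetical stand-ins rather than valid PDFs, and `concat` and `lastIndexOf` are helpers written just for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class ConcatDemo {

    // Hypothetical helper: index of the last occurrence of needle in haystack, or -1.
    static int lastIndexOf(byte[] haystack, byte[] needle) {
        for (int i = haystack.length - needle.length; i >= 0; i--) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    // Plain byte-level concatenation, as described in the question.
    static byte[] concat(byte[] a, byte[] b) {
        byte[] c = new byte[a.length + b.length];
        System.arraycopy(a, 0, c, 0, a.length);
        System.arraycopy(b, 0, c, a.length, b.length);
        return c;
    }

    public static void main(String[] args) {
        // Stand-ins that only mimic the overall layout of a PDF file.
        byte[] a = "%PDF-1.4\n(first)\nstartxref\n9\n%%EOF\n".getBytes(StandardCharsets.US_ASCII);
        byte[] b = "%PDF-1.4\n(second)\nstartxref\n9\n%%EOF\n".getBytes(StandardCharsets.US_ASCII);
        byte[] c = concat(a, b);

        int pos = lastIndexOf(c, "startxref".getBytes(StandardCharsets.US_ASCII));
        System.out.println("length of c = " + c.length);
        System.out.println("last startxref falls inside b: " + (pos >= a.length));
    }
}
```

The length of `c` is the sum of both lengths, exactly as observed in the question, yet the only structure a reader starts from is the one at the tail.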

You can find a couple of solutions for merging PDF files in the question "How to merge two PDF files into one in Java?".

If you have the PDF files, you can just use the PDFMergerUtility class from PDFBox:

PDFMergerUtility ut = new PDFMergerUtility();
ut.addSource(...);
ut.addSource(...);
ut.addSource(...);
ut.setDestinationFileName(...);
ut.mergeDocuments();
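Since the question starts from byte arrays rather than files, the same utility can also be fed in-memory streams. The following is a minimal sketch, assuming PDFBox 2.x is on the classpath. The class name `ByteArrayPdfMerge` and the method `merge` are made up for this example, and the choice of `MemoryUsageSetting.setupMainMemoryOnly()` is an assumption that the documents are small enough to hold in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

public class ByteArrayPdfMerge {

    // Merges two PDFs held in memory and returns the merged PDF as a byte array.
    static byte[] merge(byte[] a, byte[] b) throws IOException {
        PDFMergerUtility ut = new PDFMergerUtility();

        // Sources are appended in the order they are added.
        ut.addSource(new ByteArrayInputStream(a));
        ut.addSource(new ByteArrayInputStream(b));

        // Write the result to memory instead of a destination file.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ut.setDestinationStream(out);

        // Keep all buffers in main memory (assumption: inputs are small).
        ut.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
        return out.toByteArray();
    }
}
```

With the arrays from the question, this would be called as `byte[] c = ByteArrayPdfMerge.merge(a, b);`.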

If you don't have the PDFs as files, you can use the other solution from that question, which uses iText and reads from input streams:

import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;

/**
 * This class is used to merge two or more
 * existing PDF files using the iText library.
 */
public class PDFMerger {

   static void mergePdfFiles(List<InputStream> inputPdfList,
                             OutputStream outputStream) throws Exception{
      //Create document and pdfReader objects.
      Document document = new Document();
      List<PdfReader> readers = 
              new ArrayList<PdfReader>();
      int totalPages = 0;

      //Create pdf Iterator object using inputPdfList.
      Iterator<InputStream> pdfIterator = 
          inputPdfList.iterator();

      // Create reader list for the input pdf files.
      while (pdfIterator.hasNext()) {
          InputStream pdf = pdfIterator.next();
          PdfReader pdfReader = new PdfReader(pdf);
          readers.add(pdfReader);
          totalPages = totalPages + pdfReader.getNumberOfPages();
      }

      // Create writer for the outputStream
      PdfWriter writer = PdfWriter.getInstance(document, outputStream);

      //Open document.
      document.open();

      //Contain the pdf data.
      PdfContentByte pageContentByte = writer.getDirectContent();

      PdfImportedPage pdfImportedPage;
      int currentPdfReaderPage = 1;
      Iterator<PdfReader> iteratorPDFReader = readers.iterator();

      // Iterate and process the reader list.
      while (iteratorPDFReader.hasNext()) {
        PdfReader pdfReader = iteratorPDFReader.next();
        //Create page and add content.
        while (currentPdfReaderPage <= pdfReader.getNumberOfPages()) {
              document.newPage();
              pdfImportedPage = 
              writer.getImportedPage(pdfReader,currentPdfReaderPage);
              pageContentByte.addTemplate(pdfImportedPage, 0, 0);
              currentPdfReaderPage++;
        }
        currentPdfReaderPage = 1;
     }

     //Close the document first so the end of the PDF is written, then close the stream.
     document.close();
     outputStream.close();

     System.out.println("Pdf files merged successfully.");
   }

}
Nenad
  • I spoke to my company about this. They're not looking at paying for software licenses at the moment. We're not keeping the process real time anymore, changing it to a batch job, so that takes care of things. Thank you for your time taken to answer this. – sudhansh_ Dec 08 '20 at 18:24
  • Hi Sudhansh, I'm in the same situation: I'm not able to concatenate two byte arrays into a third array and print a PDF from it. Any suggestions on how to achieve this? – Teja K May 17 '22 at 08:10
1

If anyone is still looking for a solution, try this:

// Suppose we want to append one PDF to another, main PDF.
// This uses the PDFBox 2.x API; in PDFBox 3.x, PDDocument.load was replaced by org.apache.pdfbox.Loader.loadPDF.

          InputStream is1 = null;

          if (file1 != null) {
                 // Read the whole file (assumes file1 is a java.io.File);
                 // a single InputStream.read() call is not guaranteed to fill the buffer.
                 byte[] file1Data = java.nio.file.Files.readAllBytes(file1.toPath());
                 is1 = new java.io.ByteArrayInputStream(file1Data);
          }

          InputStream mainContent = <your main content>;

          org.apache.pdfbox.pdmodel.PDDocument mergedPDF = new org.apache.pdfbox.pdmodel.PDDocument();
          org.apache.pdfbox.pdmodel.PDDocument mainDoc = org.apache.pdfbox.pdmodel.PDDocument.load(mainContent);
          org.apache.pdfbox.multipdf.PDFMergerUtility merger = new org.apache.pdfbox.multipdf.PDFMergerUtility();

          // Append the main PDF first.
          merger.appendDocument(mergedPDF, mainDoc);

          org.apache.pdfbox.pdmodel.PDDocument doc1 = null;

          if (is1 != null) {
                 doc1 = org.apache.pdfbox.pdmodel.PDDocument.load(is1);
                 merger.appendDocument(mergedPDF, doc1);
                 // 1st file appended to main PDF
          }

          ByteArrayOutputStream baos = new ByteArrayOutputStream();
          mergedPDF.save(baos);

          // Close the documents only after saving; the merged document still uses the source documents until then.
          mergedPDF.close();
          mainDoc.close();
          if (doc1 != null) {
                 doc1.close();
          }

          // Now either save the bytes here or convert them into an InputStream if you want.
          ByteArrayInputStream mergedInputStream = new ByteArrayInputStream(baos.toByteArray());
Samit