How to merge multiple pdf files (generated in run time)?

Question

How to merge multiple pdf files (generated on run time) through ItextSharp then printing them.

I found the following link but that method requires the pdf names considering that the pdf files stored and this is not my case .

I have multiple reports i'll convert them to pdf files through this method :

private void AddReportToResponse(LocalReport followsReport)
{
    string mimeType;
    string encoding;
    string extension;
    string[] streams = new string[100];
    Warning[] warnings = new Warning[100];
    byte[] pdfStream = followsReport.Render("PDF", "", out mimeType, out encoding, out extension, out streams, out warnings);
  //Response.Clear();
  //Response.ContentType = mimeType;
  //Response.AddHeader("content-disposition", "attachment; filename=Application." + extension);
  //Response.BinaryWrite(pdfStream);
  //Response.End();
}

Now i want to merge all those generated files (Bytes) in one pdf file to print it

I think question is similar to this: http://stackoverflow.com/questions/3961804/itextsharp-creation-of-a-pdf-from-a-list-of-byte-arrays — Prakash Chennupati, Apr 10 '13 at 13:42
The samples you found and the other commenters pointed you to, use `PdfReader` to read the source documents. `PdfReader` has multiple constructors, some take a file name string as argument, some byte arrays containing the PDF; in your case use the latter ones. From the merging samples found, though, please don't choose one using `PdfWriter` but instead one using `PdfCopy, PdfSmartCopy, PdfCopyFields,` or `PdfCopyForms` (the choice depending on your exact requirements). — mkl, Apr 10 '13 at 14:14
Why U don't recommend `PdfWriter` ?? When i use `PdfSmartCopy` instead for example i get the following exception `document has no pages` !! — Anyname Donotcare, Apr 10 '13 at 15:14
@just_name When you use `PdfWriter` to merge source PDFs, interactive features (forms and other annotations) are lost. Furthermore the resulting PDF internally contains an unnecessary wrapper around the page information which when iterated multiple times may cause PDF viewers to fail when trying to display the PDF. — mkl, Apr 10 '13 at 15:18
@mkl:Thanks a lot ,but when i use `PdfSmartCopy` instead for example i get the following exception `document has no pages` !! — Anyname Donotcare, Apr 10 '13 at 15:20
in the previous http://stackoverflow.com/a/3980398/418343 any recommendation — Anyname Donotcare, Apr 10 '13 at 15:21
Can this be applied to multiple .RDL (SSRS) report file? Has any one tried? — AskMe, May 03 '19 at 03:30

score 53 · Accepted Answer · answered Apr 11 '13 at 09:39

53

If you want to merge source documents using iText(Sharp), there are two basic situations:

You really want to merge the documents, acquiring the pages in their original format, transfering as much of their content and their interactive annotations as possible. In this case you should use a solution based on a member of the Pdf*Copy* family of classes.
You actually want to integrate pages from the source documents into a new document but want the new document to govern the general format and don't care for the interactive features (annotations...) in the original documents (or even want to get rid of them). In this case you should use a solution based on the PdfWriter class.

You can find details in chapter 6 (especially section 6.4) of iText in Action — 2nd Edition. The Java sample code can be accessed here and the C#'ified versions here.

A simple sample using PdfCopy is Concatenate.java / Concatenate.cs. The central piece of code is:

byte[] mergedPdf = null;
using (MemoryStream ms = new MemoryStream())
{
    using (Document document = new Document())
    {
        using (PdfCopy copy = new PdfCopy(document, ms))
        {
            document.Open();

            for (int i = 0; i < pdf.Count; ++i)
            {
                PdfReader reader = new PdfReader(pdf[i]);
                // loop over the pages in that document
                int n = reader.NumberOfPages;
                for (int page = 0; page < n; )
                {
                    copy.AddPage(copy.GetImportedPage(reader, ++page));
                }
            }
        }
    }
    mergedPdf = ms.ToArray();
}

Here pdf can either be defined as a List<byte[]> immediately containing the source documents (appropriate for your use case of merging intermediate in-memory documents) or as a List<String> containing the names of source document files (appropriate if you merge documents from disk).

An overview at the end of the referenced chapter summarizes the usage of the classes mentioned:

PdfCopy: Copies pages from one or more existing PDF documents. Major downsides: PdfCopy doesn’t detect redundant content, and it fails when concatenating forms.
PdfCopyFields: Puts the fields of the different forms into one form. Can be used to avoid the problems encountered with form fields when concatenating forms using PdfCopy. Memory use can be an issue.
PdfSmartCopy: Copies pages from one or more existing PDF documents. PdfSmartCopy is able to detect redundant content, but it needs more memory and CPU than PdfCopy.
PdfWriter: Generates PDF documents from scratch. Can import pages from other PDF documents. The major downside is that all interactive features of the imported page (annotations, bookmarks, fields, and so forth) are lost in the process.

answered Apr 11 '13 at 09:39

mkl

90,588
15
125
265

It really is interesting to see that someone downvoted this answer without leaving a comment explaining deficiencies... That been said, iText has developed meanwhile and much of the `PdfCopyFields` specific stuff has found its way into `PdfCopy`. – mkl Dec 03 '13 at 07:58
1

@BonusKun *It mergers really slow* - if it is exceptionally slow, you might want to create a question in its own right providing sample documents to reproduce the problem... *(I upvoted btw.)* - thanx! – mkl Oct 09 '14 at 07:42
If you want to save the file to disk, after line "mergedPdf = ms.ToArray();", use: System.IO.File.WriteAllBytes(@"C:\MyFileName.pdf", mergedPdf); – Lill Lansey Jun 22 '15 at 15:24
What if you want as download stream? return File(ms, "application/pdf", "file.pdf") is not working for me – 10K35H 5H4KY4 Jul 21 '15 at 06:43
@NearlyCrazy That essentially constututes a new question, *How to return the contents of a MemoryStream as a download stream*; please make that a question in its own right. I'm afraid I cannot help here as I'm not into .Net web service stuff. – mkl Jul 21 '15 at 08:32
i am getting an error "An item with the same key has already been added." at below line " copy.AddPage(copy.GetImportedPage(reader, System.Threading.Interlocked.Increment(page)))". Can anyone help please] – Vikky Oct 09 '15 at 16:00
@Vikky that line is not in the code above. Thus, your question is about a different situation than explicitly explained here. To get help, therefore, you should make that an actual so question and supply the required information. – mkl Oct 09 '15 at 16:14
i converted this code to vb, i googled many times and not getting any ref. Please help if you can – Vikky Oct 09 '15 at 16:15
You use `System.Threading.Interlocked.Increment(page)`. This seems to indicate that you access and change that `page` variable from multiple threads, and therefore the copying classes, too. The classes above are not inherently thread-safe. So you have to take special care. Thus, please describe your issue in detail in a question in its own right. – mkl Oct 09 '15 at 16:28
hi @mkl, i have posted in detail here http://stackoverflow.com/questions/33043151/vb-net-merge-multiple-pdfs-into-one-and-export – Vikky Oct 09 '15 at 16:33
You are missing `document.Close();` Without this, you may see errors when opening the merged pdf file, like "_There was an error opening this document. The file is damaged and could not be repaired._". For my purposes, I needed to return a Stream object, so I declared this variable above the Usings: `Stream stream = null;` then near the end of my Usings (immediately after calling `document.Close();`), I added this `stream = new MemoryStream(ms.ToArray());` – MikeTeeVee Jul 25 '16 at 15:05
3

@MikeTeeVee *You are missing `document.Close()`* - No, `using (Document document = new Document())` implicitly closes the document. If you need to grab the memory stream contents before the closing bracket of the `using`, though, you indeed need to explicitly close. – mkl Jul 25 '16 at 16:03

score 9 · Answer 2 · answered Mar 02 '14 at 02:45

I used iTextsharp with c# to combine pdf files. This is the code I used.

string[] lstFiles=new string[3];
    lstFiles[0]=@"C:/pdf/1.pdf";
    lstFiles[1]=@"C:/pdf/2.pdf";
    lstFiles[2]=@"C:/pdf/3.pdf";

    PdfReader reader = null;
    Document sourceDocument = null;
    PdfCopy pdfCopyProvider = null;
    PdfImportedPage importedPage;
    string outputPdfPath=@"C:/pdf/new.pdf";


    sourceDocument = new Document();
    pdfCopyProvider = new PdfCopy(sourceDocument, new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create));

    //Open the output file
    sourceDocument.Open();

    try
    {
        //Loop through the files list
        for (int f = 0; f < lstFiles.Length-1; f++)
        {
            int pages =get_pageCcount(lstFiles[f]);

            reader = new PdfReader(lstFiles[f]);
            //Add pages of current file
            for (int i = 1; i <= pages; i++)
            {
                importedPage = pdfCopyProvider.GetImportedPage(reader, i);
                pdfCopyProvider.AddPage(importedPage);
            }

            reader.Close();
         }
        //At the end save the output file
        sourceDocument.Close();
    }
    catch (Exception ex)
    {
        throw ex;
    }


private int get_pageCcount(string file)
{
    using (StreamReader sr = new StreamReader(File.OpenRead(file)))
    {
        Regex regex = new Regex(@"/Type\s*/Page[^s]");
        MatchCollection matches = regex.Matches(sr.ReadToEnd());

        return matches.Count;
    }
}

score 5 · Answer 3 · answered Apr 10 '13 at 19:08

Here is some code I pulled out of an old project I had. It was a web application but I was using iTextSharp to merge pdf files then print them.

public static class PdfMerger
    {
        /// <summary>
        /// Merge pdf files.
        /// </summary>
        /// <param name="sourceFiles">PDF files being merged.</param>
        /// <returns></returns>
        public static byte[] MergeFiles(List<Stream> sourceFiles)
        {
            Document document = new Document();
            MemoryStream output = new MemoryStream();

            try
            {
                // Initialize pdf writer
                PdfWriter writer = PdfWriter.GetInstance(document, output);
                writer.PageEvent = new PdfPageEvents();

                // Open document to write
                document.Open();
                PdfContentByte content = writer.DirectContent;

                // Iterate through all pdf documents
                for (int fileCounter = 0; fileCounter < sourceFiles.Count; fileCounter++)
                {
                    // Create pdf reader
                    PdfReader reader = new PdfReader(sourceFiles[fileCounter]);
                    int numberOfPages = reader.NumberOfPages;

                    // Iterate through all pages
                    for (int currentPageIndex = 1; currentPageIndex <=
                                        numberOfPages; currentPageIndex++)
                    {
                        // Determine page size for the current page
                        document.SetPageSize(
                            reader.GetPageSizeWithRotation(currentPageIndex));

                        // Create page
                        document.NewPage();
                        PdfImportedPage importedPage =
                            writer.GetImportedPage(reader, currentPageIndex);


                        // Determine page orientation
                        int pageOrientation = reader.GetPageRotation(currentPageIndex);
                        if ((pageOrientation == 90) || (pageOrientation == 270))
                        {
                            content.AddTemplate(importedPage, 0, -1f, 1f, 0, 0,
                                reader.GetPageSizeWithRotation(currentPageIndex).Height);
                        }
                        else
                        {
                            content.AddTemplate(importedPage, 1f, 0, 0, 1f, 0, 0);
                        }
                    }
                }
            }
            catch (Exception exception)
            {
                throw new Exception("There has an unexpected exception" +
                        " occured during the pdf merging process.", exception);
            }
            finally
            {
                document.Close();
            }
            return output.GetBuffer();
        }
    }



    /// <summary>
    /// Implements custom page events.
    /// </summary>
    internal class PdfPageEvents : IPdfPageEvent
    {
        #region members
        private BaseFont _baseFont = null;
        private PdfContentByte _content;
        #endregion

        #region IPdfPageEvent Members
        public void OnOpenDocument(PdfWriter writer, Document document)
        {
            _baseFont = BaseFont.CreateFont(BaseFont.HELVETICA,
                                BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
            _content = writer.DirectContent;
        }

        public void OnStartPage(PdfWriter writer, Document document)
        { }

        public void OnEndPage(PdfWriter writer, Document document)
        { }

        public void OnCloseDocument(PdfWriter writer, Document document)
        { }

        public void OnParagraph(PdfWriter writer,
                    Document document, float paragraphPosition)
        { }

        public void OnParagraphEnd(PdfWriter writer,
                    Document document, float paragraphPosition)
        { }

        public void OnChapter(PdfWriter writer, Document document,
                                float paragraphPosition, Paragraph title)
        { }

        public void OnChapterEnd(PdfWriter writer,
                    Document document, float paragraphPosition)
        { }

        public void OnSection(PdfWriter writer, Document document,
                    float paragraphPosition, int depth, Paragraph title)
        { }

        public void OnSectionEnd(PdfWriter writer,
                    Document document, float paragraphPosition)
        { }

        public void OnGenericTag(PdfWriter writer, Document document,
                                    Rectangle rect, string text)
        { }
        #endregion

        private float GetCenterTextPosition(string text, PdfWriter writer)
        {
            return writer.PageSize.Width / 2 - _baseFont.GetWidthPoint(text, 8) / 2;
        }
    }

I didn't write this, but made some modifications. I can't remember where I found it. After I merged the PDFs I would call this method to insert javascript to open the print dialog when the PDF is opened. If you change bSilent to true then it should print silently to their default printer.

public Stream addPrintJStoPDF(Stream thePDF)
{
    MemoryStream outPutStream = null;
    PRStream finalStream = null;
    PdfDictionary page = null;
    string content = null;

    //Open the stream with iTextSharp
    var reader = new PdfReader(thePDF);

    outPutStream = new MemoryStream(finalStream.GetBytes());
    var stamper = new PdfStamper(reader, (MemoryStream)outPutStream);
    var jsText = "var res = app.setTimeOut('this.print({bUI: true, bSilent: false, bShrinkToFit: false});', 200);";
    //Add the javascript to the PDF
    stamper.JavaScript = jsText;

    stamper.FormFlattening = true;
    stamper.Writer.CloseStream = false;
    stamper.Close();

    //Set the stream to the beginning
    outPutStream.Position = 0;

    return outPutStream;
}

Not sure how well the above code is written since I pulled it from somewhere else and I haven't worked in depth at all with iTextSharp but I do know that it did work at merging PDFs that I was generating at runtime.

Please refrain from using this kind of merge routine unless you have very specific requirements forcing you to do that. When you use `PdfWriter` to merge source PDFs, interactive features (forms and other annotations) are lost. Furthermore the resulting PDF internally contains an unnecessary wrapper around the page information which when iterated multiple times may cause PDF viewers to fail when trying to display the PDF. — mkl, Apr 11 '13 at 08:41
As I said I had pulled this from older code that was in production but PDFs were generated from html built by a wysiwyg editor so we had no interactive features. Also our iterations were usually only around 10 at a time and we never had issues with the pdf not opening. I posted this as an example as we had it running in production and I know that it was working to merge pdfs with no reported issues. — DSlagle, Apr 11 '13 at 15:04
I intended no offense; such merging solutions like yours based on `PdfWriter` indeed can be found more often than the ones using the better suited classes when googling around, and they *do* work after a fashion, so they are not completely wrong. `Pdf*Copy*` based solutions, though, generally are easier to use (no need to adapt the target document page size and rotation again and again), more complete (concerning interactive features), and produce cleaner output (with respect to the internal PDF structure). — mkl, Apr 12 '13 at 10:14

score 5 · Answer 4 · answered Oct 06 '16 at 13:25

Tested with iTextSharp-LGPL 4.1.6:

    public static byte[] ConcatenatePdfs(IEnumerable<byte[]> documents)
    {
        using (var ms = new MemoryStream())
        {
            var outputDocument = new Document();
            var writer = new PdfCopy(outputDocument, ms);
            outputDocument.Open();

            foreach (var doc in documents)
            {
                var reader = new PdfReader(doc);
                for (var i = 1; i <= reader.NumberOfPages; i++)
                {
                    writer.AddPage(writer.GetImportedPage(reader, i));
                }
                writer.FreeReader(reader);
                reader.Close();
            }

            writer.Close();
            outputDocument.Close();
            var allPagesContent = ms.GetBuffer();
            ms.Flush();

            return allPagesContent;
        }
    }

score 3 · Answer 5 · edited May 23 '17 at 11:55

To avoid the memory issues mentioned, I used file stream instead of memory stream(mentioned in ITextSharp Out of memory exception merging multiple pdf) to merge pdf files:

        var parentDirectory = Directory.GetParent(SelectedDocuments[0].FilePath);
        var savePath = parentDirectory + "\\MergedDocument.pdf";

        using (var fs = new FileStream(savePath, FileMode.Create))
        {
            using (var document = new Document())
            {
                using (var pdfCopy = new PdfCopy(document, fs))
                {
                    document.Open();
                    for (var i = 0; i < SelectedDocuments.Count; i++)
                    {
                        using (var pdfReader = new PdfReader(SelectedDocuments[i].FilePath))
                        {
                            for (var page = 0; page < pdfReader.NumberOfPages;)
                            {
                                pdfCopy.AddPage(pdfCopy.GetImportedPage(pdfReader, ++page));
                            }
                        }
                    }
                }
            }
        }

score 0 · Answer 6 · answered Sep 27 '18 at 11:12

****/*For Multiple PDF Print..!!*/****

<button type="button" id="btnPrintMultiplePdf" runat="server" class="btn btn-primary btn-border btn-sm"
                                                                onserverclick="btnPrintMultiplePdf_click">
                                                                <i class="fa fa-file-pdf-o"></i>Print Multiple pdf</button>

protected void btnPrintMultiplePdf_click(object sender, EventArgs e)
    {
        if (ValidateForMultiplePDF() == true)
        {
            #region Declare Temp Variables..!!
            CheckBox chkList = new CheckBox();
            HiddenField HidNo = new HiddenField();

            string Multi_fofile, Multi_listfile;
            Multi_fofile = Multi_listfile = "";
            Multi_fofile = Server.MapPath("PDFRNew");
            #endregion

            for (int i = 0; i < grdRnew.Rows.Count; i++)
            {
                #region Find Grd Controls..!!
                CheckBox Chk_One = (CheckBox)grdRnew.Rows[i].FindControl("chkOne");
                Label lbl_Year = (Label)grdRnew.Rows[i].FindControl("lblYear");
                Label lbl_No = (Label)grdRnew.Rows[i].FindControl("lblCode");
                #endregion

                if (Chk_One.Checked == true)
                {
                    HidNo .Value = llbl_No .Text.Trim()+ lbl_Year .Text;

                    if (File.Exists(Multi_fofile + "\\" + HidNo.Value.ToString() + ".pdf"))
                    {
                        #region Get Multiple Files Name And Paths..!!
                        if (Multi_listfile != "")
                        {
                            Multi_listfile = Multi_listfile + ",";
                        }
                        Multi_listfile = Multi_listfile + Multi_fofile + "\\" + HidNo.Value.ToString() + ".pdf";

                        #endregion
                    }
                }
            }

            #region For Generate Multiple Pdf..!!
            if (Multi_listfile != "")
            {
                String[] Multifiles = Multi_listfile.Split(',');
                string DestinationFile = Server.MapPath("PDFRNew") + "\\Multiple.Pdf";
                MergeFiles(DestinationFile, Multifiles);

                Response.ContentType = "pdf";
                Response.AddHeader("Content-Disposition", "attachment;filename=\"" + DestinationFile + "\"");
                Response.TransmitFile(DestinationFile);
                Response.End();
            }
            else
            {

            }
            #endregion
        }
    }

    private void MergeFiles(string DestinationFile, string[] SourceFiles)
    {
        try
        {
            int f = 0;
            /**we create a reader for a certain Document**/
            PdfReader reader = new PdfReader(SourceFiles[f]);
            /**we retrieve the total number of pages**/
            int n = reader.NumberOfPages;
            /**Console.WriteLine("There are " + n + " pages in the original file.")**/
            /**Step 1: creation of a document-object**/
            Document document = new Document(reader.GetPageSizeWithRotation(1));
            /**Step 2: we create a writer that listens to the Document**/
            PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(DestinationFile, FileMode.Create));
            /**Step 3: we open the Document**/
            document.Open();
            PdfContentByte cb = writer.DirectContent;
            PdfImportedPage page;
            int rotation;
            /**Step 4: We Add Content**/
            while (f < SourceFiles.Length)
            {
                int i = 0;
                while (i < n)
                {
                    i++;
                    document.SetPageSize(reader.GetPageSizeWithRotation(i));
                    document.NewPage();
                    page = writer.GetImportedPage(reader, i);
                    rotation = reader.GetPageRotation(i);
                    if (rotation == 90 || rotation == 270)
                    {
                        cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height);
                    }
                    else
                    {
                        cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
                    }
                    /**Console.WriteLine("Processed page " + i)**/
                }
                f++;
                if (f < SourceFiles.Length)
                {
                    reader = new PdfReader(SourceFiles[f]);
                    /**we retrieve the total number of pages**/
                    n = reader.NumberOfPages;
                    /**Console.WriteLine("There are"+n+"pages in the original file.")**/
                }
            }
            /**Step 5: we Close the Document**/
            document.Close();
        }
        catch (Exception e)
        {
            string strOb = e.Message;
        }
    }

    private bool ValidateForMultiplePDF()
    {
        bool chkList = false;

        foreach (GridViewRow gvr in grdRnew.Rows)
        {
            CheckBox Chk_One = (CheckBox)gvr.FindControl("ChkSelectOne");
            if (Chk_One.Checked == true)
            {
                chkList = true;
            }
        }
        if (chkList == false)
        {
            divStatusMsg.Style.Add("display", "");
            divStatusMsg.Attributes.Add("class", "alert alert-danger alert-dismissable");
            divStatusMsg.InnerText = "ERROR !!...Please Check At Least On CheckBox.";
            grdRnew.Focus();
            set_timeout();
            return false;
        }
        return true;
    }

It would be more helpful if you could explain how this answers the question. — Dragonthoughts, Sep 27 '18 at 11:33
Please refrain from using this kind of merge routine unless you have very specific requirements forcing you to do that. When you use `PdfWriter` to merge source PDFs, interactive features (forms and other annotations) are lost. Furthermore the resulting PDF internally contains an unnecessary wrapper around the page information which when iterated multiple times may cause PDF viewers to fail when trying to display the PDF. — mkl, Sep 27 '18 at 13:04

How to merge multiple pdf files (generated in run time)?

6 Answers6

Linked

Related