Generate one pdf document with multiple pages converting from html using IText 7

Question

I'm working with iText 7. I've been able to take one HTML page and generate a PDF for it, but I need to generate a single PDF document from multiple HTML pages, with each HTML page on its own PDF page. For example: I have Page1.html, Page2.html and Page3.html. I need a PDF document with 3 pages: the first page with the content of Page1.html, the second page with the content of Page2.html, and so on.

This is the code I have and it's working for one html page:

ConverterProperties properties = new ConverterProperties();              
PdfWriter writer = new PdfWriter(pdfRoot, new WriterProperties().SetFullCompressionMode(true));
PdfDocument pdfDocument = new PdfDocument(writer);
pdfDocument.AddEventHandler(PdfDocumentEvent.END_PAGE, new HeaderPdfEventHandler());
HtmlConverter.ConvertToPdf(htmlContent, pdfDocument, properties);

Is it possible to loop against the multiple html pages, add a new page to the PdfDocument for every html page and then have only one pdf generated with one page per html page?

UPDATE

I've been following this example and trying to translate it from Java to C#. I'm trying to use PdfMerger and loop over the HTML pages, but I'm getting the exception "Cannot access a closed stream" on this line:

temp = new PdfDocument(
                    new PdfReader(new RandomAccessSourceFactory().CreateSource(baos), rp));

It looks like it is related to the ByteArrayOutputStream baos instance. Any suggestions? This is my current code:

foreach (var html in htmlList)
{
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfDocument temp = new PdfDocument(new PdfWriter(baos));
    HtmlConverter.ConvertToPdf(html, temp, properties);              
    ReaderProperties rp = new ReaderProperties();
    temp = new PdfDocument(
        new PdfReader(new RandomAccessSourceFactory().CreateSource(baos), rp));
    merger.Merge(temp, 1, temp.GetNumberOfPages());
    temp.Close();
}
pdfDocument.Close();

*"Is it possible to loop against the multiple html pages"* - have you tried to create that loop? In which way did it fail? (Because indeed, it should be possible.) — mkl, Aug 08 '19 at 16:24
The best solution is to generate those documents in memory and use `PdfMerger` to merge them into a single fat file — Alexey Subach, Aug 08 '19 at 17:23
@mkl I've updated my question, I'm trying to do it that way but I'm receiving an Exception — AlexGH, Aug 09 '19 at 13:58
@AlexeySubach Thanks for your suggestion. I'm trying to use `PdfMerger` but still haven't been able to make it work. I've updated my question, any suggestions? — AlexGH, Aug 09 '19 at 13:59

score 4 · Accepted Answer · answered Aug 09 '19 at 18:05

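You are passing RandomAccessSourceFactory a closed stream: the same stream you just wrote the PDF document into. HtmlConverter.ConvertToPdf closes the document when it finishes, and by default that also closes the writer's stream. RandomAccessSourceFactory expects an input stream that is ready to be read.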

First of all, you should use MemoryStream, which is native to .NET. ByteArrayOutputStream is a class that was ported from Java for internal purposes (although it extends MemoryStream as well). Secondly, you don't have to use RandomAccessSourceFactory: there is a simpler way.

You can create a new MemoryStream instance from the bytes of the MemoryStream that you used to create a temporary PDF with the following line:

baos = new MemoryStream(baos.ToArray());
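The behaviour this line relies on can be checked without iText. The following is a minimal sketch using only System.IO; the class and method names are hypothetical and exist only for illustration. It shows that reading from a closed MemoryStream throws, while ToArray() still returns the written bytes, so they can be wrapped in a fresh, readable stream.

```csharp
using System;
using System.IO;

// Hypothetical demo class, not part of iText or of the original code.
public static class ClosedStreamDemo
{
    // Returns true if reading from a closed MemoryStream throws.
    public static bool ReadThrowsAfterClose()
    {
        var written = new MemoryStream();
        written.Write(new byte[] { 1, 2, 3 }, 0, 3);
        written.Close();
        try
        {
            written.ReadByte();
            return false;
        }
        catch (ObjectDisposedException)
        {
            // "Cannot access a closed Stream."
            return true;
        }
    }

    // Copies the bytes of a closed stream into a new stream and reads them back.
    public static byte[] ReopenAndRead()
    {
        var written = new MemoryStream();
        written.Write(new byte[] { 1, 2, 3 }, 0, 3);
        written.Close();

        // ToArray() is documented to work even after the stream is closed.
        using (var readable = new MemoryStream(written.ToArray()))
        {
            var buffer = new byte[3];
            readable.Read(buffer, 0, buffer.Length);
            return buffer;
        }
    }
}
```

The copy costs one extra byte array per temporary PDF, which is usually acceptable for page-sized documents.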

As an additional remark, it's better to close PdfMerger instance directly instead of closing the document - closing PdfMerger closes the underlying document as well.

All in all, we get the following code that works:

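// pdfMerger is assumed to have been created beforehand, e.g. new PdfMerger(pdfDocument).
foreach (var html in htmlList)
{
    // Convert this HTML page to a temporary in-memory PDF.
    MemoryStream baos = new MemoryStream();
    PdfDocument temp = new PdfDocument(new PdfWriter(baos));
    HtmlConverter.ConvertToPdf(html, temp, properties);
    ReaderProperties rp = new ReaderProperties();
    // baos is closed at this point; copy its bytes into a fresh, readable stream.
    baos = new MemoryStream(baos.ToArray());
    temp = new PdfDocument(new PdfReader(baos, rp));
    // Append all pages of the temporary PDF to the output document.
    pdfMerger.Merge(temp, 1, temp.GetNumberOfPages());
    temp.Close();
}
// Closing the merger also closes the underlying output document.
pdfMerger.Close();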
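For completeness, here is one way the loop could be wrapped into a self-contained method, including the setup that the snippet leaves out. This is only a sketch: the class name HtmlPagesToPdf and the method MergeHtmlPages are hypothetical, the output goes to memory rather than to the pdfRoot path from the question, and the header event handler from the question is omitted.

```csharp
using System.Collections.Generic;
using System.IO;
using iText.Html2pdf;
using iText.Kernel.Pdf;
using iText.Kernel.Utils;

// Hypothetical wrapper around the loop above; names are illustrative only.
public static class HtmlPagesToPdf
{
    // Converts each HTML string to its own temporary PDF and appends all of
    // its pages to one output document, returned as a byte array.
    // Assumes htmlList is not empty: iText refuses to close a document
    // that has no pages.
    public static byte[] MergeHtmlPages(IEnumerable<string> htmlList)
    {
        var output = new MemoryStream();
        var pdfDocument = new PdfDocument(new PdfWriter(output));
        var pdfMerger = new PdfMerger(pdfDocument);
        var properties = new ConverterProperties();

        foreach (var html in htmlList)
        {
            // Write the temporary PDF into memory.
            var buffer = new MemoryStream();
            var temp = new PdfDocument(new PdfWriter(buffer));
            HtmlConverter.ConvertToPdf(html, temp, properties);

            // buffer is closed now, so re-read its bytes from a new stream.
            using (var source = new PdfDocument(
                new PdfReader(new MemoryStream(buffer.ToArray()))))
            {
                pdfMerger.Merge(source, 1, source.GetNumberOfPages());
            }
        }

        // Closing the merger closes pdfDocument and flushes it to output.
        pdfMerger.Close();
        return output.ToArray();
    }
}
```

A caller could then write the result to disk with something like File.WriteAllBytes(path, HtmlPagesToPdf.MergeHtmlPages(htmlList)). If an HTML page is long enough to span several PDF pages, all of them are appended, so the output is not guaranteed to have exactly one page per HTML input.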

I wish I could upvote this answer 5 more times, thanks!!! Been struggling for a while with this... — AlexGH, Aug 09 '19 at 18:12

Fokiruna · Answer 2 · 2020-08-07T15:35:58.463

This may be less succinct, but here is a similar answer that wraps everything in "using" blocks.

private byte[] CreatePDF(string html)
{
    byte[] binData;

    using (var workStream = new MemoryStream())
    {
        using (var pdfWriter = new PdfWriter(workStream))
        {
            // Create one pdf document
            using (var pdfDoc = new PdfDocument(pdfWriter))
            {
                pdfDoc.SetDefaultPageSize(iText.Kernel.Geom.PageSize.A4.Rotate());
                // Create one pdf merger
                var pdfMerger = new PdfMerger(pdfDoc);
                // Create two identical pdfs
                for (int i = 0; i < 2; i++)
                {
                    using (var newStream = new MemoryStream(CreateDocument(html)))
                    {
                        ReaderProperties rp = new ReaderProperties();
                        using (var newPdf = new PdfDocument(new PdfReader(newStream, rp)))
                        {
                            pdfMerger.Merge(newPdf, 1, newPdf.GetNumberOfPages());
                        }
                    }
                }
            }
            binData = workStream.ToArray();
        }
    }
    return binData;
}

The helper that creates each individual PDF:

private byte[] CreateDocument(string html)
{
    byte[] binData;

    using (var workStream = new MemoryStream())
    {
        using (var pdfWriter = new PdfWriter(workStream))
        {
            using (var pdfDoc = new PdfDocument(pdfWriter))
            {
                pdfDoc.SetDefaultPageSize(iText.Kernel.Geom.PageSize.A4.Rotate());

                ConverterProperties props = new ConverterProperties();
                using (var document = HtmlConverter.ConvertToDocument(html, pdfDoc, props))
                {
                }
            }
            binData = workStream.ToArray();
        }
    }
    return binData;
}
