
I am looking for a third-party .dll that can merge several PDFs into one and then convert the merged PDF into a single .PNG image file.

I know that Ghostscript and PDFsharp support the .NET Framework, but not .NET Core 2.0.

Can anyone suggest a third-party DLL that can merge all the PDFs and also convert the merged PDF into a PNG image on .NET Core 2.0?

Any help or suggestions would be appreciated.

Uwe Keim
Roshan
  • Possible duplicate of [itext7 pdf to image](https://stackoverflow.com/questions/37809019/itext7-pdf-to-image) – Simply Ged Mar 26 '18 at 14:24
  • Your question is off-topic. However, just because a library is not specifically built for .NET Core, does not mean you cannot potentially still use it. I'd recommend trying to reference these libraries, first, and see how far you can get. Everything may work just fine. – Chris Pratt Mar 26 '18 at 14:24
  • https://stackoverflow.com/q/37809019/2309376 shows how to convert to an image. You can also use the same library, iText7, to merge PDFs – Simply Ged Mar 26 '18 at 14:24
  • itext 7 does not solve my problem since itext 7 does not support converting merged pdf into PNG image and I guess itext 7 does not support .NET core 2.0 – Roshan Mar 26 '18 at 14:28
  • Chris Pratt, thank you for the response :) ... I tried installing Ghostscript and referencing those libraries, but I get an error at runtime :( – Roshan Mar 26 '18 at 14:41

5 Answers


I'm only answering the part about rendering a PDF and converting it to an image in .NET Core 3.1, which took a couple of days to figure out. I ended up using phuldr's Docnet.Core to get the image bytes and Magick.NET-Q16-AnyCpu to save them to an image file.

There was a little extra work to re-arrange the image bytes to RGBA order and to turn the transparent pixels into a specific color (white in my case). Here's my code in case it helps:

public MemoryStream PdfToImage(byte[] pdfBytes /* the PDF file bytes */)
{
    MemoryStream memoryStream = new MemoryStream();
    MagickImage imgBackdrop;
    MagickColor backdropColor = MagickColors.White; // replace transparent pixels with this color 
    int pdfPageNum = 0; // first page is 0

    using (IDocLib pdfLibrary = DocLib.Instance)
    {
        using (var docReader = pdfLibrary.GetDocReader(pdfBytes, new PageDimensions(1.0d)))
        {
            using (var pageReader = docReader.GetPageReader(pdfPageNum))
            {
                var rawBytes = pageReader.GetImage(); // Returns image bytes as B-G-R-A ordered list.
                rawBytes = RearrangeBytesToRGBA(rawBytes);
                var width = pageReader.GetPageWidth();
                var height = pageReader.GetPageHeight();

                // specify that we are reading a byte array of colors in R-G-B-A order.
                PixelReadSettings pixelReadSettings = new PixelReadSettings(width, height, StorageType.Char, PixelMapping.RGBA);
                using (MagickImage imgPdfOverlay = new MagickImage(rawBytes, pixelReadSettings))
                {
                    // turn transparent pixels into backdrop color using composite: http://www.imagemagick.org/Usage/compose/#compose
                    imgBackdrop = new MagickImage(backdropColor, width, height);                            
                    imgBackdrop.Composite(imgPdfOverlay, CompositeOperator.Over);
                }
            }
        }
    }

    
    imgBackdrop.Write(memoryStream, MagickFormat.Png);
    imgBackdrop.Dispose();
    memoryStream.Position = 0;
    return memoryStream;
}

private byte[] RearrangeBytesToRGBA(byte[] BGRABytes)
{
    var max = BGRABytes.Length;
    var RGBABytes = new byte[max];
    var idx = 0;
    byte r;
    byte g;
    byte b;
    byte a;
    while (idx < max)
    {
        // get colors in original order: B G R A
        b = BGRABytes[idx];
        g = BGRABytes[idx + 1];
        r = BGRABytes[idx + 2];
        a = BGRABytes[idx + 3];

        // re-arrange to be in new order: R G B A
        RGBABytes[idx] = r;
        RGBABytes[idx + 1] = g;
        RGBABytes[idx + 2] = b;
        RGBABytes[idx + 3] = a;

        idx += 4;
    }
    return RGBABytes;
}
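
For readers who want to see what the composite step amounts to per pixel, here is a minimal sketch that is not part of the original answer. It assumes the bytes are in B-G-R-A order with straight (non-premultiplied) alpha, and the `BgraFlattener` class and its method names are hypothetical. It blends each pixel over a solid backdrop color and reorders it to R-G-B-A in a single pass:

```csharp
using System;

// Hypothetical helper, not part of Docnet.Core or Magick.NET.
public static class BgraFlattener
{
    // Blends B-G-R-A pixels over a solid backdrop and returns opaque R-G-B-A pixels.
    // Assumption: alpha is straight (non-premultiplied).
    public static byte[] FlattenToRgba(byte[] bgra, byte backR, byte backG, byte backB)
    {
        if (bgra == null) throw new ArgumentNullException(nameof(bgra));
        if (bgra.Length % 4 != 0)
            throw new ArgumentException("Expected 4 bytes per pixel.", nameof(bgra));

        var rgba = new byte[bgra.Length];
        for (int i = 0; i < bgra.Length; i += 4)
        {
            int alpha = bgra[i + 3];
            rgba[i]     = Blend(bgra[i + 2], backR, alpha); // red
            rgba[i + 1] = Blend(bgra[i + 1], backG, alpha); // green
            rgba[i + 2] = Blend(bgra[i],     backB, alpha); // blue
            rgba[i + 3] = 255;                              // result is fully opaque
        }
        return rgba;
    }

    // Standard "source over backdrop" blend for one 8-bit channel, with rounding.
    private static byte Blend(int source, int backdrop, int alpha)
    {
        return (byte)((source * alpha + backdrop * (255 - alpha) + 127) / 255);
    }
}
```

With a white backdrop, a fully transparent pixel becomes white and a fully opaque pixel keeps its color, which is the effect the `Composite` call in the answer is used for.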
HappyGoLucky
  • Hi! Did you get any errors on the pdfLibrary.GetDocReader(..)? I'm running on an exception: "unable to open the document" – UIChris Sep 24 '20 at 14:34
  • I don't remember running into problems on pdfLibrary.GetDocReader(). I read in the pdfBytes via a stream, then close the stream before calling GetDocReader. I'd guess it has to do with how you're creating pdfBytes... – HappyGoLucky Sep 25 '20 at 18:40
  • Hi @HappyGoLucky ! Do you remember how you concatenated the images? I see you have a variable named "pdfPageNum". Your option was the best one i tried, but i don't know how to concatenate the images :(. Right now the above code gets only the first page of the document. – SimpForJS Mar 16 '22 at 01:13
  • Hi @SimpForJS. I only tried this with first page, but extra pages should work. If you want to stack all pages on top of each other, or put them all side by side, resulting in 1 image, try: make `imgBackdrop` high enough or wide enough for all pages, loop through the single page code, and call `imgBackdrop.Composite` with the overload where you pass an x, y coordinate so each page image starts at appropriate location. Good luck! – HappyGoLucky Mar 18 '22 at 02:46
  • Very strange... I've been having issues reading in from BGRA format the entire day, stumbled across this answer and works like a charm. Thanks @HappyGoLucky – jarodsmk Apr 11 '22 at 13:05
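
The stacking approach described in the comments above needs a canvas size and a starting position for each page. The following is a minimal sketch of that layout step only, assuming the pages are stacked vertically and left-aligned. `PageStackLayout` and `StackVertically` are hypothetical names, and no Docnet.Core or Magick.NET calls are involved:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: computes where each page goes when all pages
// are stacked top to bottom on one canvas.
public static class PageStackLayout
{
    public static (int CanvasWidth, int CanvasHeight, int[] OffsetsY) StackVertically(
        IReadOnlyList<(int Width, int Height)> pageSizes)
    {
        if (pageSizes == null) throw new ArgumentNullException(nameof(pageSizes));

        int canvasWidth = 0;
        int canvasHeight = 0;
        var offsetsY = new int[pageSizes.Count];

        for (int i = 0; i < pageSizes.Count; i++)
        {
            offsetsY[i] = canvasHeight;                              // page i starts where the previous one ended
            canvasWidth = Math.Max(canvasWidth, pageSizes[i].Width); // canvas is as wide as the widest page
            canvasHeight += pageSizes[i].Height;
        }

        return (canvasWidth, canvasHeight, offsetsY);
    }
}
```

The backdrop image would then be created with `CanvasWidth` and `CanvasHeight`, and each rendered page composited at x = 0, y = `OffsetsY[i]` using the x, y overload of `Composite` mentioned in the comment.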

I've been struggling with this myself lately and couldn't find a library that suited my needs, so I wrote a C# wrapper around PDFium. PDFium has a BSD 3-clause license and my wrapper code is released under MIT, so you can either use the NuGet package or use the code directly. The repo can be found here: docnet.

phuldr

You can use iTextSharp.LGPLv2.Core to merge PDF files; it works pretty well. Please check this tutorial. It supports .NET Standard as well.

    using System;
    using System.Collections.Generic;
    using System.IO;
    using iTextSharp.text;
    using iTextSharp.text.pdf;

    namespace HelveticSolutions.PdfLibrary
    {
      public static class PdfMerger
      {
        /// <summary>
        /// Merge pdf files.
        /// </summary>
        /// <param name="sourceFiles">PDF files being merged.</param>
        /// <returns></returns>
        public static byte[] MergeFiles(List<byte[]> sourceFiles)
        {
          Document document = new Document();
          using (MemoryStream ms = new MemoryStream())
          {
            PdfCopy copy = new PdfCopy(document, ms);
            document.Open();
            int documentPageCounter = 0;

            // Iterate through all pdf documents
            for (int fileCounter = 0; fileCounter < sourceFiles.Count; fileCounter++)
            {
              // Create pdf reader
              PdfReader reader = new PdfReader(sourceFiles[fileCounter]);
              int numberOfPages = reader.NumberOfPages;

              // Iterate through all pages
              for (int currentPageIndex = 1; currentPageIndex <= numberOfPages; currentPageIndex++)
              {
                documentPageCounter++;
                PdfImportedPage importedPage = copy.GetImportedPage(reader, currentPageIndex);
                PdfCopy.PageStamp pageStamp = copy.CreatePageStamp(importedPage);

                // Write header
                ColumnText.ShowTextAligned(pageStamp.GetOverContent(), Element.ALIGN_CENTER,
                    new Phrase("PDF Merger by Helvetic Solutions"), importedPage.Width / 2, importedPage.Height - 30,
                    importedPage.Width < importedPage.Height ? 0 : 1);

                // Write footer
                ColumnText.ShowTextAligned(pageStamp.GetOverContent(), Element.ALIGN_CENTER,
                    new Phrase(String.Format("Page {0}", documentPageCounter)), importedPage.Width / 2, 30,
                    importedPage.Width < importedPage.Height ? 0 : 1);

                pageStamp.AlterContents();

                copy.AddPage(importedPage);
              }

              copy.FreeReader(reader);
              reader.Close();
            }

            document.Close();
            // ToArray() returns only the bytes written; GetBuffer() may include unused trailing capacity.
            return ms.ToArray();
          }
        }
      }
    }
Dalton

DynamicPDF Rasterizer (NuGet Package ID: ceTe.DynamicPDF.Rasterizer.NET) converts PDFs to PNG and works on .NET Core. You can also use DynamicPDF Merger (NuGet Package ID: ceTe.DynamicPDF.CoreSuite.NET) to merge PDFs. Here is an example:

//Merging existing PDFs using DynamicPDF Merger for .NET product.
MergeDocument mergeDocument = new MergeDocument();
mergeDocument.Append(@"D:\temporary\DocumentB.pdf");
mergeDocument.Append(@"D:\temporary\DocumentC.pdf");
mergeDocument.Append(@"D:\temporary\DocumentD.pdf");
 
//Draw the merged output into byte array or save it to disk (by specifying the file path).
byte[] byteData = mergeDocument.Draw();
 
//Convert the merged PDF into PNG image format using DynamicPDF Rasterizer for .NET product.
InputPdf pdfData = new InputPdf(byteData);
PdfRasterizer rastObj = new PdfRasterizer(pdfData);
rastObj.Draw(@"C:\temp\MyImage.png", ImageFormat.Png, ImageSize.Dpi150);

More information on the output formats for Rasterizer can be found here:

http://docs.dynamicpdf.com/NET_Help_Library_19_08/DynamicPDFRasterizerProgrammingWithOutputImageFormat.html

More information on deploying DynamicPDF Merger and Rasterizer to .NET Core 2.0 can be found here:

http://docs.dynamicpdf.com/NET_Help_Library_19_08/DynamicPDFRasterizerProgrammingWithReferencingTheAssembly.html

http://docs.dynamicpdf.com/NET_Help_Library_19_08/Merger%20Referencing%20the%20Assembly%20and%20Deployment.html

DynamicPDF

Take a look at the Docotic.Pdf library. It supports .NET Core without any external dependencies or unsafe code.

Docotic's PDF-to-image renderer does not depend on GDI+ (System.Drawing). That matters for running your code reliably in an ASP.NET context or on Linux.

Merge PDF documents:

public void MergeDocuments(string firstPath, string secondPath)
{
    using (var pdf = new PdfDocument(firstPath))
    {
        pdf.Append(secondPath); // or append stream or byte array

        pdf.ReplaceDuplicateObjects(); // useful when merged files contain common objects like fonts and images

        pdf.Save("merged.pdf");
    }
}

Convert PDF page to PNG image:

using (var pdf = new PdfDocument(@"merged.pdf"))
{
    PdfDrawOptions options = PdfDrawOptions.Create();
    options.Compression = ImageCompressionOptions.CreatePng();
    options.BackgroundColor = new PdfRgbColor(255, 255, 255);
    options.HorizontalResolution = 600;
    options.VerticalResolution = 600;

    pdf.Pages[0].Save("result.png", options);
}

More samples for PDF to image conversion

You mentioned conversion of the merged PDF document to a single PNG image. PNG does not support multi-frame images (more detail). So you can only do the following:

  1. Merge all PDF document pages onto a single page
  2. Render this page as described above

Here is a sample for this case (merge two pages into one and save as PNG):

using (var other = new PdfDocument(@"merged.pdf"))
{
    using (var pdf = new PdfDocument())
    {
        PdfXObject firstXObject = pdf.CreateXObject(other.Pages[0]);
        PdfXObject secondXObject = pdf.CreateXObject(other.Pages[1]);

        PdfPage page = pdf.Pages[0];
        double halfOfPage = page.Width / 2;
        page.Canvas.DrawXObject(firstXObject, 0, 0, halfOfPage, 400, 0);
        page.Canvas.DrawXObject(secondXObject, halfOfPage, 0, halfOfPage, 400, 0);

        PdfDrawOptions options = PdfDrawOptions.Create();
        options.BackgroundColor = new PdfRgbColor(255, 255, 255);
        page.Save("result.png", options);
    }
}
Vitaliy Shibaev