38

After 10 hours and trying 4 other HTML to PDF tools I'm about ready to explode.

wkhtmltopdf sounds like an excellent solution...the problem is that I can't execute a process with enough permissions from asp.net so...

Process.Start("wkhtmltopdf.exe","http://www.google.com google.pdf");

starts but doesn't do anything.

Is there an easy way to either:

-a) allow asp.net to start processes (that can actually do something) or
-b) compile/wrap/whatever wkhtmltopdf.exe into something I can use from C# like this: WkHtmlToPdf.Save("http://www.google.com", "google.pdf");

David Murdoch

5 Answers

26

You could also use Pechkin

.NET Wrapper for WkHtmlToPdf DLL, library that uses Webkit engine to convert HTML pages to PDF.

Nuget packages:

Pechkin.Synchronized

Pechkin

Răzvan Flavius Panda
  • Marking this as accepted after all these years because I just had to use wkhtmltopdf again on a new project and Pechkin worked perfectly! – David Murdoch Apr 25 '14 at 15:27
  • I am having a problem with Pechkin, Codaxy, and even WkhtmlXSharp: none of them display Thai (UTF-8) or other Unicode fonts properly, while iTextSharp and the exe don't have that problem – WickStargazer Jul 24 '14 at 08:13
  • @DavidMurdoch are you having any file locking errors when deploying new code? if i understand it correctly Pechkin uses a native dll (WkHtmlToPdf) and that native dll might not get unloaded as your managed dlls when you upload new files? – Peter Dec 02 '14 at 15:49
23

I just started a new project to provide a C# P/Invoke wrapper around wkhtmltopdf.

You can check out my code at: https://github.com/pruiz/WkHtmlToXSharp

Greets.

Pablo Ruiz García
  • wow, that looks pretty slick. Thanks for sharing! – David Murdoch Jan 30 '11 at 06:13
  • I get errors when I use MultiplexingConverter such "Attempted to read or write protected memory. This is often an indication that other memory is corrupt." Is there a way to prevent this error? – Brennan May 10 '11 at 19:52
  • I don't get my images embedded in the pdf. Nothing shows in where image should display. I checked if it occurred due to relative url of the image and converted it into absolute ones. Still no success. What would work? – Sangam Uprety Jun 08 '11 at 08:11
  • I'm having the same problem as Brennan. It works fine the first few tries but then crashes and I have to restart the webserver to get it working again. Any suggestions anyone? – Daniel P Sep 19 '11 at 12:11
  • They are working on a workaround ... http://code.google.com/p/wkhtmltopdf/issues/detail?id=511&start=100 – BeardinaSuit Nov 01 '11 at 00:50
  • I'm also facing the problem - Attempted to read or write protected memory. This is often an indication that other memory is corrupt. - Has this been fixed? Really need help! – Żubrówka Apr 13 '12 at 15:51
  • I really like this solution, but ... nearly 44MB after compiling? Any ideas on decreasing the size? – 321X Aug 06 '12 at 22:27
  • @Żubrówka I have the same problem. Has anyone figured out how to fix this issue? – crush Jun 11 '15 at 13:06
  • @crush i ended up just writing a simple wrapper to wkhtmltopdf.exe. It works pretty well and the solution is stable. Depending on your scenario, it might work quite well for you also :) – Żubrówka Jul 03 '15 at 07:27
18

Thanks to Paul, I found a good wrapper written by Codaxy, which can also be easily downloaded via NuGet.

After a few trials, I arrived at this MVC action, which creates the PDF on the fly and returns it as a stream:

public ActionResult Pdf(string url, string filename)
{
    MemoryStream memory = new MemoryStream();
    PdfDocument document = new PdfDocument() { Url = url };
    PdfOutput output = new PdfOutput() { OutputStream = memory };

    PdfConvert.ConvertHtmlToPdf(document, output);
    memory.Position = 0;

    return File(memory, "application/pdf", Server.UrlEncode(filename));
}

Here, the Pdf* classes are implemented in the wrapper, which has nice, clean code but unfortunately lacks documentation.

Within the converter, the URL will be converted to a PDF, stored in a temporary file, copied to the stream that we have given as parameter, and afterwards the PDF file is deleted.

Finally, we have to push the stream as FileStreamResult.

Do not forget to set the output stream's Position to zero, otherwise you will see PDF files being downloaded as zero bytes of size.
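The zero-byte symptom can be reproduced without any PDF library. The following is a minimal sketch, not part of the Codaxy wrapper: `StreamPositionDemo`, `FilledStream` and `BytesReadable` are hypothetical names, and four placeholder bytes stand in for real PDF content. It only illustrates that a `MemoryStream` is read from its current `Position`, which sits at the end right after writing.

```csharp
using System;
using System.IO;

// Hypothetical helper class, for illustration only.
public static class StreamPositionDemo
{
    // Returns a stream that has just been written to, the way the converter leaves it.
    public static MemoryStream FilledStream()
    {
        var memory = new MemoryStream();
        byte[] placeholder = { 0x25, 0x50, 0x44, 0x46 }; // the ASCII bytes of "%PDF"
        memory.Write(placeholder, 0, placeholder.Length);
        return memory; // Position is now at the end of the data
    }

    // Counts how many bytes a consumer gets when reading from the current Position to the end.
    public static long BytesReadable(Stream stream)
    {
        using (var sink = new MemoryStream())
        {
            stream.CopyTo(sink);
            return sink.Length;
        }
    }
}
```

Calling `BytesReadable` on the stream straight after `FilledStream()` finds nothing left to read; after setting `Position = 0` the same call sees all four bytes, which is why the action resets the position before handing the stream to `File(...)`.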

Community
Bolt Thunder
  • To output PDF stream via MVC ActionResult on web browser directly, @endy-tjahjono demonstrated a good approach via [here](http://stackoverflow.com/questions/6168846/open-pdf-result-in-browser-tab-with-mvc-3) – Ken Pega Oct 11 '12 at 01:39
  • It's working nicely but hanging when I try to use a header (which works normally by using the command line tool directly). – marquito Sep 12 '13 at 17:53
  • @marquito: Do you mean "header" tag in HTML5? I do not have any experience, but did you try replacing it with a good old friend "div"? – Bolt Thunder Sep 15 '13 at 21:34
  • @BoltThunder I mean "header" as in using the --header-html modifier. When the path is wrong and it can't retrieve a header, it just hangs (instead of inserting a blank header, for instance) – marquito Sep 16 '13 at 03:04
  • The line "return File(memory, "application/pdf", Server.UrlEncode(filename));" is giving me "System.IO.File is a type but it is being used like a variable." – Hugh Seagraves Mar 29 '16 at 23:26
  • @HughSeagraves: here File(...) stands for "File" method in the Controller class, not the System.IO.File class. – Bolt Thunder Mar 31 '16 at 22:45
  • how can we make all margins=zero using this library – alamnaryab Aug 19 '21 at 11:10
4

Here is the actual code I used. Please feel free to edit this to get rid of some of the smells and other terribleness...I know it's not that great.

using System;
using System.Diagnostics;
using System.IO;
using System.Web;
using System.Web.UI;

public partial class utilities_getPDF : Page
{
    protected void Page_Load(Object sender, EventArgs e)
    {
        string fileName = WKHtmlToPdf(myURL);

        if (!string.IsNullOrEmpty(fileName))
        {
            string file = Server.MapPath("~\\utilities\\GeneratedPDFs\\" + fileName);
            if (File.Exists(file))
            {
                // copy the stream (thanks to http://stackoverflow.com/questions/230128/best-way-to-copy-between-two-stream-instances-c)
                // "using" closes the file even if the copy throws, so the delete below can succeed
                using (var openFile = File.OpenRead(file))
                {
                    byte[] buffer = new byte[32768];
                    int read;
                    while ((read = openFile.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        Response.OutputStream.Write(buffer, 0, read);
                    }
                }

                File.Delete(file);
            }
        }
    }

    public string WKHtmlToPdf(string Url)
    {
        var p = new Process();

        string switches = "";
        switches += "--print-media-type ";
        switches += "--margin-top 10mm --margin-bottom 10mm --margin-right 10mm --margin-left 10mm ";
        switches += "--page-size Letter ";
        // waits for a javascript redirect if there is one
        switches += "--redirect-delay 100";

        // Utils.GenerateGloballyUniqueFileName takes the extension from the name passed in;
        // basically it returns a filename with a GUID prepended (and checks for some other stuff too)
        string fileName = Utils.GenerateGloballyUniqueFileName("pdf.pdf");

        var startInfo = new ProcessStartInfo
                        {
                            FileName = Server.MapPath("~\\utilities\\PDF\\wkhtmltopdf.exe"),
                            Arguments = switches + " " + Url + " \"" +
                                        "../GeneratedPDFs/" + fileName
                                        + "\"",
                            UseShellExecute = false, // needs to be false in order to redirect output
                            RedirectStandardOutput = true,
                            RedirectStandardError = true,
                            RedirectStandardInput = true, // redirect all 3, as it should be all 3 or none
                            WorkingDirectory = Server.MapPath("~\\utilities\\PDF")
                        };
        p.StartInfo = startInfo;
        p.Start();

        // doesn't work correctly...
        // read the output here...
        // string output = p.StandardOutput.ReadToEnd();

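        // wait up to 60 seconds for exit (as after exit, it can't read the output)
        if (!p.WaitForExit(60000))
        {
            // still running: ExitCode would throw here, so kill the process and report failure
            p.Kill();
            p.Close();
            return null;
        }

        // read the exit code, close process
        int returnCode = p.ExitCode;
        p.Close();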

        // if 0, it worked
        return (returnCode == 0) ? fileName : null;
    }
}
David Murdoch
  • It isn't "if 0 or 2, it worked". It only works when 0. Other values: 1: (or 8?) Generic failure code value of EXIT_ERROR. 2: Error 404, not found (and empty PDF). 3: Error 401, unauthorized. As per unix specification, any value other than a 0 returned by a process signals some sort of error. This is true for windows programs as well. – Christian May 18 '10 at 19:23
  • thanks, +1 for the correction. The code (and comments) are originally from http://stackoverflow.com/questions/1331926/asp-net-calling-exe/1698839#1698839 – David Murdoch May 18 '10 at 19:32
  • I've tried both 0.99 & the newer RC and neither one of them seems to support --redirect-delay - any ideas? What version were you using? It's driving me nuts! It works perfectly aside from not waiting for my ajax calls to load. – Ian Robinson Dec 22 '10 at 22:06
  • I'm not too sure if the version I used actually did anything with the redirect-delay switch (I copied the code from someone else). If wkhtmltopdf is able to run javascript/XHR make sure there are no JS errors. Also, try to make sure the code runs *before* `window.onload`. – David Murdoch Dec 23 '10 at 20:41
  • And lastly you could try doing something like this: (Have a look at: http://code.google.com/p/wkhtmltopdf/issues/detail?id=315) ` --run-script (function(){var d = +new Date() + 10000; while(new Date() < d){};}())` in order to force wkhtmltopdf to wait 10 seconds before capturing the page (as long as it doesn't have a long-running-script timeout). – David Murdoch Dec 23 '10 at 20:43
0

I can't comment, so I'm posting this as an 'answer' to the comments on the answer above (How to use wkhtmltopdf.exe in ASP.net).

If --redirect-delay doesn't work, try --javascript-delay. See here for all the options: https://github.com/antialize/wkhtmltopdf/blob/master/README_WKHTMLTOPDF

Or run wkhtmltopdf -H for extended help (afaik the same output as the link above).
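For anyone wiring this into C#, here is a rough sketch of passing --javascript-delay through a small process wrapper. It is an illustration under assumptions rather than a tested production helper: `WkHtmlToPdfRunner`, `BuildArguments` and `Save` are made-up names, the path to wkhtmltopdf.exe is supplied by the caller, and only exit code 0 is treated as success, as pointed out in the comments above.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical wrapper around wkhtmltopdf.exe; the names are illustrative only.
public static class WkHtmlToPdfRunner
{
    // Builds the command line: the delay switch first, then the quoted URL and output path.
    public static string BuildArguments(string url, string outputPath, int javascriptDelayMs)
    {
        var sb = new StringBuilder();
        sb.Append("--javascript-delay ").Append(javascriptDelayMs).Append(' ');
        sb.Append('"').Append(url).Append("\" ");
        sb.Append('"').Append(outputPath).Append('"');
        return sb.ToString();
    }

    // Runs the exe and returns true only if it exited with code 0 within the timeout.
    public static bool Save(string exePath, string url, string outputPath,
                            int javascriptDelayMs, int timeoutMs, out string errorOutput)
    {
        var errors = new StringBuilder();
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = BuildArguments(url, outputPath, javascriptDelayMs),
            UseShellExecute = false,       // required in order to redirect output
            RedirectStandardError = true,  // stderr is drained below so the process cannot block on it
            CreateNoWindow = true
        };

        using (var p = new Process { StartInfo = startInfo })
        {
            // Collect stderr asynchronously so a full pipe buffer cannot stall the child process.
            p.ErrorDataReceived += (sender, e) => { if (e.Data != null) errors.AppendLine(e.Data); };
            p.Start();
            p.BeginErrorReadLine();

            bool exited = p.WaitForExit(timeoutMs);
            if (!exited)
            {
                p.Kill(); // timed out: stop the process instead of leaving it running
            }
            p.WaitForExit(); // lets the asynchronous stderr handler finish

            errorOutput = errors.ToString();
            return exited && p.ExitCode == 0;
        }
    }
}
```

With that in place, something like `WkHtmlToPdfRunner.Save(exePath, "http://www.google.com", "google.pdf", 200, 60000, out errors)` is close to the `WkHtmlToPdf.Save(...)` call asked for in the question. Whether the worker process is allowed to launch the exe and write to the output folder still depends on the permissions of the account the site runs under.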

Community
alwin