26

I am a PHP developer and in one of my projects, I need to convert some HTML documents (about 30 to 50 pages) into PDF documents.

My search has turned up the following possible solutions. Among them are some PHP libraries and some command line applications. Each has its own advantages and disadvantages.

PHP libraries:

  1. fpdf (needs more effort to convert)
  2. tcpdf (needs more effort to convert)
  3. html2fpdf http://html2fpdf.sourceforge.net
  4. html2pdf http://html2pdf.fr/
  5. dompdf http://code.google.com/p/dompdf/ (works well compared to the others)

For each library, I have problems like:

  1. Takes a long time (more than five minutes to convert 30 HTML pages)
  2. Requires too many resources (memory and time)

    (I set the following parameters in php.ini:

    max_execution_time = 600
    memory_limit = 250M

    but things still don't work.)

  3. Needs HTML pages to be well-formatted (e.g. no missing close tags)

All of these work when I convert simple HTML documents (five or fewer pages with little CSS).

Command line applications:

All the command line apps work perfectly and far faster than the above libraries, but only when I run them directly on the console. When I call them from PHP with exec() or system(), they give me errors.

The following are the command line applications and their errors when I run them in PHP:

  1. html2pdf (http://www.tufat.com/s_html2ps_html2pdf.htm)

    (html2pdf:11380): Gtk-WARNING **: cannot open display: :0.0
    No protocol specified

  2. wkhtmltopdf

    Loading page: 10%
    Loading page: 33%
    Loading page: 100%
    Waiting for redirect
    Outputting pages
    QPainter::begin(): Returned false
    QPainter::begin(): Returned false
    QPainter::save: Painter not active
    QPainter::scale: Painter not active
    QPainter::setRenderHint: Painter must be active to set rendering hints
    QPainter::setBrush: Painter not active
    QPainter::pen: Painter not active
    QPainter::setPen: Painter not active

  3. htmltopdf (http://www.ultrashareware.com/html-to-pdf.htm)

So now I am looking for help. Can anyone answer:

Which PHP library would work well in my case?

Why do these errors occur in command line applications?

Santosh S
  • The error "Gtk-WARNING **: cannot open display: :0.0" is because the app uses the windowing system. I would guess that the error occurs because the app tries to open the PDF after its generation? – rogeriopvl Sep 10 '09 at 07:36
  • No, it does not open the PDF after generation. But it does open a small window when I run it from the console. – Santosh S Sep 10 '09 at 10:12
  • Because there are so many questions similar to this one but not quite the same, I decided to try to collect a complete list of HTML to PDF converters into a community wiki question http://stackoverflow.com/questions/3178448/list-of-html-to-pdf-converters – rjmunro Jul 05 '10 at 12:55
  • Off-topic on SO, but https://softwarerecs.stackexchange.com/q/45903/1834 – Martin Thoma Sep 21 '17 at 15:05

8 Answers

8

Regarding wkhtmltopdf:

  • This thing is blazingly fast and can handle all kinds of HTML/CSS you throw at it, so when you need speed, you should seriously consider it. We switched to it recently at our company and our PDF serving got an enormous speed boost.

  • At least under Linux it needs XOrg libraries to be installed - servers usually don't have them, so that might be your problem.
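
If the missing X display is indeed the cause, one common workaround is to run the tool under a virtual display. The following is only a sketch under that assumption: it presumes the `xvfb-run` wrapper and `wkhtmltopdf` are both installed and on the web server user's PATH, and the function names `buildWkhtmltopdfCommand` and `htmlToPdf` are hypothetical helpers, not part of any library.

```php
<?php
// Hypothetical helper: builds a shell command that runs wkhtmltopdf under
// xvfb-run, so the tool gets a virtual X display on a headless server.
function buildWkhtmltopdfCommand(string $input, string $output, bool $useXvfb = true): string
{
    // escapeshellarg() protects against spaces and shell metacharacters.
    $cmd = 'wkhtmltopdf ' . escapeshellarg($input) . ' ' . escapeshellarg($output);
    if ($useXvfb) {
        // -a lets xvfb-run pick a free display number automatically.
        $cmd = 'xvfb-run -a ' . $cmd;
    }
    // Merge stderr into stdout so exec() also captures the Qt/Gtk messages.
    return $cmd . ' 2>&1';
}

// Hypothetical helper: runs the conversion and logs the tool's output on failure.
function htmlToPdf(string $input, string $output): bool
{
    exec(buildWkhtmltopdfCommand($input, $output), $lines, $exitCode);
    if ($exitCode !== 0) {
        error_log("wkhtmltopdf failed ($exitCode): " . implode("\n", $lines));
        return false;
    }
    return is_file($output);
}
```

A call such as `htmlToPdf('/var/www/docs/page.html', '/tmp/page.pdf')` (paths are placeholders) would then return `false` and write the captured messages to the PHP error log if the conversion fails.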

Rene Saarsoo
  • It fails badly with multipage tables – andho Aug 21 '11 at 09:21
  • No it doesn't. You just have to handle this kind of problem with CSS: http://stackoverflow.com/questions/1763639/how-to-deal-with-page-breaks-when-printing-a-large-html-table – Carlos2W Mar 11 '16 at 21:33
3

Try this:

FDisk
2

Have you tried Prince?

1

There are many solutions for converting HTML to PDF; I can suggest the one from https://grabz.it.

They have a flexible PHP API that can be used from cron jobs or directly from a PHP web page.

If you want to try it, first get an app key and secret for authorization, along with the free development SDK.

Here is an example of a basic implementation.

// Load the GrabzIt client library
include("GrabzItClient.class.php");

// Create the GrabzItClient instance.
// Replace "Application Key" and "Application Secret" with the values for your account!
$grabzIt = new GrabzItClient("Application Key", "Application Secret");

// Request a PDF capture of the URL
$grabzIt->URLToPDF("http://www.google.com");

// If a public callback handler is available, save via the handler:
$grabzIt->Save("http://www.example.com/handler.php");
// OR, if no public callback handler is available, use the synchronous
// SaveTo method, which forces your application to wait
// while the PDF is created:
$filepath = "images/result.pdf";
$grabzIt->SaveTo($filepath);

Other kinds of captures, such as image screenshots, are also possible.

Johnny
1

Try HTMLDOC, a command-line tool: https://www.msweet.org/htmldoc/index.html

xmedeko
0

What about using an online service and sending your HTML content over HTTP? Of course, most of them are not free.

  • Can you suggest any online services and their URLs? – Santosh S Sep 10 '09 at 07:44
  • http://www.freepdfconvert.com is free, as the name suggests. Automating its use may not be the easiest thing to do, on the other hand, but it can take either an uploaded file or a URL. – Julian Sep 10 '09 at 14:43
  • And this could take a while for a big set of PDFs. –  Sep 11 '09 at 09:10
  • freepdfconvert.com has no support for Flash files or JavaScript, and the generated PDF does not look exactly like the website. For example, I tried pazintys.com – FDisk Sep 30 '09 at 06:33
0

One possibility: having the script automatically:

  1. Take the web page
  2. Open that page in a web browser
  3. Take a screencap of that page
  4. Turn it into a PDF

Step 4 is easy: there are plenty of PHP/command-line libraries that let you put images onto a PDF or convert them (e.g. fpdf).

For steps 1-3, you could try looking at the code from here: http://browsershots.org/. Not sure if it would be relevant, since it seems to require a lot of setup. Maybe their architecture could work?

poundifdef
  • But what about links or anchor tags in the HTML pages? – Santosh S Sep 18 '09 at 05:46
  • 3
    That is a terrible solution. It will turn all the text into bitmap graphics. It will use screen css instead of print css. It will only show as much of the page as can fit in a screen capture. There are plenty of ways to do it better. Please don't do this! – rjmunro Jul 05 '10 at 10:14
0

A couple of questions and suggestions:

  • Do you really need it converted to PDF? Why? In some cases, it would be better to stick with HTML.
  • Is upgrading the hardware of the server that generates the PDF an option? I ask because if all the libraries you've tried take too long, then your only option might be upgrading the server.
  • You might want to solve the problem with the command line error. If that approach gives the fastest results, then find a workaround for it.
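
To chase down the command line error, it helps to see the tool's exit code and error output separately from PHP, since the web server user often runs with a different environment (for example, no DISPLAY and a different PATH) than an interactive shell. Below is a minimal sketch using PHP's `proc_open`; the function name `runCommand` and the returned array keys are hypothetical choices for illustration.

```php
<?php
// Hypothetical helper: runs a shell command and returns its exit code,
// standard output and standard error as separate values.
function runCommand(string $command): array
{
    // stderr goes to a temp file so a chatty tool cannot block on a full pipe
    // while we are still reading stdout.
    $errFile = tempnam(sys_get_temp_dir(), 'err');
    $descriptors = [
        1 => ['pipe', 'w'],          // stdout: read through a pipe
        2 => ['file', $errFile, 'w'] // stderr: written to the temp file
    ];

    $process = proc_open($command, $descriptors, $pipes);
    if ($process === false) {
        unlink($errFile);
        throw new RuntimeException("Could not start: $command");
    }

    $stdout = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    $exitCode = proc_close($process); // waits for the process to finish

    $stderr = file_get_contents($errFile);
    unlink($errFile);

    return ['exit' => $exitCode, 'stdout' => $stdout, 'stderr' => $stderr];
}
```

Logging the `stderr` and `exit` values for the failing converter, and comparing them with what the same command prints on the console, should narrow down whether the difference is the display, the PATH, or file permissions.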
Randell
  • Do you know of any command line apps for this apart from those mentioned in the question? – Santosh S Sep 18 '09 at 05:47
  • For PHP, I've only used dompdf, and I only have to print an average of 3 pages per call. The only other PDF generator I've used is JasperReports, but I think it's only for Java. Maybe you could post the entire stack trace of the error you're getting from the command line. – Randell Sep 18 '09 at 13:04