40

Does anyone know if it is possible to convert an HTML page (url) to a PDF using iText?

If the answer is 'no' then that is OK as well, since I will stop wasting my time trying to work it out and just spend some money on one of a number of components which I know can :)

cottontail
Mark
  • 5
    UPDATE: iText does convert HTML to PDF, but its stylesheet support is spotty. 5.0.6 was released in Feb of 2011, and included an overhaul of the related code with little visible behavior change. The next release is slated to include significant improvements in the HTML->PDF functionality. – Mark Storer Feb 07 '11 at 17:47
  • 1
    Indeed, check out [xmlworker](https://sourceforge.net/projects/xmlworker/), an addition to iText that supports more CSS. – Redlab May 13 '11 at 00:58
  • UPDATE: Found this newer thread, which summarises the current options really well: http://stackoverflow.com/questions/4055838/best-commercial-html-to-pdf-c-component – Mark Aug 10 '11 at 04:39
  • 6
    Yet another update: [wkhtmltopdf](http://code.google.com/p/wkhtmltopdf/) uses the webkit rendering engine to lay out the (virtual) screen, then itext to convert it to a PDF – peteorpeter Aug 30 '11 at 18:58

7 Answers

30

I think this is exactly what you were looking for

http://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html

http://code.google.com/p/flying-saucer

Flying Saucer's primary purpose is to render spec-compliant XHTML and CSS 2.1 to the screen as a Swing component. Though it was originally intended for embedding markup into desktop applications (things like the iTunes Music Store), Flying Saucer has been extended to work with iText as well. This makes it very easy to render XHTML to PDFs, as well as to images and to the screen. Flying Saucer requires Java 1.4 or higher.
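
Since Flying Saucer expects XHTML, it can help to check that a page is well-formed XML before handing it over. The following is only a rough sketch using the JDK's own XML parser; `XhtmlCheck` and `isWellFormed` are hypothetical names and not part of Flying Saucer or iText. It does not deal with DTDs or named entities such as `&nbsp;`, which real pages may need.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Flying Saucer or iText.
final class XhtmlCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Self-closed <br/> is fine; a bare <br> is not well-formed XML.
        System.out.println(isWellFormed("<html><body><p>ok<br/></p></body></html>"));
        System.out.println(isWellFormed("<html><body><p>broken<br></p></body></html>"));
    }
}
```

If the check fails, the usual route is to run the page through an HTML tidying step first and only then render it.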

Per Henrik Lausten
opensas
7

I have ended up using ABCPdf from webSupergoo. It works really well, and for about $350 it has saved me hours and hours, judging by your comments above.

cottontail
Mark
4

The easiest way of doing this is using pdfHTML. It's an iText 7 add-on that converts HTML5 (+CSS3) into PDF syntax.

The code is pretty straightforward:

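    import com.itextpdf.html2pdf.HtmlConverter;
    import com.itextpdf.kernel.pdf.PdfWriter;
    import java.io.File;

    // Both calls can throw IOException, so declare or handle it in the enclosing method.
    HtmlConverter.convertToPdf(
        "<b>This text should be written in bold.</b>",       // html to be converted
        new PdfWriter(
            new File("C:/users/mark/documents/output.pdf")   // destination file
        )
    );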

To learn more, go to http://itextpdf.com/itext7/pdfHTML

Joris Schellekens
2

Use the iText library:

Here is the sample code. It is working perfectly fine:

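    import com.itextpdf.html2pdf.ConverterProperties;
    import com.itextpdf.html2pdf.HtmlConverter;
    import com.itextpdf.html2pdf.resolver.font.DefaultFontProvider;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;

    // filePath, document and url are assumed to be defined by the surrounding code.
    String htmlFilePath = filePath + ".html";
    String pdfFilePath = filePath + ".pdf";

    // Create an HTML file at the given file path.
    Writer unicodeFileWriter = new OutputStreamWriter(new FileOutputStream(htmlFilePath), "UTF-8");
    unicodeFileWriter.write(document.toString());
    unicodeFileWriter.close();

    ConverterProperties properties = new ConverterProperties();
    properties.setCharset("UTF-8");
    // For Korean, Taiwanese, Chinese and Japanese sites, register system fonts so CJK glyphs can be rendered.
    if (url.contains(".kr") || url.contains(".tw") || url.contains(".cn") || url.contains(".jp")) {
        properties.setFontProvider(new DefaultFontProvider(false, false, true));
    }

    // Convert the HTML file to a PDF file.
    HtmlConverter.convertToPdf(new File(htmlFilePath), new File(pdfFilePath), properties);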
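
One caveat about the `url.contains(...)` check: it matches anywhere in the string, so a host like `news.cnn.com` also contains `.cn`. If the intent is to look only at the top-level domain (that is my assumption, the check may be deliberately loose), a stricter variant could look roughly like this. `CjkHosts` and `hasCjkTld` are made-up names for this sketch and not part of iText.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of iText.
final class CjkHosts {

    // Top-level domains taken from the substring check in the snippet above.
    private static final List<String> CJK_TLDS = Arrays.asList("kr", "tw", "cn", "jp");

    // Returns true only if the host of the URL ends in one of the listed TLDs.
    static boolean hasCjkTld(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // not a parseable URI
        }
        if (host == null) {
            return false; // e.g. a relative reference without a host
        }
        // With no dot in the host, lastIndexOf returns -1 and the whole host is compared.
        String tld = host.substring(host.lastIndexOf('.') + 1).toLowerCase(Locale.ROOT);
        return CJK_TLDS.contains(tld);
    }

    public static void main(String[] args) {
        System.out.println(hasCjkTld("https://www.example.co.jp/page.html"));
        System.out.println(hasCjkTld("https://news.cnn.com/story.html"));
    }
}
```

The condition in the snippet would then become `if (CjkHosts.hasCjkTld(url)) { ... }`. Note that this still only guesses the page language from the domain; pages with CJK text on other domains would need the font provider as well.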

Maven dependencies

    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itext7-core</artifactId>
        <version>7.1.6</version>
        <type>pom</type>
    </dependency>

    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>html2pdf</artifactId>
        <version>2.1.3</version>
    </dependency>
cottontail
Asad Rao
1

The answer to your question is actually two-fold. First of all you need to specify what you intend to do with the rendered HTML: save it to a new PDF file, or use it within another rendering context (i.e. add it to some other document you are generating).

The former is relatively easily accomplished using the Flying Saucer framework, which can be found here: https://github.com/flyingsaucerproject/flyingsaucer

The latter is actually a much more comprehensive problem that needs to be categorized further. Using iText you won't be able to (trivially, at least) combine iText elements (e.g. Paragraph, Phrase, Chunk and so on) with the generated HTML. You can hack your way out of this by using the PdfContentByte's addTemplate method and rendering the HTML into this template.

If, on the other hand, you want to stamp the generated HTML with something like watermarks, dates or the like, you can do this using iText.

So bottom line: You can't trivially integrate the rendered HTML in other pdf generating contexts, but you can render HTML directly to a blank PDF document.

Community
Jes
  • With iText pdfHTML, there is actually a method `convertToElements` which does exactly what you claim is impossible. It renders HTML syntax to iText element blocks like Paragraph, Table, etc. – Joris Schellekens Jan 12 '18 at 14:34
-1

Use iText's HTMLWorker

Example

user979051
-2

When I needed HTML to PDF conversion earlier this year, I tried the trial of Winnovative HTML to PDF converter (I think ExpertPDF is the same product, too). It worked great, so we bought a license from that company. I didn't look into it in much depth after that.

Mark Melville