7

I have some HTML content (including formatting tags such as strong, images, etc.). In my Java code, I want to convert this HTML content into a PDF document without losing the HTML formatting.

Is there any way to do it in Java (using iText or any other library)?

lealceldeiro
Veera
  • 1
    possible duplicate of [Using itext to convert HTML to PDF](http://stackoverflow.com/questions/235851/using-itext-to-convert-html-to-pdf) – dogbane Jan 17 '11 at 11:35

3 Answers

8

I used ITextRenderer from the Flying Saucer project.

Here is a short, self-contained, working example. In my case I wanted to later stream the bytes into an email attachment.

So in the example I write the bytes to a file, purely for demonstration. This is Java 8.

import com.lowagie.text.DocumentException;
import org.apache.commons.io.FileUtils;
import org.xhtmlrenderer.pdf.ITextRenderer;

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;

public class So4712641 {

  public static void main(String... args) throws DocumentException, IOException {
    FileUtils.writeByteArrayToFile(new File("So4712641.pdf"), toPdf("<b>You gotta walk and don't look back</b>"));
  }

  /**
   * Generate a PDF document
   * @param html HTML as a string
   * @return bytes of PDF document
   */
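  private static byte[] toPdf(String html) throws DocumentException, IOException {
    final ITextRenderer renderer = new ITextRenderer();
    // Parse the markup into a document.
    renderer.setDocumentFromString(html);
    // Lay out the document before writing the PDF.
    renderer.layout();
    // Render into memory so the bytes can be reused (e.g. as an email attachment).
    try (ByteArrayOutputStream out = new ByteArrayOutputStream(html.length())) {
      renderer.createPDF(out);
      return out.toByteArray();
    }
  }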
}

This gives me

[Image: screenshot of the generated PDF]

For completeness, here are the relevant pieces of my Maven pom.xml:

<dependencies>
    <dependency>
        <groupId>org.xhtmlrenderer</groupId>
        <artifactId>flying-saucer-pdf</artifactId>
        <version>9.0.8</version>
    </dependency>
    <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.4</version>
    </dependency>
</dependencies>
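
One caveat with `setDocumentFromString`: Flying Saucer parses its input as XML, so typical "tag soup" HTML (an unclosed `<br>`, for instance) is rejected. The following is a minimal sketch of a pre-check, using only the JDK's XML parser. The class and method names (`WellFormedCheck`, `isWellFormed`) are made up for illustration and are not part of Flying Saucer.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Flying Saucer.
class WellFormedCheck {

  /** Returns true if the markup parses as well-formed XML. */
  static boolean isWellFormed(String markup) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // DefaultHandler rethrows fatal errors instead of printing them to stderr.
      builder.setErrorHandler(new DefaultHandler());
      builder.parse(new InputSource(new StringReader(markup)));
      return true;
    } catch (SAXException | IOException | ParserConfigurationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isWellFormed("<p>line one<br/>line two</p>")); // closed <br/>
    System.out.println(isWellFormed("<p>line one<br>line two</p>"));  // unclosed <br>
  }
}
```

If the check fails, the HTML needs to be cleaned up into XHTML before it is handed to the renderer.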
Kirby
  • 1
    You saved my day! Thank you for your contribution!! :) – shirkkan May 24 '16 at 11:07
  • @Kirby is there a licensing issue? I read on another thread regarding the "commercial usage" of iText – Muhammad Nayab Apr 02 '19 at 08:55
  • It seems that Flying Saucer supports both the older iText v.2 (open source under LGPL) and iText v.5 (under AGPL, which requires users to provide their source code; otherwise a commercial iText license can be used). There is a newer project based on Flying Saucer that uses PDFBox as its PDF library rather than iText. It's **[openhtmltopdf](https://github.com/danfickle/openhtmltopdf)** – Fenix Nov 18 '19 at 04:34
0

Converting HTML to PDF isn't exactly straightforward in general, but if you're in control of what goes into the HTML, you can try using an XSL-FO implementation, like Apache FOP.

It's not out-of-the-box as you'll have to write (or find) a stylesheet that defines the conversion rules, but on the upside it gives you much more control over output formatting, which is quite useful as what looks good on screen doesn't necessarily look good on paper.
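
To give a rough idea of what such a stylesheet looks like, here is a minimal sketch that turns a well-formed XHTML fragment into XSL-FO using only the JDK's built-in XSLT support. The class name `HtmlToFoSketch` and the tiny stylesheet are invented for illustration. It handles only `<p>`, `<b>` and `<strong>`, and the final step of rendering the FO document to PDF with FOP is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; the stylesheet is deliberately tiny.
class HtmlToFoSketch {

  static final String STYLESHEET = String.join("\n",
      "<xsl:stylesheet version=\"1.0\"",
      "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"",
      "    xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">",
      "  <xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>",
      // Page setup: a single A4 page master with one body region.
      "  <xsl:template match=\"/\">",
      "    <fo:root>",
      "      <fo:layout-master-set>",
      "        <fo:simple-page-master master-name=\"page\"",
      "            page-height=\"297mm\" page-width=\"210mm\" margin=\"20mm\">",
      "          <fo:region-body/>",
      "        </fo:simple-page-master>",
      "      </fo:layout-master-set>",
      "      <fo:page-sequence master-reference=\"page\">",
      "        <fo:flow flow-name=\"xsl-region-body\">",
      "          <xsl:apply-templates/>",
      "        </fo:flow>",
      "      </fo:page-sequence>",
      "    </fo:root>",
      "  </xsl:template>",
      // Conversion rules: paragraphs become blocks, bold becomes a bold inline.
      "  <xsl:template match=\"p\">",
      "    <fo:block space-after=\"6pt\"><xsl:apply-templates/></fo:block>",
      "  </xsl:template>",
      "  <xsl:template match=\"b | strong\">",
      "    <fo:inline font-weight=\"bold\"><xsl:apply-templates/></fo:inline>",
      "  </xsl:template>",
      "</xsl:stylesheet>");

  /** Applies the stylesheet to a well-formed XHTML string and returns XSL-FO. */
  static String toFo(String xhtml) {
    try {
      Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
      StringWriter out = new StringWriter();
      transformer.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      throw new IllegalArgumentException("Could not transform input", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(toFo("<p>Plain and <b>bold</b> text</p>"));
  }
}
```

Every HTML element you want to support needs its own rule like the two at the end, which is where both the extra work and the extra control over print output come from.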

biziclop
  • I've heard terrible things about FOP ;) But XSL scares people anyway. – Jules Jan 17 '11 at 11:43
  • 2
    FOP isn't great (understatement of the year:)), but it all depends on what you want to use it for. If it's just some simple one or two page document for your users to download, FOP is okay. If you want to produce print quality, don't even consider it, you're better off buying a XEP license. – biziclop Jan 17 '11 at 11:47
0

I would try DocRaptor.com. It converts HTML to PDF or HTML to XLS in any language, and since it uses Prince XML (without making you pay the expensive license fee), the quality is a lot better than the other options out there. It's also a web app, so there's nothing to download. It's an easy way to avoid long, frustrating coding.

Here are some examples: https://docraptor.com/documentation#coding_examples

illbzo1
Nate365