13

I have to offer some export functions to my website such as CSV or PDF. Is there a powerful and free tool for Java to convert HTML pages to PDF format?

nimrod
  • 1
    Possible duplicate: http://stackoverflow.com/questions/633780/converting-html-files-to-pdf – David May 08 '12 at 06:46

2 Answers

31

Using the Flying Saucer API together with iText, you can convert HTML content to PDF.
The following examples should help you understand the basics of converting XHTML to PDF.

Examples using the Flying Saucer API:
You need the following libraries:

  • core-renderer.jar
  • iText-2.0.8.jar

You can find these resources in flyingsaucer-R8.zip.

Example 1: Using XMLResource:

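// if you already have the XHTML source as a string, build the Document object from it
// (encode explicitly; UTF-8 is what an XML parser assumes when no other encoding is declared)
Document document = XMLResource.load( new ByteArrayInputStream( yourXhtmlContentAsString.getBytes( java.nio.charset.StandardCharsets.UTF_8 ) ) ).getDocument();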

ITextRenderer renderer = new ITextRenderer();
renderer.setDocument( document, null );

renderer.layout();

String fileNameWithPath = outputFileFolder + "PDF-XhtmlRendered.pdf";
FileOutputStream fos = new FileOutputStream( fileNameWithPath );
renderer.createPDF( fos );
fos.close();
System.out.println( "File 1: '" + fileNameWithPath + "' created." );
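
Flying Saucer expects well-formed XHTML, so it can help to run the markup through the JDK's own XML parser before rendering. The following is only a sketch under assumptions: the class name `XhtmlDomCheck` and the method `parseXhtml` are hypothetical and not part of Flying Saucer, and the sample markup has no DOCTYPE (with one, the parser may try to fetch the DTD). The result is an `org.w3c.dom.Document`, the type that `renderer.setDocument( document, null )` receives in Example 1.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, JDK classes only.
final class XhtmlDomCheck {

    // Parses the markup with the JDK XML parser.
    // Throws an exception (SAXException) if the markup is not well-formed.
    static Document parseXhtml(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Encode explicitly so the bytes do not depend on the platform default charset.
        byte[] bytes = xhtml.getBytes(StandardCharsets.UTF_8);
        return builder.parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><head><title>Report</title></head>"
                + "<body><p>Hello PDF</p></body></html>";
        Document document = parseXhtml(xhtml);
        // Print the root element name to show that parsing succeeded.
        System.out.println(document.getDocumentElement().getTagName());
    }
}
```

If parsing fails here, the same markup is unlikely to render, so the usual fix is to close unclosed tags before handing the content to the renderer.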

Example 2: Passing the XHTML string directly to the renderer:

ITextRenderer renderer = new ITextRenderer();

// if you already have the XHTML source as a string, pass it to the renderer directly
renderer.setDocumentFromString( yourXhtmlContentAsString );
renderer.layout();

String fileNameWithPath = outputFileFolder + "PDF-FromHtmlString.pdf";
FileOutputStream fos = new FileOutputStream( fileNameWithPath );
renderer.createPDF( fos );
fos.close();

System.out.println( "File 2: '" + fileNameWithPath + "' created." );

Examples using the iText API:
You need the following libraries:

  • core-renderer.jar
  • itextpdf-5.2.1.jar

You can find these resources here.

Example 3: Using HTMLWorker:

com.itextpdf.text.Document document =
        new com.itextpdf.text.Document( com.itextpdf.text.PageSize.A4 );
String fileNameWithPath = outputFileFolder + "PDF-HtmlWorkerParsed.pdf";
FileOutputStream fos = new FileOutputStream( fileNameWithPath );
com.itextpdf.text.pdf.PdfWriter pdfWriter =
        com.itextpdf.text.pdf.PdfWriter.getInstance( document, fos );

document.open();

//**********************************************************
// if required, you can add document metadata
document.addAuthor( "Ravinder" );
//document.addCreator( creator );
document.addSubject( "HtmlWorker Parsed Pdf from iText" );
document.addCreationDate();
document.addTitle( "HtmlWorker Parsed Pdf from iText" );
//**********************************************************/

com.itextpdf.text.html.simpleparser.HTMLWorker htmlWorker =
        new com.itextpdf.text.html.simpleparser.HTMLWorker( document );
// sb is a StringBuffer holding the XHTML/HTML content to convert
htmlWorker.parse( new StringReader( sb.toString() ) );

document.close();
fos.close();

System.out.println( "File 3: '" + fileNameWithPath + "' created." );
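
Example 3 reads its markup from `sb`, which the snippet itself never declares. One way to fill it from an HTML file on disk is sketched below. The class name `HtmlFileLoader`, the method `readHtml`, and the choice of UTF-8 are assumptions made for illustration; use the encoding your file was actually saved in.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, JDK classes only.
final class HtmlFileLoader {

    // Reads the whole file as UTF-8 text (an assumption) into a StringBuffer.
    static StringBuffer readHtml(Path htmlFile) throws IOException {
        byte[] bytes = Files.readAllBytes(htmlFile);
        return new StringBuffer(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Write a small sample file so the sketch runs on its own.
        Path sample = Files.createTempFile("sample", ".html");
        Files.write(sample,
                "<html><body><p>Hello</p></body></html>".getBytes(StandardCharsets.UTF_8));
        try {
            StringBuffer sb = readHtml(sample);
            System.out.println(sb);
        } finally {
            Files.delete(sample);
        }
    }
}
```

The returned buffer can then be handed on as `new StringReader( sb.toString() )`, exactly as Example 3 does.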
Ravinder Reddy
  • Sorry for the late reply. `htmlWorker.parse( new StringReader( sb.toString() ) );` I'm getting JSP output; how do I convert that to HTML? I think `sb` is the XHTML/HTML content. Is that right? – Harry Jun 04 '12 at 18:42
  • Arabic letters are not shown with your code. Is there a way to fix that? – utarid Jul 26 '14 at 19:02
  • @user4757345: Did you check that the `'html'` content type is set, for example `"text/html; charset=utf-8"`? It is the HTML instruction that tells the parser which encoding to use while building the PDF page. – Ravinder Reddy Jul 27 '14 at 04:33
  • I added "" but nothing changed. You can see my HTML at http://stackoverflow.com/questions/24976168/converting-html-with-arabic-letters-to-pdf. Thank you. – utarid Jul 27 '14 at 09:09
  • 2
    @Ravinder I wish I could give you +100. I spent so much time dealing with this, all different messy libraries I have used etc. Thanks so much! – Ondrej Tokar Apr 29 '15 at 14:15
  • What if I do not have the HTML source in a string? Is there a way to give the path of my HTML file and have it converted to PDF? – Half Blood Prince May 20 '16 at 10:32
  • `sb.toString()` - what is `sb`? – Half Blood Prince May 20 '16 at 10:35
  • Yes... I forgot to add its source variable. `sb` is the `StringBuffer` representation of the `xhtml` or `html` content in context. – Ravinder Reddy May 20 '16 at 11:06
  • And you can read the `html` content from a file and store it in a `StringBuffer`. – Ravinder Reddy May 20 '16 at 11:08
  • Sorry, but I am using a JSP page with elements added dynamically. What should I do? I need a print button that gives me a snapshot of that page as a PDF. – Half Blood Prince May 20 '16 at 11:43
  • Is there a way to create the PDF in landscape orientation? – Samarland Jan 04 '17 at 21:49
  • @Samarland: You can read and follow the [answer](http://stackoverflow.com/a/14600433/767881) given by [Bruno Lowagie](http://stackoverflow.com/users/1622493/bruno-lowagie), the original developer of iText. The comments on that answer should also be helpful. – Ravinder Reddy Jan 06 '17 at 06:24
-2

You can use the JTidy framework for this. It converts HTML to XHTML, which is then translated to XSL-FO. The resulting XSL-FO document can be used to generate the PDF.

http://www.javaworld.com/javaworld/jw-04-2006/jw-0410-html.html
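
The XHTML to XSL-FO step of such a pipeline is an XSLT transformation, which the JDK can run without extra libraries. The sketch below only illustrates that step. The stylesheet is a toy written for this example (it handles `p` elements only, assumes the XHTML carries no namespace, and is not the stylesheet from the linked article), and the names `XhtmlToFoSketch` and `toXslFo` are hypothetical. Turning the XSL-FO output into a PDF still needs an FO processor such as Apache FOP, which is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, JDK classes only.
final class XhtmlToFoSketch {

    // Toy stylesheet: one A4 page master, every <p> becomes an fo:block.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + " page-height='297mm' page-width='210mm' margin='20mm'>"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='//p'/>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "<xsl:template match='p'>"
        + "<fo:block><xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the toy stylesheet to well-formed XHTML and returns the XSL-FO text.
    static String toXslFo(String xhtml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xhtml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXslFo("<html><body><p>Hello PDF</p></body></html>"));
    }
}
```

The input here must already be well-formed XHTML, which is the job JTidy does in the first step.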

bluish
UVM