I have a problem with iText.

I followed the answer to this question: How to export html page to pdf format?

My snippet:

    String str = "<html><head><body><div style=\"width:100%;height:100%;\"><h3 style=\"margin-left:5px;margin-top:40px\">First</h3><div style=\"margin-left:15px;margin-top:15px\"><title></title><p>sdasdasd shshshshdffgdfgd</p></div><h3 style=\"margin-left:5px;margin-top:40px\">The dream</h3><div style=\"margin-left:15px;margin-top:15px\"></div></div></body></head></html>";
    String fileNameWithPath = "/Users/cecco/Desktop/pdf2.pdf";


    com.itextpdf.text.Document document =
            new com.itextpdf.text.Document(com.itextpdf.text.PageSize.A4);
    FileOutputStream fos = new FileOutputStream(fileNameWithPath);
    com.itextpdf.text.pdf.PdfWriter pdfWriter =
            com.itextpdf.text.pdf.PdfWriter.getInstance(document, fos);

    document.open();

    document.addAuthor("Myself");
    document.addSubject("My Subject");
    document.addCreationDate();
    document.addTitle("My Title");

    com.itextpdf.text.html.simpleparser.HTMLWorker htmlWorker =
            new com.itextpdf.text.html.simpleparser.HTMLWorker(document);
    htmlWorker.parse(new StringReader(str));

    document.close();
    fos.close();

This runs without errors and produces the PDF.

However, the style attributes on the h3 and div tags are ignored.

If I paste the same HTML into http://htmledit.squarefree.com/, it renders correctly.

How can I solve this problem?

CeccoCQ

1 Answer

iText isn't the best HTML parser, but you can use Flying Saucer for this. Flying Saucer is built on top of iText but has a capable XML / (X)HTML parser. In short: Flying Saucer is well suited if you want HTML -> PDF.

Here's how to generate the pdf from your string:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import org.xhtmlrenderer.pdf.ITextRenderer;

    /*
     * Note: I put a value in the title tag and fixed the head tag
     * (the whole body tag was inside the head).
     */
    String str = "<html><head></head><body><div style=\"width:100%;height:100%;\"><h3 style=\"margin-left:5px;margin-top:40px\">First</h3><div style=\"margin-left:15px;margin-top:15px\"><title>t</title><p>sdasdasd shshshshdffgdfgd</p></div><h3 style=\"margin-left:5px;margin-top:40px\">The dream</h3><div style=\"margin-left:15px;margin-top:15px\"></div></div></body></html>";

    // Write the rendered PDF to example.pdf in the working directory
    OutputStream os = new FileOutputStream(new File("example.pdf"));

    ITextRenderer renderer = new ITextRenderer();
    renderer.setDocumentFromString(str); // parse the (X)HTML string
    renderer.layout();                   // compute the page layout
    renderer.createPDF(os);              // write the PDF to the stream

    os.close();

Note: Flying Saucer (FS) only supports valid, well-formed HTML / XHTML / XML, so make sure your input is.
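
One way to catch bad input early is to run the string through an XML parser before handing it to the renderer. The following is only a minimal sketch using the JDK's built-in parser. `XhtmlCheck` and `isWellFormed` are hypothetical names made up for this example and are not part of iText or Flying Saucer. It assumes the markup has no DOCTYPE, since a DOCTYPE could make the parser try to load an external DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of iText or Flying Saucer.
final class XhtmlCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler, which prints to stderr;
            // DefaultHandler still rethrows fatal (well-formedness) errors.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }
}
```

Calling `XhtmlCheck.isWellFormed(str)` before `renderer.setDocumentFromString(str)` would reject things like an unclosed `<br>`. Keep in mind that this only checks well-formedness: the original string from the question, with the body nested inside the head, is still well-formed XML and passes this check even though it is not valid HTML.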

ollo
  • Changing to Flying Saucer and using it as shown in this answer solved all my HTML-to-PDF parsing problems. As ollo pointed out, you should first "tidy" the string so that it really is valid HTML. I used Jsoup to parse the HTML for this. – Steve Waters Aug 19 '16 at 12:33