20
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import com.itextpdf.text.Document;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.PdfWriter;

public class GeneratePDF {
    public static void main(String[] args) {
        try {

            String k = "<html><body> This is my Project </body></html>";

            OutputStream file = new FileOutputStream(new File("E:\\Test.pdf"));

            Document document = new Document();
            PdfWriter.getInstance(document, file);

            document.open();

            document.add(new Paragraph(k));

            document.close();
            file.close();

        } catch (Exception e) {

            e.printStackTrace();
        }
    }
}

This is my code to convert HTML to PDF. The conversion runs, but the PDF file contains the whole HTML markup, while I need it to display only the text. `<html><body> This is my Project </body></html>` gets saved to the PDF, when only `This is my Project` should be saved.

Alexis Pigeon
Aman Kumar

2 Answers

50

You can do it with the HTMLWorker class (deprecated) like this:

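import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import com.itextpdf.text.Document;
import com.itextpdf.text.html.simpleparser.HTMLWorker;
import com.itextpdf.text.pdf.PdfWriter;
//...
try {
    String k = "<html><body> This is my Project </body></html>";
    OutputStream file = new FileOutputStream(new File("C:\\Test.pdf"));
    Document document = new Document();
    PdfWriter.getInstance(document, file);
    document.open();
    // HTMLWorker parses the markup and adds the resulting elements to the document
    HTMLWorker htmlWorker = new HTMLWorker(document);
    htmlWorker.parse(new StringReader(k));
    document.close();
    file.close();
} catch (Exception e) {
    e.printStackTrace();
}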

Or use XMLWorker (available as a separate jar download) with this code:

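import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;
//...
try {
    String k = "<html><body> This is my Project </body></html>";
    OutputStream file = new FileOutputStream(new File("C:\\Test.pdf"));
    Document document = new Document();
    // Keep the writer: parseXHtml needs it as well as the document
    PdfWriter writer = PdfWriter.getInstance(document, file);
    document.open();
    // getBytes() uses the platform default charset
    InputStream is = new ByteArrayInputStream(k.getBytes());
    XMLWorkerHelper.getInstance().parseXHtml(writer, document, is);
    document.close();
    file.close();
} catch (Exception e) {
    e.printStackTrace();
}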
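Both snippets above render the markup through iText. If, as in the question, only the inner text is wanted and the original `new Paragraph(...)` call is kept, the tags can be stripped before the string is added. The following is a minimal sketch that uses the HTML parser bundled with the JDK (Swing) rather than anything from iText; the class name `HtmlText` and the method `extract` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper (not part of iText): collects the text content of an HTML string.
final class HtmlText {

    static String extract(String html) {
        final StringBuilder text = new StringBuilder();
        // The callback is invoked once for each run of text found between tags.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                if (text.length() > 0) {
                    text.append(' '); // separate consecutive text runs with a space
                }
                text.append(data);
            }
        };
        try {
            // The third argument tells the parser to ignore charset declarations in the markup.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString().trim();
    }

    public static void main(String[] args) {
        String k = "<html><body> This is my Project </body></html>";
        System.out.println(extract(k));
    }
}
```

With such a helper, the question's code could pass `HtmlText.extract(k)` to `new Paragraph(...)` instead of `k`. This drops all formatting, so it only fits the case where plain text is the goal.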
bluish
MaVRoSCy
  • But the HTMLWorker class is not working; it is deprecated. Can you please tell me which jar file is needed for the HTMLWorker class? – Aman Kumar Jul 24 '13 at 06:18
  • see my update on imports – MaVRoSCy Jul 24 '13 at 06:22
  • But the PDF still displays the full HTML, while I need to display only `This is my Project`, the inner text of the HTML – Aman Kumar Jul 24 '13 at 06:24
  • And something else: deprecated means the method or class is still usable, but you should not use it. It will gradually be phased out, and there is a new method/class to do the same thing. – MaVRoSCy Jul 24 '13 at 06:24
  • Have you changed the file path from C to E? – MaVRoSCy Jul 24 '13 at 06:26
  • Thanks, I got it. Sorry, I had not changed the path; now I have changed it and it is working – Aman Kumar Jul 24 '13 at 06:30
  • One more thing I need to ask: I am getting dynamic HTML on a button click and converting that HTML to PDF, but it shows the error "There was an error opening this document. This file is already open or in use by another application." – Aman Kumar Jul 24 '13 at 06:43
  • Create another question and show the code and the stacktrace of it – MaVRoSCy Jul 24 '13 at 06:45
  • http://stackoverflow.com/questions/17827136/com-itextpdf-tool-xml-exceptions-runtimeworkerexception-in-java Please check my question – Aman Kumar Jul 24 '13 at 06:55
  • With XMLWorkerHelper, I'm getting `RuntimeWorkerException: Invalid nested tag head found, expected closing tag meta.` – Drazen Bjelovuk Aug 25 '14 at 17:15
  • Do you mean import com.lowagie.text.html.simpleparser.HTMLWorker? You wrote above import com.itextpdf.text.html.simpleparser.HTMLWorker; Where can I get that class from? Also, if I use com.lowagie.text.html.simpleparser.HTMLWorker I always get compilation error: The type com.itextpdf.text.pdf.PdfWriter cannot be resolved. It is indirectly referenced from required .class files. How to fix that? – ParagJ Jan 30 '15 at 10:17
  • It generates an empty PDF for me ;/ – Ondrej Tokar Apr 29 '15 at 12:26
  • @OndrejTokar create a question with your code and I will have a look at it – MaVRoSCy Apr 29 '15 at 12:35
  • Here you go: http://stackoverflow.com/questions/29944021/converting-html-to-pdf-with-itext-library-makes-an-empty-pdf :) – Ondrej Tokar Apr 29 '15 at 12:41
  • You should close your resources in a finally block or use the try-with-resources construct. – Adriaan Koster Jan 04 '16 at 11:29
  • Please show your imports. I cannot find the `PdfWriter` class in the jar. – Half Blood Prince May 20 '16 at 06:06
  • It appears HTMLWorker has been removed in iText 7 – G_V May 28 '19 at 11:53
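The `RuntimeWorkerException: Invalid nested tag head found, expected closing tag meta.` reported in the comments above suggests that `parseXHtml` wants XHTML-style input in which every tag is closed (for example `<meta ... />` rather than `<meta ...>`). As a rough pre-check, the markup can be run through the JDK's own XML parser before it is handed to XMLWorker. This is only a sketch under that assumption: the class `XhtmlCheck` and the method `isWellFormed` are hypothetical names, and the JDK parser may be stricter than XMLWorker itself (for instance, it rejects HTML entities such as `&nbsp;` that are not declared in the input).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of iText): checks whether markup is well-formed XML.
final class XhtmlCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Unclosed or badly nested tags end up here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String unclosed = "<html><head><meta charset=\"utf-8\"></head><body>Hi</body></html>";
        String closed = "<html><head><meta charset=\"utf-8\"/></head><body>Hi</body></html>";
        System.out.println(isWellFormed(unclosed));
        System.out.println(isWellFormed(closed));
    }
}
```

If the check fails, the HTML would need to be cleaned up (or produced as XHTML in the first place) before calling `parseXHtml`.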
1

These links might be helpful for the conversion:

https://code.google.com/p/flying-saucer/

https://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html

If it is a college project, you can also consider this one: http://pd4ml.com/examples.htm

It gives examples of converting HTML to PDF.

Jayesh