Using iTextPDF to create a PDF from HTML

Question

I am using iTextPDF to convert HTML to PDF. The HTML contains several images, and sometimes an image address is not correct. In such cases, I get a "file is not a recognized image format" error and the conversion stops.

I catch the exception, but I don't know how to continue the conversion and skip that image.

Any ideas?

My code:

        try {
            Document document = new Document( com.itextpdf.text.PageSize.A4 );
            String fileNameWithPath = filename;
            FileOutputStream fos = new FileOutputStream( fileNameWithPath );
            PdfWriter pdfWriter = PdfWriter.getInstance( document, fos );
            document.open();
            document.addCreationDate();

            HTMLWorker htmlWorker = new HTMLWorker( document );
            htmlWorker.parse( new StringReader( htmlText ) );
            document.close();
            fos.close();

            return true;
        } catch (DocumentException e) {
            e.printStackTrace();
            return false;
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            return false;
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
            return false;
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        } catch (Exception e) {
            // The image failure arrives here: iText wraps the IOException in an
            // ExceptionConverter, which is a RuntimeException (see the log below).
            // Delete the partially written PDF before giving up.
            File file = new File(absoluteFilePath);
            if (file.exists()) {
                boolean isDeleted = file.delete();
                Log.i(TAG, "PDF isDeleted: " + isDeleted);
            }
            Log.d(TAG, "Exception: " + e.getMessage());
            e.printStackTrace();
            return false;
        }
    }

And the error I get:

    03-27 22:54:27.065: D/rerore(27830): Exception: file:/ is not a recognized imageformat.
    03-27 22:54:27.065: W/System.err(27830): ExceptionConverter: java.io.IOException: file:/ is not a recognized imageformat.
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.Image.getInstance(Image.java:318)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.Image.getInstance(Image.java:340)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.html.simpleparser.ElementFactory.createImage(ElementFactory.java:425)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.html.simpleparser.HTMLWorker.createImage(HTMLWorker.java:454)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.html.simpleparser.HTMLTagProcessors$14.startElement(HTMLTagProcessors.java:431)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.html.simpleparser.HTMLWorker.startElement(HTMLWorker.java:193)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.xml.simpleparser.SimpleXMLParser.processTag(SimpleXMLParser.java:581)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.xml.simpleparser.SimpleXMLParser.go(SimpleXMLParser.java:323)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.xml.simpleparser.SimpleXMLParser.parse(SimpleXMLParser.java:607)
    03-27 22:54:27.077: W/System.err(27830): at com.itextpdf.text.html.simpleparser.HTMLWorker.parse(HTMLWorker.java:147)
    03-27 22:54:27.077: W/System.err(27830): at com.zagabun.yyy.MainActivity.createpdf(MainActivity.java:1039)
    03-27 22:54:27.077: W/System.err(27830): at com.zagabun.yyy.MainActivity$6.run(MainActivity.java:974)

This is not a duplicate question: I revised my question and deleted the old one.

Also, I now have a solution for invalid images: I check the URL of each image before calling the `HTMLWorker` parse method. However, there are other errors with other tags, and I am not sure how error handling is done in iText 5.

Can anybody help?
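
For reference, the image pre-check described in the question could look roughly like the sketch below. It uses only the Java standard library and makes no iText calls. The class name `ImgTagFilter` and both method names are hypothetical, the regex only handles quoted `src` attributes, and the "is this image loadable" rule (a readable local file) is an assumption that would need adjusting for remote URLs.

```java
import java.io.File;
import java.net.URI;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: strips <img> tags whose src fails a caller-supplied check,
// so the HTML handed to the parser only references images that should load.
final class ImgTagFilter {

    // Matches an <img ...> tag with a quoted src attribute; group 2 is the src value.
    // Tags without a quoted src are not matched and are left untouched.
    private static final Pattern IMG_TAG = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1[^>]*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    private ImgTagFilter() { }

    // Returns the HTML with every matched <img> tag removed when isLoadable rejects its src.
    static String dropUnloadableImages(String html, Predicate<String> isLoadable) {
        Matcher m = IMG_TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String src = m.group(2);
            String replacement = isLoadable.test(src) ? m.group() : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // One possible check (an assumption): src must point at a readable local file,
    // given either as a plain path or as a file: URI.
    static boolean isReadableLocalFile(String src) {
        if (src == null || src.trim().isEmpty()) {
            return false;
        }
        try {
            File f = src.startsWith("file:") ? new File(URI.create(src)) : new File(src);
            return f.isFile() && f.canRead();
        } catch (IllegalArgumentException e) {
            // Malformed URI, or a URI that cannot be mapped to a file.
            return false;
        }
    }
}
```

With the `htmlText` variable from the code above, the parse call would then become `htmlWorker.parse(new StringReader(ImgTagFilter.dropUnloadableImages(htmlText, ImgTagFilter::isReadableLocalFile)))`. This only sidesteps the image failure; it does nothing for errors raised by other tags, and `java.util.function.Predicate` requires Java 8 language support in the build.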

You are using `HTMLWorker` and, as documented, that class was deprecated many years ago. Please stop using it. Upgrade to iText 7 and pdfHTML. — Bruno Lowagie, Mar 29 '18 at 16:05
iText 7 is too complicated. It has many jar files and I don't know which one to use just for converting HTML to PDF. Its documentation did not help me much. If you know an example that would help me, please tell me. — yasin tavukcuoglu, Mar 29 '18 at 18:18
Oh, by the way, it seems that it is licensed now. I am looking for a free solution. — yasin tavukcuoglu, Mar 29 '18 at 20:31
You are already using iText 5, which is licensed under the AGPL. iText 7 is also licensed under the AGPL, so nothing changed. If you want a free solution, you will have to make your own solution free too. What is not fair about that? Are you working for free? — Bruno Lowagie, Mar 30 '18 at 09:43
Why are we discussing commercial issues here instead of providing help with the code? (Oh, by the way, my app is (and will be) free.) — yasin tavukcuoglu, Mar 30 '18 at 22:51
Show us where you've posted the source code. It's not sufficient for your app to be free of charge; it also has to be free as in free software (copyleft). Your allegation that iText 7 is too complex is strange, since the code needed to convert HTML to PDF is much easier in pdfHTML than before. See [the HTML to PDF tutorial](https://developers.itextpdf.com/content/itext-7-converting-html-pdf-pdfhtml/chapter-1-hello-html-pdf). As for your remark about the jars, that's irrelevant if you use Maven. Maven takes care of downloading all the jars (dependencies). — Bruno Lowagie, Mar 31 '18 at 08:34
Why are we discussing commercial issues? Because of your remark "it seems that it is licensed now"; iText has **always** been licensed, so that remark didn't make sense. Why am I not providing code? Because there is plenty of code in the [HTML to PDF tutorial](https://developers.itextpdf.com/content/itext-7-converting-html-pdf-pdfhtml), but you refuse to use that code. Maven solves the complexity you mention. If you don't know how to use Maven: ask which jars are needed to convert HTML to PDF instead of saying "it's too complex." — Bruno Lowagie, Mar 31 '18 at 08:38
I can't use Maven in my compile environment. That's why it seemed too complex to me (too many jars). That's my opinion. Of course everyone has his/her own opinion. And the tutorial does not show which jar files to use. My intention is just to convert simple HTML with img tags to a PDF. So it should be only 1 or 2 jar files. But I don't know which ones. — yasin tavukcuoglu, Mar 31 '18 at 18:27
The old iText was monolithic and people complained that it wasn't modular. Now it's modular and you complain it's not monolithic. Why did I ever decide to be an open source developer? There is no gratitude. There are too many mean ungrateful people like you. (In case you didn't know: I'm the original developer of iText. And who are you?) — Bruno Lowagie, Mar 31 '18 at 20:33
Who am I? Well, I am nobody, compared to "the original developer of iText". Thanks for your response. But I still could not solve the issue. Are you positive that there's no way of handling the errors in old iText? I still prefer it because I still don't know which files to use in the new iText. — yasin tavukcuoglu, Apr 05 '18 at 22:24
I once was a nobody too, you know. As for your question: it's `HTMLWorker`, and that code is a minefield. Fix something in one place, and something explodes in another place. You should at least upgrade to XML Worker (although XML Worker is being discontinued too). — Bruno Lowagie, Apr 06 '18 at 06:11

0 Answers