I have been trying to write HTML into a PDF/A-compliant PDF for a while. I'm using the following libraries:

  • itext-pdfa-5.5.12
  • itextpdf-5.5.12
  • xmlworker-5.5.12

The following exception is thrown regularly:

Exception in thread "main" com.itextpdf.text.pdf.PdfAConformanceException: All the fonts must be embedded. This one isn't: Helvetica

The message seems misleading in some cases. My code is very similar to the code in another post. The key line is the commented-out fontProvider.register("./fonts/arial.ttf") call in the code below.

  • When the line is commented out, I get an exception about a missing Helvetica font.
  • When the line is uncommented, the program runs without an exception.

In the generated PDF file, I can see a single embedded font (ArialMT). I find the exception message quite odd, and I cannot figure out why Helvetica appears when only Arial is used. Is this a bug, or did I miss something?

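import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;

import com.itextpdf.text.BaseColor;
import com.itextpdf.text.Document;
import com.itextpdf.text.Font;
import com.itextpdf.text.pdf.ICC_Profile;
import com.itextpdf.text.pdf.PdfAConformanceLevel;
import com.itextpdf.text.pdf.PdfAWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerFontProvider;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.css.CssFile;
import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
import com.itextpdf.tool.xml.html.CssAppliers;
import com.itextpdf.tool.xml.html.CssAppliersImpl;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

public class BugFontExceptionDemo {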
    public static void main(String[] args) {
        StringBuffer buf = new StringBuffer();
        buf.append("<body>");
        buf.append("<h1 style=\"font-family:arial\">Text in arial</h1>");
        buf.append("</body>");

        OutputStream file = null;
        Document document = null;
        PdfAWriter writer = null;
        try {
            file = new FileOutputStream(
                    new File("C:\\Users\\Emilien\\PROJECTS_FILES\\PROJECT_EXPORT_TEST\\PDF_A_HTML_WORKING.pdf"));
            document = new Document();
            writer = PdfAWriter.getInstance(document, file, PdfAConformanceLevel.PDF_A_1B);
            document.addTitle("Test document");
            writer.createXmpMetadata();
            document.open();

            ICC_Profile icc = ICC_Profile.getInstance(new FileInputStream(
                    "C:\\Users\\Emilien\\PROJECTS_FILES\\PROJECT_EXPORT_TEST\\sRGB_CS_profile.icm"));
            writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);

            CSSResolver cssResolver = new StyleAttrCSSResolver();
            CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream("./css/style.css"));
            cssResolver.addCss(cssFile);

            MyFontProvider fontProvider = new MyFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
            // Key line: with this call commented out, no font is registered and the exception is thrown.
//          fontProvider.register("./fonts/arial.ttf");

            CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
            htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());

            PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
            HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
            CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

            XMLWorker worker = new XMLWorker(css, true);
            XMLParser p = new XMLParser(worker);

            Reader reader = new StringReader(buf.toString());
            p.parse(reader);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (document != null && document.isOpen())
                document.close();
            try {
                if (file != null)
                    file.close();
            } catch (IOException e) {
            }
            if (writer != null && !writer.isCloseStream())
                writer.close();
        }
    }
}


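// Presumably nested inside BugFontExceptionDemo, since a top-level class cannot be declared static.
public static class MyFontProvider extends XMLWorkerFontProvider {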

        public MyFontProvider(String path) {
            super(path);
        }

        @Override
        public Font getFont(final String fontname, String encoding, float size, final int style) {
            System.out.println("registered: " + isRegistered(fontname) + " fontname: " + fontname + " encoding: "
                    + encoding + " size: " + size + " style: " + style);
            Font font = super.getFont(fontname, encoding, size, style);
            return font;
        }

        @Override
        public Font getFont(String fontname, String encoding, boolean embedded, float size, int style,
                BaseColor color) {
            System.out.println("registered: " + isRegistered(fontname) + " fontname: " + fontname + " encoding: "
                    + encoding + " embedded : " + embedded + " size: " + size + " style: " + style + " BaseColor: "
                    + color);
            Font font = super.getFont(fontname, encoding, embedded, size, style, color);
            return font;
        }
    }
Gordak
  • You register your `MyFontProvider` with `DONTLOOKFORFONTS`; so it will not know any fonts (except probably the "built-in" standard 14 fonts) to start with. That will remain so if you don't register any fonts manually. When during HTML processing Arial is requested, therefore, a replacement from the standard 14 fonts is used, Helvetica in this case. But these "built-in" fonts are not built into iText, PDF viewers have to bring them along. Thus, iText will not embed the font program, merely reference it, which then results in the error message of the underlying PDF/A checking code. – mkl Oct 09 '17 at 16:48
  • Okay. Is it possible to ask the XMLWorker to throw an exception when a font is requested but not available, instead of silently replacing the font with Helvetica? – Gordak Oct 10 '17 at 06:55
  • After having a look at ChunkCssApplier, it seems that I can simply check the registration in the XMLWorkerFontProvider. If the font is not registered, I throw an exception. Thank you for your answer! – Gordak Oct 10 '17 at 07:47
  • Ok, I'll make all that an actual answer. – mkl Oct 10 '17 at 08:09

1 Answer

You create your MyFontProvider with the DONTLOOKFORFONTS parameter, so initially it knows no fonts except the "built-in" standard 14 fonts.

That remains the case unless you register fonts manually, which is exactly your situation while the line in question is commented out.

Therefore, when Arial is requested during HTML processing, a replacement from the standard 14 fonts is used, Helvetica in this case. iText does not ship the font programs for these "built-in" fonts; PDF viewers are expected to provide them, and iText only knows some of their metrics. Thus, iText cannot embed the font program and merely references it, which triggers the error message from the underlying PDF/A checking code.


Then you wondered:

"Is it possible to ask the XMLWorker to throw an exception when a font is requested but not available, instead of silently replacing the font with Helvetica?"

Certainly, as you found out yourself:

"I can simply check the registration in the XMLWorkerFontProvider. If the font is not registered, I throw an exception."
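
A minimal, library-free sketch of that check follows. StrictFontRegistry is a hypothetical class for illustration and is not part of iText: it only models the idea that a lookup for a name that was never registered fails immediately instead of falling back to another font. In the real project the same test would presumably go into the getFont overrides of MyFontProvider, using the isRegistered(fontname) call that the question's code already makes, before delegating to super.getFont(...).

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical, iText-independent model of a font lookup that never falls back silently. */
class StrictFontRegistry {

    // Font names that were registered explicitly, stored in lower case.
    private final Set<String> registered = new HashSet<>();

    /** Records a font name as available, ignoring case. */
    public void register(String fontName) {
        registered.add(fontName.toLowerCase(Locale.ROOT));
    }

    /** Returns true only if the name was registered before; null counts as unregistered. */
    public boolean isRegistered(String fontName) {
        return fontName != null && registered.contains(fontName.toLowerCase(Locale.ROOT));
    }

    /** Returns the requested name if it is registered, otherwise throws instead of substituting. */
    public String require(String fontName) {
        if (!isRegistered(fontName)) {
            throw new IllegalStateException(
                    "No font registered for '" + fontName + "'; refusing to fall back");
        }
        return fontName;
    }

    public static void main(String[] args) {
        StrictFontRegistry registry = new StrictFontRegistry();
        registry.register("Arial");

        // A registered name passes through unchanged.
        System.out.println(registry.require("arial"));

        // An unregistered name fails fast.
        try {
            registry.require("Helvetica");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether a missing font name (no font-family given in the HTML) should also fail is a design choice; this sketch treats null as unregistered and throws.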

mkl