I am trying to write an .xslt file (as input for Apache FOP) whose purpose is to generate an "accessible" document (in our case that means: the generated PDF file must pass the checks done by PAC - the PDF accessibility checker).

My file is still practically in a "hello world"-state:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xml:lang="en_US">
    <xsl:output method="xml" indent="yes" />

    <xsl:template match="/">
        <fo:root font-family="Arial">
            <fo:layout-master-set>
                <fo:simple-page-master
                    master-name="A4-portrait" page-height="29.7cm" page-width="21.0cm">
                    <fo:region-body region-name="main-content" margin="2cm" />
                </fo:simple-page-master>
            </fo:layout-master-set>
            <fo:page-sequence master-reference="A4-portrait">
                <fo:flow flow-name="main-content">
                    <fo:block font-family="Arial">Hello, <xsl:value-of select="data/name" />! This is some sample text.</fo:block>
                </fo:flow>
            </fo:page-sequence>
        </fo:root>
    </xsl:template>

</xsl:stylesheet>

but I am already getting a couple of "accessibility errors".

One is "Text object not tagged". The text object referenced is the fo:block containing the "Hello ..." text. I searched for quite some time but found no helpful explanation or guidance on what has to be done in such an .fo/.xslt file to create a tagged element in the resulting PDF.

Does some kind soul have an idea or a good link or description on that?

Later addition:

To make FOP locate the required fonts, one has to add snippets like this to the conf\fop.xconf file:

      <fonts>
       ...
       <font kerning="yes" embed-url="file:///c:/Windows/Fonts/arial.ttf" embedding-mode="subset">
          <font-triplet name="Arial" style="normal" weight="normal"/>
        </font>
        ...
      </fonts>

Note that these URLs, when pointing to local files, have to contain the file: protocol and a drive specifier (here: c:). They didn't work for me without these.

mmo

1 Answer

To add accessibility information to the resulting PDF, you can use the role attribute on the FO elements to specify the structural or semantic function of the contained text.

From FOP's Accessibility page:

The PDF Reference defines a set of standard Structure Types to tag content. For example, ‘P’ is used for identifying paragraphs, ‘H1’ to ‘H6’ for headers, ‘L’ for lists, ‘Div’ for block-level groups of elements, etc. This standard set is aimed at improving interoperability between applications producing or consuming PDF.

FOP provides a default mapping of Formatting Objects to elements from that standard set. For example, fo:page-sequence is mapped to ‘Part’, fo:block is mapped to ‘P’, fo:list-block to ‘L’, etc.

You may want to customize that mapping to improve the accuracy of the tagging or deal with particular FO constructs. For example, you may want to make use of the ‘H1’ to ‘H6’ tags to make the hierarchical structure of the document appear in the PDF. This is achieved by using the role XSL-FO property:
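As a hedged illustration of the role property, here is a minimal sketch of an FO fragment. The block contents and the font-size value are made up for this example; the role value H1 and the default P mapping are the ones named in the quoted text.

```xml
<fo:flow flow-name="main-content" xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <!-- Explicit role: tagged as a level-1 heading instead of the default 'P' -->
    <fo:block role="H1" font-size="18pt">Sample heading</fo:block>
    <!-- No role given: FOP's default mapping tags this fo:block as 'P' -->
    <fo:block>Hello! This is some sample text.</fo:block>
</fo:flow>
```

The role attribute only changes which structure type is written; it has no effect unless tagged output is enabled in the first place.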

The link to the PDF Reference seems to be broken, so here are some working links:

  • PDF Reference, version 1.4 (the one FOP's site seems to reference): look at section 9.7 Tagged PDF, in particular 9.7.4 Standard Structure Types for the possible values for role

Note that, even if you don't use the role attribute, FOP should apply a default value (for example, P for fo:block elements). So, if the accessibility checker still reports a problem, something else is probably wrong:

  1. make sure you have enabled the accessibility features (either with the -a command line option, setting userAgent.setAccessibility(true) in the Java code, or adding <accessibility>true</accessibility> in the fop.xconf configuration file)
  2. you must also specify that you want a PDF/A to be created (more specifically, a PDF/A-1a) rather than a "normal" one (either with the -pdfprofile PDF/A-1a command line option, userAgent.getRendererOptions().put("pdf-a-mode", "PDF/A-1a") Java instruction, or <pdf-a-mode>PDF/A-1a</pdf-a-mode> element in the configuration file)
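Put together, the configuration-file variant of the two settings above might look like the following sketch. The two elements are the ones named in the list; their placement (accessibility directly under the root element, pdf-a-mode inside the PDF renderer) follows my reading of FOP's configuration layout and should be checked against the documentation for your FOP version.

```xml
<fop version="1.0">
    <!-- Item 1: enable tagged (accessible) output -->
    <accessibility>true</accessibility>

    <renderers>
        <renderer mime="application/pdf">
            <!-- Item 2: request PDF/A-1a instead of a "normal" PDF -->
            <pdf-a-mode>PDF/A-1a</pdf-a-mode>
        </renderer>
    </renderers>
</fop>
```

The file is handed to the command-line tool with the -c option. Keep in mind that PDF/A-1a requires every font to be embedded, so a fonts section inside the same renderer element is normally needed as well.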
lfurini
  • Thanks for the misc. links and comments! The FOP accessibility page (at the top) I had seen and the features (at the bottom) I already have enabled (in the fop.xconf-file). The role stuff I had seen and tried it but it didn't change anything. In fact I had added a role to the block, i.e. `fo:block role="P" ...` - but the PAC output didn't change with that. The text block is still marked as "Text object not tagged". – mmo Aug 25 '23 at 12:33
  • Maybe the reason is a different one: There is another error stating "Document is not marked as tagged". Maybe it only lacks that marker? But how/where to set that? – mmo Aug 25 '23 at 12:38
  • Thanks for updating your solution. I adapted my setup accordingly but now I get: ```INFO: Rendered page #1. Aug. 26, 2023 7:47:04 NACHM. org.apache.fop.cli.InputHandler error SEVERE: javax.xml.transform.TransformerException: For PDF/A-1a, all fonts, even the base 14 fonts, have to be embedded! Offending font: /Helvetica``` Any idea where that might come from? As you can see in my snippet I am not specifying "Helvetica" anywhere. What would I need to do to embed that font? – mmo Aug 26 '23 at 17:48
  • @mmo In your FO file you are referencing the Verdana font family, but if you don't map it to a font file then FOP will use a default base 14 font instead (Helvetica, apparently). You have to [configure the fonts you are using](https://stackoverflow.com/a/28251945/4453460), mapping them to a font file in FOP's configuration. – lfurini Aug 26 '23 at 18:34
  • I defined several fonts (Arial, Courier, Verdana) in `conf\fop.xconf` and added a corresponding `embed-url="/Windows/Fonts/....ttf"` argument. I also added `font="..."` arguments to all my `fo:blocks` but FOP insists on `offending font: /Helvetica`. But: on my Windows system I don't have a font "Helvetica(.ttf)"! What a pain this FOP is! How does one teach this beast to use a different default font? – mmo Aug 28 '23 at 12:11
  • @mmo The point of the error message is _not_ that there is/must be a Helvetica font on your PC; it simply means that somewhere in the FO file there is some text that ends up referencing that font family, either explicitly or as a default value. You can either track down that text and give it a font-family you have configured, or update the configuration to map Helvetica to a font file you have (even if it's not called Helvetica.ttf). – lfurini Aug 28 '23 at 12:33
  • I understand that and I have now marked the root node with `` - that seems to have stopped it from searching for Helvetica (found that hint in another thread). – mmo Aug 29 '23 at 07:20
  • FINALLY! After pressing FOP's nose into the Fonts folder (see my addition in the description) it finally managed to locate these. Thanks again for helping me! – mmo Aug 29 '23 at 07:21
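
For reference, the other option mentioned in the comments (mapping the Helvetica family to a font file that actually exists) might look roughly like this in fop.xconf. This is only a sketch: it reuses the Arial path from the question, and using Arial as the stand-in for Helvetica is an assumption, not something confirmed in this thread.

```xml
<renderers>
    <renderer mime="application/pdf">
        <fonts>
            <!-- Assumption: arial.ttf serves as the embedded stand-in for "Helvetica" -->
            <font kerning="yes" embed-url="file:///c:/Windows/Fonts/arial.ttf" embedding-mode="subset">
                <font-triplet name="Helvetica" style="normal" weight="normal"/>
            </font>
        </fonts>
    </renderer>
</renderers>
```

Each style/weight combination that the document uses would presumably need its own font-triplet pointing at a matching font file.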