I'm trying to transform a simple HTML page to XSL-FO, to feed into Apache FOP for PDF rendering. The steps are: HTML+CSS -> XHTML -> XSL-FO -> PDF.

I've used the Java library CSSToXSLFO to transform XHTML to XSL-FO. This works, but it cannot handle embedded images.

Are there any tools to transform

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>hello</title>
  </head>
  <body>
    <h1 style="color: green">Hello world!</h1>
    <img src="...=" />
  </body>
</html>

into

    <fo:flow flow-name="xsl-region-body">
      <fo:block>
        <fo:block color="green">Hello world!</fo:block>
        <fo:external-graphic src="url(...=)" content-height="scale-to-fit" content-width="scale-to-fit" scaling="uniform"/>
      </fo:block>
    </fo:flow>

?

TCookz
  • You state that it is incapable of handling base64-encoded images, yet an example exists at http://www.cloudformatter.com/CSS2Pdf.Demos.Images so we need more info. Perhaps your image is too large? – Kevin Brown Apr 02 '21 at 15:40

1 Answer

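If your FOP version supports data URIs in fo:external-graphic, you can use XSLT to transform the XHTML to XSL-FO, for example with the following stylesheet (it relies on xpath-default-namespace, so it needs an XSLT 2.0 or later processor):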

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
        <fo:layout-master-set>
            <fo:simple-page-master master-name="sample">
                <fo:region-body/>
            </fo:simple-page-master>
        </fo:layout-master-set>
        <fo:page-sequence master-reference="sample">
            <xsl:apply-templates select="html/body"/>
        </fo:page-sequence>
    </fo:root>
  </xsl:template>
  
  <xsl:template match="body">
      <fo:flow flow-name="xsl-region-body">
          <fo:block>
              <xsl:apply-templates/>
          </fo:block>
      </fo:flow>
  </xsl:template>
  
  <xsl:template match="h1">
      <fo:block>
          <xsl:apply-templates/>
      </fo:block>
  </xsl:template>
  
  <xsl:template match="img">
      <fo:external-graphic src="{@src}" content-height="scale-to-fit" content-width="scale-to-fit" scaling="uniform"/>    
  </xsl:template>
  
</xsl:stylesheet>

That is a minimal example that handles only the h1 and img elements. I haven't tried to map CSS style attributes to XSL-FO presentational attributes, but you can use e.g. <xsl:apply-templates select="@*, node()"/> instead of <xsl:apply-templates/> and then add templates that transform e.g. style="color: green" to color="green". Because CSS has its own, non-XML syntax, writing a full parser for arbitrary style attributes is a demanding task beyond the scope of a Stack Overflow answer.
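As a rough sketch of that idea, the templates below could be merged into the stylesheet above, with the h1 template here replacing the original one. The list of property names is purely illustrative: it only contains CSS properties whose names are also XSL-FO property names, and the naive split on ";" and ":" is an assumption that holds for simple declarations like the one in the question, not for arbitrary CSS.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    version="3.0">

  <!-- Replaces the h1 template: attributes are processed before child
       nodes, so FO attributes are created before any block content. -->
  <xsl:template match="h1">
    <fo:block>
      <xsl:apply-templates select="@*, node()"/>
    </fo:block>
  </xsl:template>

  <!-- Split style="a: b; c: d" into declarations and emit an FO attribute
       for each declaration whose property name is in the illustrative
       list below. This is not a real CSS parser. -->
  <xsl:template match="@style">
    <xsl:for-each select="tokenize(., ';')[contains(., ':')]">
      <xsl:variable name="name"
          select="lower-case(normalize-space(substring-before(., ':')))"/>
      <xsl:variable name="value"
          select="normalize-space(substring-after(., ':'))"/>
      <xsl:if test="$name = ('color', 'background-color', 'font-family',
                             'font-size', 'font-style', 'font-weight',
                             'text-align')">
        <xsl:attribute name="{$name}" select="$value"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

  <!-- Ignore every other attribute; the built-in rule would otherwise
       copy its value into the output as text. -->
  <xsl:template match="@*"/>

</xsl:stylesheet>
```

With the h1 from the question this should produce <fo:block color="green">Hello world!</fo:block>. Values that need translation rather than copying (for example CSS units or keywords that XSL-FO does not accept) would need their own handling.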

I am also not quite sure about the allowed syntax of the src attribute in XSL-FO. FOP seems to understand the direct src="{@src}" just fine, but to produce the format shown in your question you could equally use src="url({@src})".

Martin Honnen