
I'm having an issue when I publish my modspecs to PDF (XSL-FO): in my tables, the content of a cell overflows its column into the next one. How do I force a break in the text so that a new line is created instead?

I can't manually insert zero-width space characters, since the table entries are generated programmatically. I'm looking for a simple solution that I can add to docbook_pdf.xsl (either as an xsl:param or an xsl:attribute).

EDIT: Here is where I'm at currently:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:import href="urn:docbkx:stylesheet"/>
...(the beginning of my stylesheet for pdf generation, e.g. header and footer content stuff)
<xsl:template match="text()">
    <xsl:call-template name="intersperse-with-zero-spaces">
        <xsl:with-param name="str" select="."/>
    </xsl:call-template>
</xsl:template>
<xsl:template name="intersperse-with-zero-spaces">
    <xsl:param name="str"/>
    <xsl:variable name="spacechars">
        &#x9;&#xA;
        &#x2000;&#x2001;&#x2002;&#x2003;&#x2004;&#x2005;
        &#x2006;&#x2007;&#x2008;&#x2009;&#x200A;&#x200B;
    </xsl:variable>

    <xsl:if test="string-length($str) &gt; 0">
        <xsl:variable name="c1" select="substring($str, 1, 1)"/>
        <xsl:variable name="c2" select="substring($str, 2, 1)"/>

        <xsl:value-of select="$c1"/>
        <xsl:if test="$c2 != '' and
            not(contains($spacechars, $c1) or
            contains($spacechars, $c2))">
            <xsl:text>&#x200B;</xsl:text>
        </xsl:if>

        <xsl:call-template name="intersperse-with-zero-spaces">
            <xsl:with-param name="str" select="substring($str, 2)"/>
        </xsl:call-template>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>

With this, the long words are successfully broken up in the table cells! Unfortunately, the side effect is that normal text elsewhere (like the text under section X) now breaks in the middle of words, so that parts of a word appear on separate lines. Is there a way to restrict the above process to just tables?

Mark Rotteveel
Ace
  • This looks more like an XSL-FO vocabulary question. I've retagged it as such. If you think this is an XSLT question, please provide an input sample and the desired output. –  Dec 03 '10 at 23:23
  • @Alejandro: Yes, it's technically an XSL-FO issue (since the problem doesn't exist in HTML). I guess I'm hoping for a way to add something to the XML. – Ace Dec 03 '10 at 23:29
  • Do you want an XSLT solution that will put zero-space characters into the text? If so, can you provide the smallest possible example of your XSL-FO and what text/where you need to be made splittable? – Dimitre Novatchev Jan 11 '11 at 20:53

2 Answers


In the long words, try inserting a zero-width space character between the characters where a break is allowed.
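Since the entries are generated at runtime, the insertion has to happen in the customization layer, and it should only touch table cells. The following is a minimal sketch of that restriction, not a confirmed copy of what the comments under this answer settled on. It assumes the namespace-aware DocBook 5 stylesheets, CALS tables (cells marked up as `entry`), a prefix `d` chosen here for illustration, and the named template `intersperse-with-zero-spaces` from the question being present in the same stylesheet.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:d="http://docbook.org/ns/docbook"
                exclude-result-prefixes="d"
                version="2.0">

  <!-- Assumption: urn:docbkx:stylesheet resolves to the namespace-aware
       DocBook XSL stylesheets, as in the question's setup. -->
  <xsl:import href="urn:docbkx:stylesheet"/>

  <!-- Matches only text nodes that are descendants of a table cell.
       Text everywhere else is left to the imported templates. -->
  <xsl:template match="d:entry//text()">
    <!-- Named template defined in the question; it must be copied
         into this same customization layer. -->
    <xsl:call-template name="intersperse-with-zero-spaces">
      <xsl:with-param name="str" select="."/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

With the non-namespaced DocBook 4 stylesheets, the pattern would be `entry//text()` and the `xmlns:d` declaration would not be needed. HTML-style tables use `td` and `th` instead of `entry`, so they would need their own pattern.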

mzjn
  • @mzjin: Can't manually add that character, since the column width is unknown, and the entry itself is generated at runtime – Ace Jan 10 '11 at 23:53
  • @mzjin: I'm using solution 1 and I've gotten very close, see the edit to my answer. – Ace Jan 11 '11 at 21:44
  • You should make sure that the template applies only to table cells. Try ``. – mzjn Jan 11 '11 at 21:56
  • @mzjin: is it because we're missing a `` ? – Ace Jan 11 '11 at 22:33
  • @Ace, it could be a context problem. Try this: ``. – mzjn Jan 11 '11 at 22:37
  • @mzjin: Nope, no luck with that :( – Ace Jan 11 '11 at 22:54
  • If you are using the DocBook 5 stylesheets, then you probably need to use ``. See http://www.sagehill.net/docbookxsl/ProcesingDb5.html#Db5Xslt. If this does not help, then you need to provide more details about your XML and XSLT setup. – mzjn Jan 11 '11 at 23:06
  • @mzjin: Nope, that creates an error, saying it doesn't understand the prefix. I've added the config stuff of my stylesheet on to my answer – Ace Jan 11 '11 at 23:17
  • OK, then you need to declare the prefix in your customization. See http://www.sagehill.net/docbookxsl/CustomDb5Xsl.html. But this only applies if you actually *are* using the Docbook 5 (namespace-aware stylesheets). – mzjn Jan 11 '11 at 23:22
  • @mzjin: YES! This is so close. Now the normal text is left alone, and the tables are the only place that the breaks are being put! But I just noticed that some of my table entries include a `` element. Is there a way I can exclude those? – Ace Jan 11 '11 at 23:36
  • Do you want to adjust text also if it is in a para in an entry? Then try ``. – mzjn Jan 11 '11 at 23:46
  • @mzjin: What I mean is that i want the template applied to all `` except where the entry contains a `` – Ace Jan 12 '11 at 17:32
  • @mzjin: Yes, perfect. The question has been answered to the fullest, though I'm curious about one last thing... Is there a way to modify the code above so that a zero-space character is added after every 13 consecutive characters that contain no space ' '? – Ace Jan 12 '11 at 18:22

Since you're using XSLT 2.0:

<xsl:template match="text()">
  <xsl:value-of
      select="replace(replace(., '(\P{Zs})(\P{Zs})', '$1&#x200B;$2'),
                      '([^\p{Zs}&#x200B;])([^\p{Zs}&#x200B;])',
                      '$1&#x200B;$2')" />
</xsl:template>

This uses category escapes (http://www.w3.org/TR/xmlschema-2/#nt-catEsc) rather than an explicit list of characters to match, though an explicit list would work too. Two replace() calls are needed because regex matches cannot overlap: the inner replace() consumes the characters in pairs, so it fills only every other gap between adjacent characters. The outer replace() fills the remaining gaps by matching adjacent characters that are neither space characters nor the character added by the inner replace().
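To see why one pass is not enough, here is a small standalone sketch that runs both passes on a sample word. It is for illustration only: it swaps the invisible U+200B for a visible underscore, and the template name `main`, the variable names and the sample word `abcde` are all made up for this example.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="text"/>

  <!-- Hypothetical entry point: run with "main" as the initial template. -->
  <xsl:template name="main">
    <xsl:variable name="word" select="'abcde'"/>

    <!-- Pass 1: matches the pairs "ab" and "cd"; "e" is left over.
         Result: a_bc_de -->
    <xsl:variable name="pass1"
        select="replace($word, '(\P{Zs})(\P{Zs})', '$1_$2')"/>

    <!-- Pass 2: matches the still-adjacent pairs "bc" and "de".
         Result: a_b_c_d_e -->
    <xsl:variable name="pass2"
        select="replace($pass1, '([^\p{Zs}_])([^\p{Zs}_])', '$1_$2')"/>

    <!-- Print both intermediate results, one per line. -->
    <xsl:value-of select="$pass1, $pass2" separator="&#xA;"/>
  </xsl:template>
</xsl:stylesheet>
```

With Saxon, for example, the initial template can be selected with the `-it:main` command-line option, so no source document is needed.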


Inserting after every thirteenth non-space character:

<xsl:template match="text()">
  <xsl:value-of
      select="replace(replace(., '(\P{Zs}{13})', '$1&#x200B;'),
                      '&#x200B;(\p{Zs})',
                      '$1')" />
</xsl:template>

The inner replace() inserts the character after every 13 non-space characters, and the outer replace() fixes it if the 14th character was a space character.
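As written, `match="text()"` applies to every text node in the document. Below is a sketch of the same every-13-characters rule limited to table cells, assuming the namespace-aware DocBook 5 stylesheets, CALS `entry` cells and an illustrative prefix `d`; none of these are confirmed by the question.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:d="http://docbook.org/ns/docbook"
                exclude-result-prefixes="d"
                version="2.0">

  <!-- Assumption: the same import as in the question's stylesheet. -->
  <xsl:import href="urn:docbkx:stylesheet"/>

  <!-- Only text nodes inside table cells are rewritten. -->
  <xsl:template match="d:entry//text()">
    <xsl:value-of
        select="replace(
                  (: inner: add U+200B after each run of 13 non-space characters :)
                  replace(., '(\P{Zs}{13})', '$1&#x200B;'),
                  (: outer: drop the U+200B again when a space follows it :)
                  '&#x200B;(\p{Zs})',
                  '$1')"/>
  </xsl:template>

</xsl:stylesheet>
```

Each text node is processed on its own, so the count of 13 starts over at every text node, for example after inline markup inside a cell.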


If you are using AH Formatter, then you can use axf:word-break="break-all" to allow AH Formatter to break anywhere within a word. See https://www.antenna.co.jp/AHF/help/en/ahf-ext.html#axf.word-break.
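As a rough illustration of where the attribute goes, this is a hand-written FO fragment rather than output from the DocBook stylesheets. The cell content is invented, and how the attribute is added to the generated FO depends on your customization layer.

```xml
<fo:table-cell xmlns:fo="http://www.w3.org/1999/XSL/Format"
               xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">
  <!-- axf:word-break is an AH Formatter extension property;
       other formatters will not act on it. -->
  <fo:block axf:word-break="break-all">
    AVeryLongGeneratedIdentifierWithoutAnyNaturalBreakPoints
  </fo:block>
</fo:table-cell>
```

This needs no zero-width spaces in the text at all, at the cost of tying the output to one formatter.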

Tony Graham