
How to avoid solidus and double quote escaping of XML in JSON?

Given that

  1. solidus characters (aka forward slash, /) may, but need not, be escaped in JSON, and that
  2. XML attributes may use ' rather than " to avoid escaping in JSON string values,

what's the best way to realize these potential serialization improvements in XSLT?


This XML,

<?xml version="1.0" encoding="UTF-8"?>
<map xmlns="http://www.w3.org/2005/xpath-functions">
  <array key="o_array">
    <map>
      <string key="s/1">x/y/z</string>
    </map>
    <map>
      <string key="s2"><![CDATA[<a href="/x/y">Link</a> a/b "test"]]></string>
    </map>
  </array>
</map>

input to this XSLT,

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <xsl:output method="text"/>  
  <xsl:template match="/">
    <xsl:value-of select="xml-to-json(.,map{'indent':true()})"/>
  </xsl:template>
</xsl:stylesheet>

yields (via Saxon, XSLT Fiddle demo) this JSON output:

{ "o_array" : 
  [ 
    { "s\/1" : "x\/y\/z" },

    { "s2" : "<a href=\"\/x\/y\">Link<\/a> a\/b \"test\"" } ] }

For aesthetics (the JSON above is unnecessarily ugly) and to minimize file size (after also disabling indentation), I would like to generate the following JSON instead:

{ "o_array" : 
  [ 
    { "s/1" : "x/y/z" },

    { "s2" : "<a href='/x/y'>Link</a> a/b \"test\"" } ] }

Notes:

  • Single quotes: A Saxon-specific serialization option, saxon:single-quotes, seems tantalizingly close to helping, but how to use this option with xml-to-json() is unclear to me.
  • Solidus: An XSLT serialization option, map{'method': 'json', 'use-character-maps': map{ '/': '/' }} as described by Martin Honnen, seems tantalizingly close to helping, but, again, how to use this option with xml-to-json() escapes (ha) me.
  • string/@escape and string/@escape-key attributes, per my reading of the spec and confirmed via experimentation, cannot help here.
kjhughes
1 Answer

The linked suggestion with a character map can only be used if you are willing to introduce a parse-json() => serialize(...) step:

. => xml-to-json() => parse-json() => serialize(map { 'method' : 'json', 'use-character-maps' : map { '/' : '/' } })

That way, with

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
      <xsl:value-of select=". => xml-to-json() => parse-json() => serialize(map { 'method' : 'json', 'use-character-maps' : map { '/' : '/' } })"/>
  </xsl:template>

</xsl:stylesheet>

at https://xsltfiddle.liberty-development.net/b4GWVd/25 I get

{"o_array":[{"s/1":"x/y/z"},{"s2":"<a href=\"/x/y\">Link</a> a/b \"test\""}]}
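The same idea can probably also be written declaratively, leaving serialization to xsl:output rather than calling serialize(). This is a minimal sketch, assuming the processor serializes the principal result itself (as a command-line run or the fiddle does); the character map name keep-solidus is a hypothetical name chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <!-- Serialize the result with the JSON output method, applying the character map below -->
  <xsl:output method="json" use-character-maps="keep-solidus"/>

  <!-- Hypothetical map name; a character mapped here is written as given and is not JSON-escaped -->
  <xsl:character-map name="keep-solidus">
    <xsl:output-character character="/" string="/"/>
  </xsl:character-map>

  <xsl:template match="/">
    <!-- Return the parsed JSON value, not a string, so that the serializer performs the escaping -->
    <xsl:sequence select=". => xml-to-json() => parse-json()"/>
  </xsl:template>

</xsl:stylesheet>
```

The template returns a map instead of text, so the output declaration controls the escaping. If the result is delivered unserialized (for example to an API caller), the character map has no effect.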

To apply the Saxon-specific serialization parameter to string values that are XML fragments, I think you could first run the input through a mode that does one more parse-and-serialize step, this time as

. => parse-xml-fragment() => serialize(map {
                        'method': 'xml',
                        QName('http://saxon.sf.net/', 'single-quotes'): true()
                    })

With Saxon 9.9 EE in oXygen and

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:value-of
            select="
                $single-quotes => xml-to-json() => parse-json() => serialize(map {
                    'method': 'json',
                    'use-character-maps': map {'/': '/'}
                })"
        />
    </xsl:template>

    <xsl:variable name="single-quotes">
        <xsl:apply-templates mode="serialize-fragments"/>
    </xsl:variable>

    <xsl:mode name="serialize-fragments" on-no-match="shallow-copy"/>

    <xsl:template match="string" mode="serialize-fragments"
        xpath-default-namespace="http://www.w3.org/2005/xpath-functions">
        <xsl:copy>
            <xsl:apply-templates select="@*" mode="#current"/>
            <xsl:try
                select="
                    . => parse-xml-fragment() => serialize(map {
                        'method': 'xml',
                        QName('http://saxon.sf.net/', 'single-quotes'): true()
                    })">
                <xsl:catch select="string()"/>
            </xsl:try>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

I get

{"o_array":[{"s/1":"x/y/z"},{"s2":"<a href='/x/y'>Link</a> a/b \"test\""}]}
Martin Honnen
  • Masterful! This answer employs so many impressive XSLT 3.0 JSON and serialization techniques that it answers not only my asked question completely but also many unasked questions as well. It's worthy of in-depth study by anyone needing to serialize XML within JSON. – kjhughes May 30 '19 at 23:38
  • There’s an unexpected side effect: the output JSON is not in the same order. Is there any trick to keep the order of the JSON document? – Michael Jul 27 '22 at 13:19
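On the ordering question: parse-json() returns an XDM map, and maps are unordered, so the serializer is free to emit the keys in a different order than the source. One way to keep the original order is to skip the round trip and remove the solidus escapes from the string that xml-to-json() returns. This is a sketch of a swapped-in technique, not the approach from the answer above. It relies on my reading of the spec that xml-to-json() always writes a solidus as \/, so every / in its output is preceded by the backslash that escapes it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The regex \\/ matches a backslash followed by a solidus; the replacement is a bare solidus. -->
    <!-- Key order is untouched because the JSON text is never parsed into a map. -->
    <xsl:value-of select="replace(xml-to-json(.), '\\/', '/')"/>
  </xsl:template>

</xsl:stylesheet>
```

To combine this with the single-quote step, the same replace() call could be applied to $single-quotes => xml-to-json() in place of the parse-json() => serialize(...) tail.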