I need some XSLT (or something - see below) to replace newlines in all attributes with an alternative character.
I have to process legacy XML which stores all data as attributes, and uses newlines inside an attribute value to separate multiple values. For example:
<sample>
<p att="John
Paul
Ringo"></p>
</sample>
These newlines are replaced with spaces when I parse the file in Java (as per the attribute-value normalization rules in the XML spec). However, I want to treat each attribute value as a list, so this behaviour isn't particularly useful.
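To make the problem concrete, here is a minimal sketch of what I am seeing, using the JDK's built-in DOM parser rather than JAXB. The class name `AttributeNormalizationDemo` and the helper `readAtt` are made up for this post, and I am assuming the other standard parsers behave the same way since the normalization comes from the XML spec itself.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class AttributeNormalizationDemo {

    // Hypothetical helper: parses the XML string and returns the "att"
    // attribute of the first <p> element, exactly as the parser reports it.
    static String readAtt(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element p = (Element) doc.getElementsByTagName("p").item(0);
            return p.getAttribute("att");
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same document as the sample above, with literal newlines in the attribute.
        String xml = "<sample><p att=\"John\nPaul\nRingo\"></p></sample>";
        String value = readAtt(xml);

        // The newlines have already become spaces by the time the value is read.
        System.out.println(value);                 // prints: John Paul Ringo
        System.out.println(value.contains("\n"));  // prints: false
    }
}
```

So the list separators are gone before any of my own code gets to look at the value.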
My 'solution' was to use XSLT to replace all newlines in all attributes with some other delimiter - but I have zero knowledge of XSLT. All examples I've seen thus far have either been very specific or have replaced node content instead of attribute values.
I have dabbled with XSLT 2.0's replace() but am having a hard time putting everything together.
Is XSLT even the correct solution? The XSLT below:
<xsl:template match="sample/*">
  <xsl:for-each select="@*">
    <xsl:value-of select="replace(current(), '\n', '|')"/>
  </xsl:for-each>
</xsl:template>
applied to the sample XML outputs the following using Saxon:
John Paul Ringo
Obviously this format isn't what I'm after - this is just to experiment with replace() - but have the newlines already been normalised by the time we get to XSLT processing? If so, are there any other ways to parse these values as written using a Java parser? I've only used JAXB thus far.