
I have an XML file that is processed by a Java-based system.

If any node in the XML contains a space, the process hangs or errors out.

Given the example below, how can I eliminate the spaces around the text nodes?

It is important to mention that the XPath should be "dynamic", so that it still catches those spaces if the XML changes.

<bookstore>
    <book category="cooking">
        <title lang="en"> Everyday Italian</title>
        <author>Giada De Laurentiis </author>
        <year> 2005 </year>
    </book>
</bookstore>
Abel
Paulo Robles

1 Answer


Any place where you use an XPath and you retrieve its value, use normalize-space(YOUREXPR), where YOUREXPR is whatever your current expression is that returns spaces.

It removes leading and trailing whitespace and collapses every internal run of whitespace (spaces, tabs, newlines) into a single space.
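Since the question mentions a Java-based system, here is a minimal sketch of wrapping an expression in normalize-space() using the JDK's built-in javax.xml.xpath API. The class name `NormalizeSpaceDemo` and the helper `normalized` are made up for illustration, and the XML is a condensed copy of the sample from the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class NormalizeSpaceDemo {
    // Hypothetical helper: evaluates normalize-space(expr) against the XML
    // and returns the resulting string.
    static String normalized(String xml, String expr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate("normalize-space(" + expr + ")",
                    new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<bookstore><book category=\"cooking\">"
                + "<title lang=\"en\"> Everyday Italian</title>"
                + "<author>Giada De Laurentiis </author>"
                + "<year> 2005 </year></book></bookstore>";
        // Brackets make any leftover whitespace visible.
        System.out.println("[" + normalized(xml, "/bookstore/book/title") + "]"); // [Everyday Italian]
        System.out.println("[" + normalized(xml, "/bookstore/book/year") + "]");  // [2005]
    }
}
```

The same wrapping works for any expression that returns a node or a string, so it does not depend on the element names in the document.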

Alternatively, since you said that any space in a node makes the process hang or error out, you can remove all spaces entirely with translate(YOUREXPR, ' ', ''), which turns Everyday Italian into EverydayItalian.

If you also want to eliminate newlines and tabs, you can use

translate(YOUREXPR, ' &#x9;&#xA;&#xD;', '')
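One thing to watch when calling this from Java: &#x9;, &#xA; and &#xD; are XML character references, so they are only expanded when the expression is written inside an XML file such as a stylesheet. In a Java string they would be taken literally, so the sketch below passes the real tab, newline and carriage-return characters instead. `TranslateDemo` and `withoutWhitespace` are illustrative names, not part of any library.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class TranslateDemo {
    // Hypothetical helper: removes every space, tab, newline and carriage
    // return from the string value of expr.
    static String withoutWhitespace(String xml, String expr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // \t, \n and \r are Java escapes, so the XPath literal contains the
        // actual characters rather than the text "&#x9;".
        String wrapped = "translate(" + expr + ", ' \t\n\r', '')";
        try {
            return xpath.evaluate(wrapped, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<bookstore><book><title> Everyday\n\tItalian</title></book></bookstore>";
        System.out.println(withoutWhitespace(xml, "/bookstore/book/title")); // EverydayItalian
    }
}
```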

Update: Based on your comment, it seems you actually want to clean excessive whitespace out of your file. In that case, you should use XSLT. Something like the following will create a copy of your file with leading, trailing and repeated whitespace removed from all text nodes and attribute values (untested; if you are new to XSLT, I suggest reading up on it through any of the many tutorials or books):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

    <xsl:strip-space elements="*"/>

    <xsl:template match="node()">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="@*">
        <xsl:attribute name="{local-name()}" namespace="{namespace-uri()}">
            <xsl:value-of select="normalize-space(.)" />
        </xsl:attribute>
    </xsl:template>

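    <!-- priority="1" makes this rule win over match="node()", which has the same default priority -->
    <xsl:template match="text()" priority="1">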
        <xsl:value-of select="normalize-space(.)" />
    </xsl:template>
</xsl:stylesheet>
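To run a stylesheet like this from Java, the JDK's javax.xml.transform API is enough. The sketch below is an assumption about how that could look, not a tested recipe for your system: `StripWhitespaceXslt`, `transform` and `REDUCED_STYLESHEET` are illustrative names, and the embedded stylesheet is a shortened variant (identity copy plus the text() rule) used only to keep the example compact. Pass the text of the full stylesheet as the second argument to get the attribute handling as well.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StripWhitespaceXslt {
    // Shortened stylesheet for illustration: copies everything, but
    // normalizes the whitespace of every text node.
    static final String REDUCED_STYLESHEET =
          "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:strip-space elements='*'/>"
        + "<xsl:template match='@* | node()'>"
        + "<xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match='text()' priority='1'>"
        + "<xsl:value-of select='normalize-space(.)'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the given stylesheet text to the given XML text and returns
    // the serialized result without an XML declaration.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<bookstore>\n  <book category=\"cooking\">\n"
                + "    <title lang=\"en\"> Everyday Italian</title>\n"
                + "    <year> 2005 </year>\n  </book>\n</bookstore>";
        System.out.println(transform(xml, REDUCED_STYLESHEET));
    }
}
```

For files on disk, the StreamSource and StreamResult constructors that take a java.io.File can replace the string readers and writers.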
Abel
  • Could you elaborate? How do I tell XPath to search through the XML file? – Paulo Robles Sep 22 '15 at 22:42
  • @PauloRobles, that's quite a different question. That's essentially what XPath is about. Any XPath expression "searches" through the XML file. I.e. `//*` will return all elements at any depth. `//text()` will return all text at any depth, `//text()[normalize-space(.) != .]` will return all text nodes that have excessive spaces and `/bookstore/book[@category = 'cooking']` will return all books in that category. – Abel Sep 22 '15 at 22:46
  • @PauloRobles, I have updated my answer with a suggestion on how to fully remove excessive whitespace using XSLT. XPath alone is good for querying, but will not change your input document. Use XSLT if you want to change it. PS: a good place to test your XPath is http://xpathtester.com, and for XSLT http://xsltransform.net. – Abel Sep 22 '15 at 22:53
  • Thanks, I actually seem to have found a JavaScript example that would help me do this... – Paulo Robles Sep 22 '15 at 23:10
  • @paulo, you said Java in your question; JavaScript is something entirely different... – Abel Sep 22 '15 at 23:25
  • @Abel. *Trim* isn't the same as `normalize-space()`, because the former must leave any internal whitespace unchanged and the latter generally violates this requirement. See this answer: http://stackoverflow.com/a/4409546/36305 . The FXSL code for `f:trim()` is here: http://fxsl.cvs.sourceforge.net/viewvc/fxsl/fxsl-xslt2/f/func-trim.xsl?revision=1.1&view=markup – Dimitre Novatchev Sep 26 '15 at 20:19
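Following up on the last comment: if internal whitespace must stay untouched and only the leading and trailing whitespace should go, one option on the Java side is to select the text nodes with XPath and trim them in the DOM. This is a rough sketch under that assumption; `TrimTextNodes` and `trimmed` are made-up names, and it is not the FXSL `f:trim()` function mentioned in the comment.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class TrimTextNodes {
    // Hypothetical helper: parses the XML and strips leading and trailing
    // whitespace from every text node, leaving internal whitespace as it is.
    static Document trimmed(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // "//text()" selects every text node at any depth, so the code
            // keeps working when the element names change.
            NodeList texts = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//text()", doc, XPathConstants.NODESET);
            for (int i = 0; i < texts.getLength(); i++) {
                Node text = texts.item(i);
                // String.trim() drops characters up to U+0020, which covers
                // the XML whitespace characters (space, tab, LF, CR).
                text.setNodeValue(text.getNodeValue().trim());
            }
            return doc;
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException e) {
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = trimmed("<book><title> Everyday   Italian </title></book>");
        // Brackets make any leftover whitespace visible.
        System.out.println("[" + doc.getDocumentElement().getTextContent() + "]");
    }
}
```

Whitespace-only text nodes (indentation between elements) end up as empty strings with this approach, and attribute values are not touched.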