My application isn't creating content nodes for empty tokens in a tokenized string, so I'm wondering whether the problem is in the app or in my XSLT.

I have a string in a table row like:

|0001|United Health Foundation|10 Circle LN||New York|NY|

and another like this:

|0002|Red Cross|20 Bender LN|Suite 20|New York|NY|

So I'm tokenizing on the '|':

<xsl:template match="/">
  <xsl:apply-templates select="//tr" />
</xsl:template>

<xsl:template match="tr">
  <xsl:for-each select=".">
    <xsl:variable name="part" select="str:tokenize(.,'|')" />
    <document>
      <content name="id">
        <xsl:value-of select="$part[1]" />
      </content>
      <content name="bizName">
        <xsl:value-of select="$part[2]" />
      </content>
      <content name="street1">
        <xsl:value-of select="$part[3]" />
      </content>
      <content name="street2">
        <xsl:value-of select="$part[4]" />
      </content>
      <content name="city">
        <xsl:value-of select="$part[5]" />
      </content>
      <content name="state">
        <xsl:value-of select="$part[6]" />
      </content>
    </document>
  </xsl:for-each>
</xsl:template>

The problem is that the first row has an empty string for the 'street2' field (||). Everything after it shifts one position to the left: street2 gets the city value, city gets the state value, and so on.

Can someone recommend a way to fix this, or is this most likely something in my application?

Thanks for any help

rally_point
  • Maybe I'm missing something... what namespace is the "str" prefix associated with? Also, is this XSLT 1.0 or 2.0? (It would help to see the top-level `xsl:stylesheet` or `xsl:transform` element) – Cheran Shunmugavel Jul 08 '14 at 07:00
  • There's a quirk in the EXSLT tokenize function: it skips blank results. The code for the function is here, if you want to see for yourself: http://www.exslt.org/str/functions/tokenize/str.tokenize.function.xsl. You could adapt that code, or the code from michael's answer. – Flynn1179 Jul 08 '14 at 09:46

2 Answers

That's how the EXSLT str:tokenize() function works: empty tokens are not returned. If you want a different result, use a recursive template instead. See, for example: How to split string in XML
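
To illustrate the recursive approach, here is a minimal XSLT 1.0 sketch. It is not the code from the linked answer. The template name `split` and the element name `token` are made up for this example. It also assumes that your processor supports `exsl:node-set()` (likely, since you already use EXSLT) and that every row's text starts with '|', which is stripped before splitting so the original positions still apply.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- Hypothetical helper: emits one <token> per field, empty fields included -->
  <xsl:template name="split">
    <xsl:param name="text"/>
    <xsl:param name="delimiter" select="'|'"/>
    <xsl:choose>
      <xsl:when test="contains($text, $delimiter)">
        <token><xsl:value-of select="substring-before($text, $delimiter)"/></token>
        <!-- Recurse on whatever follows the first delimiter -->
        <xsl:call-template name="split">
          <xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
          <xsl:with-param name="delimiter" select="$delimiter"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- Last field (empty when the row ends with the delimiter) -->
        <token><xsl:value-of select="$text"/></token>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="/">
    <xsl:apply-templates select="//tr"/>
  </xsl:template>

  <xsl:template match="tr">
    <!-- Assumption: the row text starts with '|'; drop it so field 1 is the id -->
    <xsl:variable name="fields">
      <xsl:call-template name="split">
        <xsl:with-param name="text" select="substring-after(normalize-space(.), '|')"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- Convert the result tree fragment to a node-set so it can be indexed -->
    <xsl:variable name="part" select="exsl:node-set($fields)/token"/>
    <document>
      <content name="id"><xsl:value-of select="$part[1]"/></content>
      <content name="bizName"><xsl:value-of select="$part[2]"/></content>
      <content name="street1"><xsl:value-of select="$part[3]"/></content>
      <content name="street2"><xsl:value-of select="$part[4]"/></content>
      <content name="city"><xsl:value-of select="$part[5]"/></content>
      <content name="state"><xsl:value-of select="$part[6]"/></content>
    </document>
  </xsl:template>
</xsl:stylesheet>
```

With the 0001 row from the question, the fourth token is an empty element, so street2 comes out empty and city and state keep their positions.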

michael.hor257k

You appear to be using the EXSLT extension function str:tokenize() rather than the XPath 2.0 function fn:tokenize(). If you used fn:tokenize(), I think your code would do what you expect. I'm surprised that your implementation of str:tokenize() behaves this way, but the function is rather informally specified and there are no conformance tests, so you're just unlucky in your choice of implementation.

Note that in an XPath 2.0 regular expression (as defined for fn:tokenize()), "|" needs to be escaped as "\|".
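
As a rough sketch of what that could look like, assuming you can switch to an XSLT 2.0 processor: fn:tokenize() returns a zero-length string when the input starts or ends with a separator, so with rows that begin with '|' the first token would be empty. The example below strips that leading '|' first (an assumption about your data) so the positions from the question still line up.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <xsl:apply-templates select="//tr"/>
  </xsl:template>

  <xsl:template match="tr">
    <!-- '\|' escapes the pipe, which is otherwise regex alternation.
         Assumption: each row starts with '|', removed here via substring-after. -->
    <xsl:variable name="part"
        select="tokenize(substring-after(normalize-space(.), '|'), '\|')"/>
    <document>
      <content name="id"><xsl:value-of select="$part[1]"/></content>
      <content name="bizName"><xsl:value-of select="$part[2]"/></content>
      <content name="street1"><xsl:value-of select="$part[3]"/></content>
      <!-- Adjacent separators (||) yield a zero-length string at position 4 -->
      <content name="street2"><xsl:value-of select="$part[4]"/></content>
      <content name="city"><xsl:value-of select="$part[5]"/></content>
      <content name="state"><xsl:value-of select="$part[6]"/></content>
    </document>
  </xsl:template>
</xsl:stylesheet>
```

The trailing '|' produces one extra empty token at position 7, which this stylesheet simply ignores.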

Michael Kay