
I would like to tokenize a string using a multi-character delimiter. I am using str:tokenize(), but it seems the delimiter can only be a single character: even when the delimiter consists of several characters, tokenize() splits on each of the characters listed in it individually.

E.g.

InputString : "first|second|third|@@|First|Second|Third"  
Delimiter : |@@|  
str:tokenize($InputString, '|@@|') 

but this returns 6 rows instead of 2.

I need the result in a for-each because of further operations on the tokenized "sentences".

What am I doing wrong with str:tokenize()?

michael.hor257k
Matt
  • Are you restricted to an XSLT 1 processor and EXSLT extensions? XSLT 2 or 3 relying on XPath 2 or 3 have the `tokenize` function you could use, although given that its second argument is a regular expression pattern where characters like `|` are meta characters you would need to make sure you run your delimiter through an escape mechanism e.g. `tokenize('"first|second|third|@@|First|Second|Third', '\|@@\|')` finds two tokens: https://xqueryfiddle.liberty-development.net/6qM2e2e. http://www.xsltfunctions.com/xsl/functx_escape-for-regex.html is a function to escape any characters that need it. – Martin Honnen Nov 18 '18 at 14:24
  • I'm on Xalan with XSLT 1, and "\" doesn't work :( – Matt Nov 18 '18 at 23:00

2 Answers


The result you get conforms to the specification for the str:tokenize function:

The second argument is a string consisting of a number of characters. Each character in this string is taken as a delimiting character. The string given by the first argument is split at any occurrence of any of these characters.


What you want to do is split the given string at any occurrence of a given pattern instead. For this, you need to use the str:split function - if your processor supports it.
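As a minimal sketch of the difference, assuming your processor implements the EXSLT str:split function: the stylesheet below splits the sample string from the question at each occurrence of the whole pattern. The `InputString` variable and the `rows`/`row` element names are only illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes the processor implements EXSLT str:split. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:str="http://exslt.org/strings"
    exclude-result-prefixes="str">

    <!-- Sample value taken from the question. -->
    <xsl:variable name="InputString"
        select="'first|second|third|@@|First|Second|Third'"/>

    <xsl:template match="/">
        <!-- "rows" and "row" are hypothetical element names. -->
        <rows>
            <!-- str:split treats '|@@|' as one pattern, not as a set of characters. -->
            <xsl:for-each select="str:split($InputString, '|@@|')">
                <row><xsl:value-of select="."/></row>
            </xsl:for-each>
        </rows>
    </xsl:template>
</xsl:stylesheet>
```

With a conforming implementation this should produce two `row` elements, `first|second|third` and `First|Second|Third`. Calling str:tokenize with the same arguments instead splits at every `|` and every `@`, which is where the six tokens come from.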

michael.hor257k

Thanks for the advice.

I am attaching example code that matches my needs; maybe it will save someone some time. It takes an element and parses it according to two delimiters, and it renders all cells, including empty ones (even when the last cell is empty). If the second cell/column is empty, the whole row is skipped.

XML:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <cd>
        <title>firs t|sec ond|third|@@|First|Second|Third</title>
        <artist>Bob Dylan</artist>
    </cd>
    <cd>
        <title>1||3|@@|4|5|6|@@|7|8|</title>
        <artist>Bob Dylan</artist>
    </cd>
    <cdd>
        <title>1|2|@@|3|4|@@||</title>
        <artist>Bob Dylan</artist>
    </cdd>
</catalog>

XSL:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:str="http://exslt.org/strings"
    xmlns:exsl="http://exslt.org/common" exclude-result-prefixes="str exsl">
    <!-- Edit these parameters if necessary. -->
    <xsl:param name="rowDelimiter" select="'|@@|'"/>
    <xsl:param name="columnDelimiter" select="'|'"/>
    <!-- Edit these parameters if necessary. -->

    <xsl:template match="/">
        <html> 
            <body>
                <h2>My test</h2>
                <table border="1">
                    <tr bgcolor="#9acd32">
                        <th style="text-align:left">Title</th>
                    </tr>

                    <xsl:for-each select="catalog/cd">
                        <xsl:variable name="ercm_rows">
                            <xsl:call-template name="splitStringToRows">
                                <xsl:with-param name="list" select="title" />
                                <xsl:with-param name="delimiter" select="$rowDelimiter"/>
                            </xsl:call-template>
                        </xsl:variable>


                        <xsl:for-each select="exsl:node-set($ercm_rows)/ercm_row">
                                  <!-- Skip the row when its second column is missing or empty. -->
                                  <xsl:if test="string(ercm_column[2]) != ''"> 
                            <tr>
                                <xsl:for-each select="./ercm_column">
                                    <td><xsl:value-of select="."/></td>
                                </xsl:for-each>
                            </tr>
                            </xsl:if> 
                        </xsl:for-each>

                    </xsl:for-each>
                </table>


            </body>
        </html>
    </xsl:template>

   <!--  ROWS SPLIT --> 
    <xsl:template name="splitStringToRows">
        <xsl:param name="list" />
        <xsl:param name="delimiter" />


        <xsl:variable name="newlist">
            <xsl:choose>
                <xsl:when test="contains($list, $delimiter)">
                    <xsl:value-of select="$list" />
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="concat($list, $delimiter)"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:variable>

        <xsl:variable name="first" select="substring-before($newlist, $delimiter)" />
        <xsl:variable name="remaining" select="substring-after($newlist, $delimiter)" />
        <ercm_row>
            <xsl:call-template name="splitStringToColumns">
                    <xsl:with-param name="list" select="$first" />
                    <xsl:with-param name="delimiter" select="$columnDelimiter"/>
                </xsl:call-template>
        </ercm_row>


        <xsl:if test="$remaining">
            <xsl:call-template name="splitStringToRows">
                <xsl:with-param name="list" select="$remaining" />
                <xsl:with-param name="delimiter" select="$delimiter"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>

    <!--  COLUMNS SPLIT -->
    <xsl:template name="splitStringToColumns">
        <xsl:param name="list" />
        <xsl:param name="delimiter" />

        <xsl:variable name="newlist">
            <xsl:choose>
                <xsl:when test="contains($list, $delimiter)">
                    <xsl:value-of select="$list" />
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="concat($list, $delimiter)"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:variable>

        <xsl:variable name="first" select="substring-before($newlist, $delimiter)" />
        <xsl:variable name="remaining" select="substring-after($newlist, $delimiter)" />
        <ercm_column>
              <xsl:value-of select="$first"/>
        </ercm_column>

        <!-- Recurse while the input still contains a delimiter; testing $remaining instead would drop a trailing empty cell. -->
        <xsl:if test="contains($list, $delimiter)" >
            <xsl:call-template name="splitStringToColumns">
                <xsl:with-param name="list" select="$remaining" />
                <xsl:with-param name="delimiter" select="$delimiter"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>



</xsl:stylesheet>

screenshot of Output vs XML

Matt
  • So which XSLT processor do you use that supports `str:tokenize` but not `str:split`? – michael.hor257k Nov 18 '18 at 22:31
  • tokenize does not work for me. It checks character by character and returns only the non-delimiter parts. If the first column is an empty string, all columns get shifted, i.e. the second column ends up as the first one. – Matt Nov 18 '18 at 22:42
  • I find this very confusing. Your question is about one thing, your answer seems to be about something else altogether. I am not sure how this is supposed to help anyone. – michael.hor257k Nov 18 '18 at 22:54
  • It is simple: you said to use str:split instead of tokenize, which was good advice. With it I got two rows instead of six. The code then goes further and splits each row into columns using a different delimiter. I looked for similar code without success. This might be useful when only a string inside an XML element is available and we need to turn it into nodes to work with. I am not sure whether there is a simpler solution for XSLT 1, but this one works well for me and may help the next person. – Matt Nov 18 '18 at 23:18
  • Splitting the row to columns can be accomplished using either `str:tokenize` or `str:split`. If you have issues with some cells being empty, you should have posted this as a separate question (which would have been closed as a duplicate - for example, of https://stackoverflow.com/questions/23597058/how-to-split-string-in-xml). – michael.hor257k Nov 19 '18 at 00:19