
I'm trying to assign a variable a certain token from a large string. I first tokenize the string, then for each token I check if it contains a certain substring. If it does, I want to assign that token to the variable.

Lastly, I use that variable to set an attribute of a div.

I've tried the code below, which gives me the exact output I want in oXygen XML Editor. However, when I run the XML/XSLT file in IE 11, it just prints out the entire original string, meaning xhtmlVar in the XSLT below. The div doesn't even show up (it might be there in the DOM, but I don't see it; I'll recheck this momentarily).

XSLT

<!-- xhtmlVar variable is a large string -->    
<xsl:variable name="xhtmlVar"  select="metadata[@element='xhtml_head_item']"></xsl:variable>

<xsl:variable name="quoteChar">"</xsl:variable> <!-- for cleaning token below -->
<xsl:variable name="tokenized" select="tokenize($xhtmlVar,' ')"/>
<xsl:variable name="doi">
     <xsl:for-each select="$tokenized">
          <xsl:variable name="curtoken" select="."/>
          <!-- if token contains the string 'doi', assign it to the variable -->
          <xsl:if test="contains($curtoken, 'doi')">
                 <!-- return value while stripping some stuff (token looks like this: doi:asdasdasd") -->
                 <xsl:value-of select="translate(replace($curtoken, 'doi:', ''),$quoteChar,'')"></xsl:value-of>        
          </xsl:if>
     </xsl:for-each>
</xsl:variable>

<!-- pass $doi variable as attribute value in a div -->
<div type='medium' class='embed' handle='{$doi}'></div>

How can I achieve what I want? Am I doing something wrong? Any tips on how to write the code above more gracefully are also appreciated!

Thanks in advance!


Update:

I've changed my code to use EXSLT, as suggested by Martin Honnen below.

However, now the tokenize template seems to simply remove the specified delimiter instead of actually using it as a delimiter. Also, I can't figure out how to use a whitespace as a delimiter:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:exsl="http://exslt.org/common"
    xmlns:str="http://exslt.org/strings"
    extension-element-prefixes="exsl str"
    exclude-result-prefixes="xs"
    version="1.0">

    <xsl:import href="str/str.xsl" />
    <xsl:import href="str.tokenize/str.tokenize.template.xsl" />

...

        <xsl:variable name="quoteChar">"</xsl:variable>
        <xsl:variable name="spaceChar"> </xsl:variable>
        <xsl:variable name="tokenized">
            <xsl:call-template name="str:tokenize">
                <xsl:with-param name="string" select="$xhtmlVar" />
                <xsl:with-param name="delimiters" select="','" />
            </xsl:call-template>
        </xsl:variable>

             <!-- prevent tree fragment error with exsl:node-set -->
            <xsl:for-each select="exsl:node-set($tokenized)">
                <xsl:variable name="curtoken" select="."/>
                <xsl:value-of select="$curtoken"/>
                <xsl:text> Ha </xsl:text> 

                <!--  Nevermind checking if each token contains what I want for now... 
                <xsl:if test="contains($curtoken, 'doi')">
                    <xsl:value-of select="translate(str:replace($curtoken, 'doi:', ''),$quoteChar,'')"></xsl:value-of>        
                </xsl:if>-->
            </xsl:for-each>

Instead of printing out each token separated by the word "Ha", the code above will print out the entire string (every token) but the comma delimiter "," will be removed. "Ha" then appears at the very end. Am I perhaps using the node-set function incorrectly?

Also, if I try to use delimiters like $spaceChar or an entire word, such as 'than', I often get something along the lines of a "template instruction stack overflow" error.


The code from michael.hor257k's answer works.

Using str:replace() like so

<xsl:value-of select="translate(str:replace($curtoken, 'doi:', ''),$quoteChar,'')"/>

gives me this error in oXygen XML Editor, though:

    Severity: fatal
    Description: java.lang.NoSuchMethodException: For extension function, could not find method org.apache.xalan.lib.ExsltStrings.replace([ExpressionContext,] #NODESET, #STRING, #STRING).
    Checked both static and instance methods. - For extension function, could not find method org.apache.xalan.lib.ExsltStrings.replace([ExpressionContext,] #NODESET, #STRING, #STRING).
    Checked both static and instance methods.
LazerSharks
  • The function `fn:tokenize` is an XSLT 2.0 function. Browsers only do XSLT 1.0, unless you use [Frameless](http://frameless.io/xslt/) or [Saxon CE](http://www.saxonica.com/ce/index.xml), which are both JavaScript libs to use from within the browser. Also: XSLT 1.0 is notoriously bad with strings, but examples on the web on tokenization in XSLT 1.0 are not that hard to find. – Abel Aug 06 '14 at 19:43
  • @Abel Ah, thanks for your input! Thanks for pointing me to these resources, they look promising. I'll look into these. – LazerSharks Aug 06 '14 at 20:54
  • @Abel I've taken a look at the Frameless and Saxon libraries but need some help clarifying how they work. If I import them into my page, I need to do a little bit more legwork translating my code above before it can run correctly right? Or should I not need to do much extra? – LazerSharks Aug 06 '14 at 23:17
  • I think that warrants a new question, it is not trivial enough to answer here in the comment thread. Also, Michael Kay, owner of Saxon, is very active in this forum, chances are that he will spot your question if you include Saxon-CE in the title. – Abel Aug 07 '14 at 16:44

2 Answers


tokenize() is not supported in XSLT/XPath 1.0, and that version is all that browsers support. If you want to use XSLT 2.0 in the browser, then you need to look into Saxon CE.

In IE, the XSLT implementation is MSXML, and for that http://exslt.org/str/functions/tokenize/index.html provides a tokenize implementation written in JScript.
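
For this particular task, tokenizing may not be needed at all. As a minimal sketch in plain XSLT 1.0 (no extensions), assuming the value of interest always follows the literal prefix doi: and runs up to the next space, the core string functions can pull it out directly. The sample string in the xhtmlVar parameter is made up for illustration; in the question it would come from metadata[@element='xhtml_head_item'].

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical sample value standing in for the question's xhtmlVar -->
  <xsl:param name="xhtmlVar">name="citation" content="doi:10.1234/abcd" other</xsl:param>

  <xsl:template match="/">
    <xsl:variable name="quoteChar">"</xsl:variable>

    <!-- Everything after the first 'doi:' in the string -->
    <xsl:variable name="afterPrefix" select="substring-after($xhtmlVar, 'doi:')"/>

    <!-- Cut at the next space; the appended space covers the case where
         the value is the last thing in the string -->
    <xsl:variable name="rawToken"
        select="substring-before(concat($afterPrefix, ' '), ' ')"/>

    <!-- Strip any double quotes left on the token -->
    <xsl:variable name="doi" select="translate($rawToken, $quoteChar, '')"/>

    <div type="medium" class="embed" handle="{$doi}"></div>
  </xsl:template>
</xsl:stylesheet>
```

With the sample value, the handle attribute should come out as 10.1234/abcd. If doi: never occurs in the string, the attribute is simply empty.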

Martin Honnen

Re your updated question:

If your processor supports the EXSLT str:tokenize() extension function, then:

  1. you don't need to import anything;
  2. you can (and should) use the function, not the template; and
  3. the result of the function is already a node-set.

Try the following stylesheet as a test (it will work with any input):

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:str="http://exslt.org/strings"
extension-element-prefixes="str">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:param name="input" select="'some,comma,delimited,string'" />

<xsl:template match="/">
    <xsl:variable name="tokenized" select="str:tokenize($input, ',')" />
    <output>
        <xsl:for-each select="$tokenized">
            <xsl:value-of select="." />
            <xsl:text> Ha </xsl:text> 
        </xsl:for-each>
    </output>
</xsl:template>

</xsl:stylesheet>
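
Applied to the original question, the same function could be combined with a predicate instead of a for-each loop. This is only a sketch, assuming a processor that supports str:tokenize() and a token shaped like the one described in the question (doi:asdasdasd"); the sample value of xhtmlVar is made up. It uses substring-after() in place of replace(), so it does not depend on str:replace().

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:str="http://exslt.org/strings"
    extension-element-prefixes="str">
  <xsl:output method="html"/>

  <!-- Hypothetical sample value standing in for the question's xhtmlVar -->
  <xsl:param name="xhtmlVar">name="citation" content="doi:10.1234/abcd" other</xsl:param>

  <xsl:template match="/">
    <xsl:variable name="quoteChar">"</xsl:variable>

    <!-- Split on spaces, keep the first token that contains 'doi:' -->
    <xsl:variable name="doiToken"
        select="str:tokenize(string($xhtmlVar), ' ')[contains(., 'doi:')][1]"/>

    <!-- Drop everything up to and including 'doi:', then strip double quotes -->
    <xsl:variable name="doi"
        select="translate(substring-after($doiToken, 'doi:'), $quoteChar, '')"/>

    <div type="medium" class="embed" handle="{$doi}"></div>
  </xsl:template>
</xsl:stylesheet>
```

The [1] keeps only the first matching token, so a string with several doi: occurrences does not concatenate them into one attribute value.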
michael.hor257k
  • @Gnuey "*Does it make a difference?*" Probably. AFAIK, the str:tokenize() extension function is **not** supported by Xalan-C (if that's what you're using). Still, you should be getting either an error or at least an empty `<output>` element as the result, I think. – michael.hor257k Aug 08 '14 at 17:55
  • Whoops, deleted my comment above because I found it works after fixing some clerical errors! The code worked splendidly for me after I used `<xsl:variable>` to create a variable within the template to pass to tokenize(). Is using `xsl:param` better in some way? Also, `str:tokenize` works, but `str:replace` does not (see error in the new update to question). I tried importing again, but I get the same error. Do I need to do anything else to use str:replace()? Since `tokenize` and `replace` are within the same exslt string library, it seems I should be able to use them in the same situation. – LazerSharks Aug 08 '14 at 19:05
  • Ah, this thread I found (http://stackoverflow.com/questions/3067113/xslt-string-replace) states that `replace()` isn't available in XSLT 1.0. Using the template provided in the above thread, my code works. If true, it seems strange that `str:tokenize` is supported but not `str:replace`. – LazerSharks Aug 08 '14 at 19:36
  • @Gnuey "*Since tokenize and replace are within the same exslt string library, it seems I should be able to use them in the same situation.*" No. Each processor implements some subset of EXSLT extension functions and elements, as the programmers saw fit, regardless of the EXSLT module. AFAIK, there's no processor that supports str:replace(). You need to use a named recursive template for that - see, for example: http://stackoverflow.com/questions/24995282/xsl-for-converting-xml-to-csv-adding-quotes-to-the-end-based-on-data-field/25003101#25003101 – michael.hor257k Aug 08 '14 at 19:54
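
For reference, a named recursive template of the kind mentioned in the last comment might look like the sketch below. It is not taken from either linked thread; the template name replace-all and its parameter names are made up for illustration, and it uses only core XSLT 1.0 functions.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: replaces every occurrence of $search in $text -->
  <xsl:template name="replace-all">
    <xsl:param name="text"/>
    <xsl:param name="search"/>
    <xsl:param name="replacement"/>
    <xsl:choose>
      <!-- Guard against an empty search string, which would recurse forever
           because contains($text, '') is always true -->
      <xsl:when test="$search != '' and contains($text, $search)">
        <xsl:value-of select="substring-before($text, $search)"/>
        <xsl:value-of select="$replacement"/>
        <!-- Recurse on the remainder of the string -->
        <xsl:call-template name="replace-all">
          <xsl:with-param name="text" select="substring-after($text, $search)"/>
          <xsl:with-param name="search" select="$search"/>
          <xsl:with-param name="replacement" select="$replacement"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Example call with a made-up token -->
  <xsl:template match="/">
    <xsl:call-template name="replace-all">
      <xsl:with-param name="text" select="'doi:10.1234/abcd'"/>
      <xsl:with-param name="search" select="'doi:'"/>
      <xsl:with-param name="replacement" select="''"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

To use the result in an attribute, the call-template would go inside an xsl:variable, and the variable would then be referenced as in the original handle='{$doi}'.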