
The value 'sth' should consist of letters only, or, as a regular expression, match [A-Z]. How can I combine XSLT with regular expressions?

<xsl:if test="contains(//ns0:elem/value, 'sth')">

</xsl:if>
marko

3 Answers


XPath/XSLT 1.0 does not support regular expressions, but simple validation can be performed using the basic string functions.

Whitelisting

The XPath 1.0 translate function can be used to simulate a whitelist:

<xsl:variable name="alpha" 
              select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:if test="string-length(translate(., $alpha, '')) &gt; 0">
    <!-- context node contains non-alpha characters -->
</xsl:if>

The test uses translate to first remove all upper- and lower-case letters. If the resulting string's length is non-zero, then the original string must have contained additional characters.

Note that the expression above could be simplified to:

<xsl:if test="translate(., $alpha, '')">

... because any non-empty string evaluates to true.
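As a minimal sketch of how the whitelist test might sit inside a complete stylesheet: the element name `item` and the output wording are assumptions made for illustration, not part of the original question.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="alpha"
                select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <!-- "item" is a hypothetical element name used only for this sketch -->
  <xsl:template match="item">
    <xsl:value-of select="."/>
    <xsl:choose>
      <!-- Non-empty after deleting all letters: something else was present -->
      <xsl:when test="translate(., $alpha, '')">
        <xsl:text>: contains non-alpha characters&#10;</xsl:text>
      </xsl:when>
      <!-- Empty after deleting all letters: letters only, or an empty string -->
      <xsl:otherwise>
        <xsl:text>: letters only (or empty)&#10;</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Suppress stray text nodes between elements -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Note that an empty string also passes this test, since there is nothing left to remove. If empty values should be rejected, the condition can be combined with `string-length(.) &gt; 0`.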

Blacklisting

Use the double-translate method to treat $alpha as a blacklist:

<xsl:if test="translate(., translate(., $alpha, ''), '')">
    <!-- context-node contains characters on the blacklist (in $alpha) -->
</xsl:if>

The inner translate returns the string with all of its alpha characters removed. That result is then used as the second argument of the outer translate call, which deletes those remaining characters from the original and leaves a string containing only the alpha characters. If this string is non-empty, then we found a character on the blacklist. This is a classic approach that has come up in previous questions on SO.
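To make the two steps concrete, here is a small walk-through that splits the double translate into separate variables. The sample value `'a1b2'` and the variable names `s`, `nonAlpha` and `onlyAlpha` are hypothetical, chosen only for this sketch.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="alpha"
                select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <!-- Hypothetical sample value, chosen only for this walk-through -->
  <xsl:variable name="s" select="'a1b2'"/>

  <!-- Step 1: delete every letter, leaving the non-alpha characters: '12' -->
  <xsl:variable name="nonAlpha" select="translate($s, $alpha, '')"/>

  <!-- Step 2: delete those leftover characters from the original,
       leaving only the letters: 'ab' -->
  <xsl:variable name="onlyAlpha" select="translate($s, $nonAlpha, '')"/>

  <xsl:template match="/">
    <xsl:value-of select="concat('non-alpha: ', $nonAlpha, '&#10;')"/>
    <xsl:value-of select="concat('alpha: ', $onlyAlpha, '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```

Since `$onlyAlpha` is non-empty here, a test of the form `translate($s, translate($s, $alpha, ''), '')` would evaluate to true for this value.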

A blacklist test could also be performed like this:

not(string-length(translate(., $alpha, ''))=string-length())

If the length of the string after removing all of the blacklisted characters is not equal to the length of the original string, then the string must have contained a character on the blacklist.

Summary

Blacklists and whitelists are really two sides of the same coin. The following demonstrates their usage together:

<xsl:if test="translate(., $alpha, '')">
    [contains some characters not on the list]
</xsl:if> 
<xsl:if test="not(translate(., $alpha, ''))">
    [contains only characters on the list]
</xsl:if> 
<xsl:if test="translate(., translate(., $alpha, ''), '')">
    [contains some characters on the list]
</xsl:if>
<xsl:if test="not(translate(., translate(., $alpha, ''), ''))">
    [contains only characters not on the list]
</xsl:if>
Wayne

XPath 1.0 (the version used by XSLT 1.0) has no regular expression support. It only offers string functions such as contains and starts-with (see the first link below).

XPath 2.0 (the version used by XSLT 2.0) provides the matches and replace functions (see the second link below).

Links:

XPath 1.0 string functions: http://www.w3.org/TR/xpath/#section-String-Functions

XPath 2.0 regex functions: http://www.w3.org/TR/xpath-functions/#string.match
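If an XSLT 2.0 processor is available, the letters-only check could be sketched with matches as follows. The element name `item` and the output wording are assumptions for illustration only.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "item" is a hypothetical element name used only for this sketch -->
  <xsl:template match="item">
    <xsl:value-of select="."/>
    <xsl:choose>
      <!-- ^ and $ anchor the pattern so the whole value must be letters -->
      <xsl:when test="matches(., '^[A-Za-z]+$')">
        <xsl:text>: letters only&#10;</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>: empty or contains other characters&#10;</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Suppress stray text nodes between elements -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Because the pattern uses `+`, an empty value does not match. This will not run on an XSLT 1.0 processor, which has no matches function.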

Vincent Biragnet
    I've got a solution and am using version 1.0. How do you search for letters in a string if you can't use regex in XSLT? – marko Dec 05 '11 at 14:18

How to search for letters in a string if you can't use regex in xslt?

Good question, +1.

Use:

translate(., translate(., $vAlpha, ''), '')

This produces all characters in the string value of the current node that are also in $vAlpha.

In case you want to see if a given string $str contains only letters and no other character, use:

string-length(translate($str, $vAlpha, '')) = 0

A complete code example:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:variable name="vUpper" select=
  "'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

 <xsl:variable name="vLower" select=
  "'abcdefghijklmnopqrstuvwxyz'"/>

 <xsl:variable name="vAlpha" select=
      "concat($vUpper, $vLower)"/>

 <xsl:variable name="vStr" select="'A12B_..c02d'"/>

 <xsl:template match="/">
  <xsl:value-of select=
    "translate($vStr,
               translate($vStr, $vAlpha, ''), '')
    "/>


    The string <xsl:value-of select="$vStr"/> has <xsl:text/>
    <xsl:value-of select=
    "string-length(translate($vStr, $vAlpha, ''))"/> <xsl:text/>
    non-letters

 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to any XML document (the input is not used), the wanted, correct result is produced:

ABcd


    The string A12B_..c02d has 7
    non-letters

Remember: The first XPath expression above demonstrates the so-called "double-translate method", first proposed by @Michael Kay.

Dimitre Novatchev