The value ("sth") should consist only of letters or, as a regular expression, [A-Z]. How can XSLT be combined with regular expressions?
<xsl:if test="contains(//ns0:elem/value, 'sth')">
</xsl:if>
XPath/XSLT 1.0 does not support regular expressions, but simple validation can be performed using the basic string functions.
Whitelisting
The XPath 1.0 translate function can be used to simulate a whitelist:
<xsl:variable name="alpha"
select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:if test="string-length(translate(., $alpha, '')) > 0">
<!-- context node contains non-alpha characters -->
</xsl:if>
The test uses translate to first remove all upper- and lower-case letters. If the resulting string's length is non-zero, then the original string must have contained additional characters.
Note that the expression above could be simplified to:
<xsl:if test="translate(., $alpha, '')">
... because any non-empty string evaluates to true.
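Applied to the original question, the whitelist test can be wrapped in a small stylesheet. This is only a sketch: the input shape (elem/value elements without a namespace) and the variable name $upper are assumptions made for illustration, and the list is limited to A-Z to mirror the [A-Z] in the question.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Whitelist: the only characters a value may contain -->
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <!-- Assumed input shape: <elem><value>...</value></elem>, no namespace -->
  <xsl:template match="/">
    <xsl:for-each select="//elem/value">
      <xsl:value-of select="."/>
      <xsl:choose>
        <!-- Non-empty, and nothing is left once the A-Z characters are removed -->
        <xsl:when test="string-length(.) &gt; 0 and not(translate(., $upper, ''))">
          <xsl:text>: only A-Z&#10;</xsl:text>
        </xsl:when>
        <xsl:otherwise>
          <xsl:text>: empty, or contains other characters&#10;</xsl:text>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The extra string-length check is there because an empty string also passes the bare translate test; drop it if empty values should count as valid.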
Blacklisting
Use the double-translate method to treat $alpha as a blacklist:
<xsl:if test="translate(., translate(., $alpha, ''), '')">
<!-- context-node contains characters on the blacklist (in $alpha) -->
</xsl:if>
The inner translate returns the string with all of its alpha characters removed. That result is then used as the second argument of the outer translate call, which strips the non-alpha characters from the original and leaves a string containing only the alpha characters. If this string is non-empty, then we found a character on the blacklist. This is a classic approach. See, for example, this previous question on SO:
A blacklist test could also be performed like this:
not(string-length(translate(., $alpha, ''))=string-length())
If the length of the string after removing all of the blacklisted characters is not equal to the length of the original string, then the string must have contained a character on the blacklist.
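The same length comparison can also report how many listed characters were found, since the difference between the two lengths is exactly the number of characters that translate removed. A minimal sketch, where $str and $hits are hypothetical names and the sample string is only an illustration:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="alpha"
      select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <!-- Hypothetical sample string; replace with a node or parameter as needed -->
  <xsl:variable name="str" select="'A12B_..c02d'"/>

  <xsl:template match="/">
    <!-- Characters removed by translate = characters that are on the list -->
    <xsl:variable name="hits"
        select="string-length($str) - string-length(translate($str, $alpha, ''))"/>
    <xsl:choose>
      <xsl:when test="$hits &gt; 0">
        <xsl:value-of select="concat($str, ' contains ', $hits, ' listed character(s)')"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="concat($str, ' contains no listed characters')"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

For the sample string the original length is 11 and the length after removing the letters is 7, so $hits is 4.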
Summary
Blacklists and whitelists are really two sides of the same coin. The following demonstrates their usage together:
<xsl:if test="translate(., $alpha, '')">
[contains some characters not on the list]
</xsl:if>
<xsl:if test="not(translate(., $alpha, ''))">
[contains only characters on the list]
</xsl:if>
<xsl:if test="translate(., translate(., $alpha, ''), '')">
[contains some characters on the list]
</xsl:if>
<xsl:if test="not(translate(., translate(., $alpha, ''), ''))">
[contains only characters not on the list]
</xsl:if>
XPath 1.0 (used by XSLT 1.0) has no regular-expression support; you only have string functions such as contains or starts-with (see the first link below).
XPath 2.0 (used by XSLT 2.0) provides the matches function and the replace function (see the second link below).
Links:
XPath 1.0 string functions : http://www.w3.org/TR/xpath/#section-String-Functions
XPath 2.0 regex functions : http://www.w3.org/TR/xpath-functions/#string.match
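If an XSLT 2.0 processor is available (Saxon, for example), the check from the question could be written directly with matches. The sketch below assumes the same hypothetical input shape as the question (elem/value elements, here without a namespace) and will not run on an XSLT 1.0 processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input shape: <elem><value>...</value></elem>, no namespace -->
  <xsl:template match="/">
    <xsl:for-each select="//elem/value">
      <xsl:value-of select="."/>
      <xsl:choose>
        <!-- Anchored pattern: the whole value must be one or more A-Z -->
        <xsl:when test="matches(., '^[A-Z]+$')">
          <xsl:text>: only A-Z&#10;</xsl:text>
        </xsl:when>
        <xsl:otherwise>
          <!-- replace() removes everything that is not A-Z -->
          <xsl:text>: letters only would be "</xsl:text>
          <xsl:value-of select="replace(., '[^A-Z]', '')"/>
          <xsl:text>"&#10;</xsl:text>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Without the ^ and $ anchors, matches succeeds as soon as any substring fits the pattern, so the anchors are what turn it into an "all letters" test.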
How to search for letters in a string if you can't use regex in xslt?
Good question, +1.
Use:
translate(., translate(., $vAlpha, ''), '')
This produces all characters in the string value of the current node that are in $vAlpha.
To see whether a given string $str contains only letters and no other characters, use:
string-length(translate($str, $vAlpha, '')) = 0
A complete code example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:variable name="vUpper" select=
"'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="vLower" select=
"'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="vAlpha" select=
"concat($vUpper, $vLower)"/>
<xsl:variable name="vStr" select="'A12B_..c02d'"/>
<xsl:template match="/">
<xsl:value-of select=
"translate($vStr,
translate($vStr, $vAlpha, ''), '')
"/>
The string <xsl:value-of select="$vStr"/> has <xsl:text/>
<xsl:value-of select=
"string-length(translate($vStr, $vAlpha, ''))"/> <xsl:text/>
non-letters
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to any XML document (the input is not used), the wanted, correct result is produced:
ABcd
The string A12B_..c02d has 7
non-letters
Remember: The first XPath expression above demonstrates the so-called "double-translate method", first proposed by @Michael Kay.