One of the toughest challenges I have ever faced in XSLT design:

How can I copy only the unique characters of a given string?
The test XML is:

<root>
<string>aaeerstrst11232434</string>
</root>

The output I am expecting is:

<string>aerst1234</string>
Dimitre Novatchev
Rookie Programmer Aravind
  • Not only does XSLT have recursion, it is a true functional programming language (even XSLT 1.0). Read about FXSL -- you'll find it interesting, useful and most powerful; it makes even the most challenging problems in XSLT easy. – Dimitre Novatchev Feb 15 '10 at 20:41

4 Answers

Use the following XPath one-liner:

codepoints-to-string(distinct-values(string-to-codepoints(.)))

A complete transformation using this is below:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>       
    <xsl:output method="text"/>

    <xsl:template match="string">
      <xsl:value-of select=
      "codepoints-to-string(distinct-values(string-to-codepoints(.)))
      "/>
    </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the originally provided XML document:

<root>
    <string>aaeerstrst11232434</string>
</root>

the wanted result is produced:

aerst1234

In case an XSLT 1.0 solution is needed -- please, indicate so and I'll provide it.
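
One caveat worth noting: the XPath 2.0 specification leaves the order of the values returned by distinct-values() implementation-dependent, so the one-liner is not guaranteed to keep the characters in first-occurrence order on every processor. Below is a minimal sketch of an order-preserving variant. The my: namespace and the function name my:unique-chars are made up for illustration and are not part of the original answer.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <!-- Hypothetical helper: name and namespace are illustrative only -->
    <xsl:function name="my:unique-chars" as="xs:string">
      <xsl:param name="s" as="xs:string?"/>
      <xsl:variable name="cps" select="string-to-codepoints($s)"/>
      <!-- Keep a codepoint only at the position of its first occurrence -->
      <xsl:sequence select=
      "codepoints-to-string($cps[index-of($cps, .)[1] = position()])"/>
    </xsl:function>

    <xsl:template match="string">
      <xsl:value-of select="my:unique-chars(.)"/>
    </xsl:template>
</xsl:stylesheet>
```

Applied to the same source document, this should again give aerst1234. The predicate compares every codepoint against the whole sequence, so the cost grows quadratically with the string length; for short strings that is unlikely to matter.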

Dimitre Novatchev

Here is an XSLT 1.0 solution:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:strip-space elements="*"/>
  <xsl:output method="text"/>

  <xsl:template match="string">
    <xsl:call-template name="unique">
      <xsl:with-param name="input" select="."/>
    </xsl:call-template>
  </xsl:template>

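  <!-- Recursive helper: $input is the text still to scan;
       $output accumulates each character at its first occurrence. -->
  <xsl:template name="unique">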
    <xsl:param name="input"/>
    <xsl:param name="output" select="''"/>
    <xsl:variable name="c" select="substring($input, 1, 1)"/>
    <xsl:choose>
      <xsl:when test="not($input)">
        <xsl:value-of select="$output"/>
      </xsl:when>
      <xsl:when test="contains($output, $c)">
        <xsl:call-template name="unique">
          <xsl:with-param name="input" select="substring($input, 2)"/>
          <xsl:with-param name="output" select="$output"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="unique">
          <xsl:with-param name="input" select="substring($input, 2)"/>
          <xsl:with-param name="output" select="concat($output, $c)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
Martin Honnen
  • I had wrongly assumed that templates cannot be used like functions, but you have proved that wrong. I am learning (not too late) about this kind of recursive call-template. Glad about it. :-) – Rookie Programmer Aravind Feb 15 '10 at 12:36

Here is an XSLT 1.0 solution, shorter than the currently selected answer and easier to write as it uses the str-foldl template of FXSL.

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:f="http://fxsl.sf.net/"
 exclude-result-prefixes="f">

 <xsl:import href="str-foldl.xsl"/>
 <xsl:output method="text"/>

 <f:addUnique/>

 <xsl:variable name="vFunAddunique" select=
  "document('')/*/f:addUnique[1]
  "/>

    <xsl:template match="string">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vFunAddunique"/>
        <xsl:with-param name="pA0" select="''"/>
        <xsl:with-param name="pStr" select="."/>
      </xsl:call-template>
    </xsl:template>

    <xsl:template match="f:addUnique" mode="f:FXSL">
      <xsl:param name="arg1"/>
      <xsl:param name="arg2"/>

      <xsl:value-of select="$arg1"/>
      <xsl:if test="not(contains($arg1, $arg2))">
       <xsl:value-of select="$arg2"/>
      </xsl:if>
    </xsl:template>
</xsl:stylesheet>

When the above transformation is applied to the originally provided source XML document:

<root>
    <string>aaeerstrst11232434</string>
</root>

the wanted result is produced:

aerst1234

Read more about FXSL 1.x (for XSLT 1.0) here, and about FXSL 2.x (for XSLT 2.0) here.
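
For comparison, the same left fold can be written without FXSL on newer processors. The sketch below assumes an XSLT 3.0 processor with higher-order function support; it uses the standard fold-left() function and an inline function in place of str-foldl and f:addUnique. It is an illustration of the folding idea, not part of the FXSL library.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="string">
      <!-- Split the string into one-character strings, then fold from the
           left: $acc is the result so far, $ch is the current character.
           $ch is appended only when $acc does not contain it yet. -->
      <xsl:value-of select="
        fold-left(
          string-to-codepoints(.) ! codepoints-to-string(.),
          '',
          function($acc, $ch) {
            if (contains($acc, $ch)) then $acc else concat($acc, $ch)
          })"/>
    </xsl:template>
</xsl:stylesheet>
```

The inline function plays the role of the f:addUnique template above: its first argument is the accumulator and its second is the next character.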

Dimitre Novatchev
  • By default, the XslCompiledTransform class disables support for the XSLT document() function and embedded scripting. These features can be enabled by creating an XsltSettings object that has the features enabled and passing it to the Load method. I had to modify the .NET code in order to use the document() function. – Rookie Programmer Aravind Mar 09 '10 at 09:20
  • If security is important (and it should be important almost always), it is good in this case to provide your own XmlResolver, so that it doesn't allow the use of arbitrary URLs, such as absolute file paths of XML files containing sensitive data. – Dimitre Novatchev Mar 09 '10 at 13:44

When I tried a more complicated XML document, I ran into many problems with Martin Honnen's solution: it does not work with the XML shown below. So I prepared my own solution, based on this answer by Dimitre, which I would also call more efficient:

Here is the input XML:

    <root>
      <string>aabcdbcd1abcdefghijklmanopqrstuvwxyzabcdefgh0123456789ijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz12312489796453134049446798421230156489413210315487804210313264046040489789789745648974321231564648971232344</string>
      <string2>oejrinsjfojofjweofj24798273492jfakjflsdjljk</string2>
    </root>

And here is the working XSLT code:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:call-template name="unique_chars">
      <xsl:with-param name="input" select="."/>
    </xsl:call-template>
  </xsl:template>

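  <!-- Emits a character only when it does not occur again later in $input,
       so each character is kept at its last occurrence. -->
  <xsl:template name="unique_chars">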
    <xsl:param name="input"/>
    <xsl:variable name="c">
      <xsl:value-of select="substring($input, 1, 1)"/>
    </xsl:variable>
    <xsl:choose>
      <xsl:when test="not($input)"/>
      <xsl:otherwise>
        <xsl:choose>
          <xsl:when test="contains(substring($input, 2), $c)">
            <xsl:call-template name="unique_chars">
              <xsl:with-param name="input" select="substring($input, 2)"/>
            </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="$c"/>
            <xsl:call-template name="unique_chars">
              <xsl:with-param name="input" select="substring($input, 2)"/>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
Rookie Programmer Aravind
  • A few questions: 1. What are the reasons you don't just simply use the solution that I posted? 2. Why do you treat the space (' ') in a special way? And a statement: When it is known that the number of unique characters is significantly smaller than the total number of characters in the string, then a much more efficient algorithm exists, than the ones shown on this page. I leave finding and implementing this algorithm as an exercise to you :) – Dimitre Novatchev Mar 01 '10 at 13:55
  • (1) I trigger the transformation from .NET code, and it doesn't allow the document() function. :-| (2) Treating the space character specially was not the goal; I had simply used it for testing and posted it unintentionally [edited to avoid confusion]. (3) I am happy to do the homework; I'll try my best to come up with more efficient code. :-) – Rookie Programmer Aravind Mar 01 '10 at 15:12
  • Hmmm... Where do I use the `document()` function? Aside from this, I have been using .NET XslCompiledTransform and XslTransform for many years and never had any problems with the document() function -- unless it is explicitly forbidden by settings and/or a URI-Resolver. – Dimitre Novatchev Mar 02 '10 at 02:05
  • @dimitre, it is working fine after changing the .NET code. Accepted your solution. Thanks for pointing out the flaw. :-) – Rookie Programmer Aravind Mar 02 '10 at 06:46
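
The comments above leave the "much more efficient algorithm" as an exercise and never spell it out. One possible reading, offered here only as a guess and not as the commenter's own solution, is to output the first character and then delete all of its occurrences from the string with translate(), so that the recursion depth equals the number of distinct characters rather than the length of the string. A minimal XSLT 1.0 sketch, with the template name unique-by-translate invented for illustration:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:strip-space elements="*"/>
  <xsl:output method="text"/>

  <xsl:template match="string">
    <xsl:call-template name="unique-by-translate">
      <xsl:with-param name="input" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical template name; one recursive call per distinct character -->
  <xsl:template name="unique-by-translate">
    <xsl:param name="input"/>
    <xsl:if test="$input">
      <xsl:variable name="c" select="substring($input, 1, 1)"/>
      <!-- Emit the first character ... -->
      <xsl:value-of select="$c"/>
      <!-- ... then remove every occurrence of it and continue with the rest -->
      <xsl:call-template name="unique-by-translate">
        <xsl:with-param name="input" select="translate($input, $c, '')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

For the original test document this should produce aerst1234: after "a" is written the remaining string is "eerstrst11232434", after "e" it is "rstrst11232434", and so on until nothing is left. It needs neither document() nor any extension function.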