
I want to select nodes based on some variables: within each prot, only the first node for each num value should be kept. The XML code:

<data>
    <prot seq="AAA">
        <node num="1">1345</node>
        <node num="1">11245</node>
        <node num="2">88885</node>
    </prot>
    <prot seq="BBB">
        <node num="1">678</node>
        <node num="1">456</node>
        <node num="2">6666</node>
    </prot>
    <prot seq="CCC">
        <node num="1">111</node>
        <node num="1">222</node>
        <node num="2">333</node>
    </prot>
</data>

The XML that I want:

<output>
    <prot seq="AAA">
        <node num="1">1345</node>
        <node num="2">88885</node>
    </prot>
    <prot seq="BBB">
        <node num="1">678</node>
        <node num="2">6666</node>
    </prot>
    <prot seq="CCC">
        <node num="1">111</node>
        <node num="2">333</node>
    </prot>
</output>

So my idea was to group the nodes with an xsl:key element and then loop over them with for-each. For example:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:key name="by" match="/data/prot" use="concat(@seq,'|',node/@num)"/>
    <xsl:template match="/">
        <root>
            <xsl:apply-templates select="/data/prot"/>
        </root>
    </xsl:template>
    <xsl:template match="/data/prot">
        <xsl:for-each select="./node">
            <xsl:for-each select="key('by',concat(current()/../@seq,'|',current()/@num))">
                node <xsl:value-of select="./node" />
            </xsl:for-each>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

But the output is not what I expected, and I cannot see what I am doing wrong. I would prefer to keep the for-each structure. It looks as if I am not using the xsl:key grouping features properly.

The unwanted output that I get:

<root>
    node 1345
    node 1345
    node 678
    node 678
    node 111
node 111</root>

The code, ready to be tested: http://www.xsltcake.com/slices/sgWUFu/20

Thanks!

Community
GWorking
  • As a side note -- do use a true XSLT processor and not some "cake" that may or may not be available whenever you need it and may be quite buggy. Good XSLT processors I recommend are: (XSLT 1.0) MSXML3,4,6, Saxon 6.5.4, .NET XslCompiledTransform, AltovaXML. (XSLT 2.0) Saxon 9.xx, XQSharp, AltovaXML. – Dimitre Novatchev Jan 15 '12 at 04:19
  • I am still working on the topic, that's why I have not answered yet. However, regarding your answer, do you mean the web client xsltcake that I use to do the tests? The processor I use in the code is Firefox; I mainly use this web app to simplify the problem and, if I cannot solve it, then post it here. – GWorking Jan 15 '12 at 20:42
  • Gerard, I wouldn't recommend xsltcake to any friend -- it is beta, unreliable (may not be available at any moment -- and an elementary DOS attack will put it down for a long time), it uses different XSLT processors producing different results (depending on the browser that sends the request) and sometimes produces ridiculous results. The authors themselves warn their users about the state and usefulness of this app. I recommend using a good XSLT IDE, such as the XSelerator or at least Kernow. – Dimitre Novatchev Jan 15 '12 at 20:55
  • Actually, since my processor will be Firefox, it is kind of appropriate that xsltcake uses the same one (as long as I access it with Firefox). But you're right about its alpha status. For me, having it available without installing anything is a very useful feature. Are there perhaps more web services like it in a more mature state? – GWorking Jan 16 '12 at 16:21
  • No, and the main reason, I guess, is that this is an invitation to a DOS attack. – Dimitre Novatchev Jan 16 '12 at 17:18
  • I'm the author of this tool, this Dimitre fella really hates it :) I only set out to find something people could use to quickly play with XSLT and share solutions to problems. Not as a fully fledged IDE. The idea that a DOS attack could take it down as a reason not to use it is a tad absurd; any website could be open to such an attack. Best stop using Stackoverflow, eh? Also the choice to use the browser's processor was because we felt that opened up the choice as to what processor is used a bit and would be good for people who wanted to test javascript based transforms. – joshcomley Feb 16 '12 at 14:36
  • If we didn't use websites just because they were in alpha/beta then how is any new site going to mature?! I think there is an ulterior reason for being so avidly against the tool that is not being mentioned. – joshcomley Feb 16 '12 at 14:37

1 Answer

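The main problem in your code is that the key indexes prot elements, but what we want to de-duplicate (and therefore need to index) are the node elements. In addition, in XPath 1.0 concat() converts the node-set node/@num to the string value of its first node only, so each prot is indexed under a single key such as AAA|1. That is why a lookup for num 2 finds nothing, and why the first node's value is printed once for each of the two num="1" nodes.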

Here is a short and correct solution:

<xsl:stylesheet version="1.0" 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="nodeByParentAndNum" match="node"
  use="concat(generate-id(..), '+', @num)"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/*">
  <data>
   <xsl:apply-templates/>
  </data>
 </xsl:template>

 <xsl:template match=
 "node
   [not(generate-id()
       =
        generate-id(key('nodeByParentAndNum',
                        concat(generate-id(..), '+', @num)
                        )
                         [1]
                    )
       )
   ]
 "/> 
</xsl:stylesheet>

when this transformation is applied on the provided XML document:

<data>
    <prot seq="AAA">
        <node num="1">1345</node>
        <node num="1">11245</node>
        <node num="2">88885</node>
    </prot>
    <prot seq="BBB">
        <node num="1">678</node>
        <node num="1">456</node>
        <node num="2">6666</node>
    </prot>
    <prot seq="CCC">
        <node num="1">111</node>
        <node num="1">222</node>
        <node num="2">333</node>
    </prot>
</data>

the wanted, correct result is produced:

<data>
   <prot seq="AAA">
      <node num="1">1345</node>
      <node num="2">88885</node>
   </prot>
   <prot seq="BBB">
      <node num="1">678</node>
      <node num="2">6666</node>
   </prot>
   <prot seq="CCC">
      <node num="1">111</node>
      <node num="2">333</node>
   </prot>
</data>
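
Since the question asks to keep the for-each structure, the same key can also be used inside nested xsl:for-each loops. The following is only a sketch under that assumption, not part of the solution above: it reuses the nodeByParentAndNum key, rebuilds each prot with a literal result element, and selects only the node elements that are the first in their key group. The layout (one template matching /data) is an illustrative choice.

```xslt
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Index node elements (not prot) by parent identity plus @num -->
 <xsl:key name="nodeByParentAndNum" match="node"
  use="concat(generate-id(..), '+', @num)"/>

 <xsl:template match="/data">
  <data>
   <xsl:for-each select="prot">
    <prot seq="{@seq}">
     <!-- Keep a node only if it is the first one in its key group -->
     <xsl:for-each select="node[generate-id() =
          generate-id(key('nodeByParentAndNum',
                          concat(generate-id(..), '+', @num))[1])]">
      <xsl:copy-of select="."/>
     </xsl:for-each>
    </prot>
   </xsl:for-each>
  </data>
 </xsl:template>
</xsl:stylesheet>
```

Applied to the same source document, this should yield the same data/prot/node structure as the result shown above; the exact indentation depends on the processor.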
Mads Hansen
Dimitre Novatchev