
Having answered a large number of XSLT questions here on Stack Overflow, I am more than familiar with the Muenchian grouping technique for grouping nodes during an XSL transformation.

The expression used therein is usually something like this:

*[generate-id() =
  generate-id(key('kSomeKey', .)[1])]

It almost invariably contains that [1], but this has always struck me as odd.

The XSLT 1.0 spec defines generate-id() as follows:

The generate-id function returns a string that uniquely identifies the node in the argument node-set that is first in document order.

(emphasis added)

It clearly states that the function operates on the first node in document order, and in this context, the [1] would be selecting the first node in the set in document order, so it seems that the [1] is redundant.
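To make the comparison concrete, here is a minimal sketch of an XSLT 1.0 stylesheet that runs the grouping test both ways. The key name kCategory, the items/item element names, and the category attribute are all hypothetical, chosen only for illustration. Under the XSLT 1.0 definition quoted above, both loops should select the same item elements, namely the first one in each category.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input (hypothetical):
       <items>
         <item category="a"/>
         <item category="b"/>
         <item category="a"/>
       </items> -->

  <!-- Hypothetical key: index item elements by their category attribute -->
  <xsl:key name="kCategory" match="item" use="@category"/>

  <xsl:template match="/items">
    <!-- Explicit [1]: compare against the first node the key returns -->
    <xsl:for-each select="item[generate-id() =
                               generate-id(key('kCategory', @category)[1])]">
      <xsl:value-of select="concat('explicit: ', @category, '&#10;')"/>
    </xsl:for-each>

    <!-- No [1]: XSLT 1.0 generate-id() itself takes the node that is
         first in document order from the node-set it is given -->
    <xsl:for-each select="item[generate-id() =
                               generate-id(key('kCategory', @category))]">
      <xsl:value-of select="concat('implicit: ', @category, '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, each loop should report categories a and b once each when run by an XSLT 1.0 processor.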

This [1] is used so widely that I am hesitant to omit it, but it seems extraneous. Can anyone clear this up for me?

JLRishe
  • It would be necessary in XSLT 2.0, where `generate-id` accepts at most one node as its argument and raises an error if it is passed several. Of course, if you were using XSLT 2.0, you would be more likely to use `xsl:for-each-group` than Muenchian grouping. – Tim C Dec 19 '14 at 09:40

2 Answers


I would recommend always using an explicit "[1]" rather than exploiting the fact that operations in XPath 1.0 do it implicitly, for two reasons: it improves the readability of your code, and it makes it compatible with XPath 2.0. There may be processors where it gives a performance benefit, but I wouldn't speculate on that until it's been proved by measurement.
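For comparison, here is a rough sketch of how the same kind of grouping might look in XSLT 2.0, where, as the comment on the question notes, xsl:for-each-group is the more likely choice than a Muenchian key. The items/item element names and the category attribute are hypothetical, chosen only for illustration.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input (hypothetical):
       <items>
         <item category="a"/>
         <item category="b"/>
         <item category="a"/>
       </items> -->

  <xsl:template match="/items">
    <!-- One iteration per distinct category value; no key and no
         generate-id() comparison are needed -->
    <xsl:for-each-group select="item" group-by="@category">
      <xsl:value-of select="concat(current-grouping-key(), ': ',
                                   count(current-group()), '&#10;')"/>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

If Muenchian grouping is kept in a 2.0 stylesheet instead, the explicit [1] is what keeps the generate-id() argument down to a single node.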

Michael Kay
  • Thanks! (+1) I'm not sure I agree that it improves readability, but interoperability is always a good objective to have. – JLRishe Dec 22 '14 at 00:45
  • @JLRishe to someone who is less intimately familiar with the XPath and XSLT specs I would argue the `[1]` does improve readability. It makes it crystal clear that you want to check whether this node is the first one in its group, and makes it obvious what to tweak if you wanted to check for the second/third/last in the group instead of the first. – Ian Roberts Mar 20 '15 at 10:44

Semantically, the [1] is not necessary, but depending on the (lack of) optimization in the XSLT processor, it might be more efficient to have it. Whether key('key-name', foo)[1] computes only one node, or first computes a node-set with all nodes selected by the key and then takes the first, depends on the internals of each XSLT processor. Equally, it is up to the processor to recognize generate-id(key('key-name', foo)) as an expression where only the first node in the node-set computed by the key is needed.

Martin Honnen