50

I have this XML file, from which I'd like to count the number of users referenced in it. But they can appear in more than one category, and I'd like these duplicates not to be taken into account.
In the example below, the query should return 3 and not 4. Is there a way in XPath to do so? Users are not sorted at all.

<list>
  <group name='QA'>
    <user name='name1'>name1@email</user>
    <user name='name2'>name2@email</user>
  </group>
  <group name='DEV'>
    <user name='name3'>name3@email</user>
    <user name='name2'>name2@email</user>
  </group>
</list>
Antoine

6 Answers

35

A pure XPath 1.0 one-liner:

Use:

count(/*/group/user[not(. = ../following-sibling::group/user)])
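To see what the predicate does, here is a rough Python sketch (not part of the original answer) that mimics it step by step. Python's standard `xml.etree.ElementTree` cannot evaluate this XPath expression itself, so the logic is spelled out by hand; the names `SAMPLE`, `string_value` and `count_users_not_repeated_later` are made up for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
<list>
  <group name='QA'>
    <user name='name1'>name1@email</user>
    <user name='name2'>name2@email</user>
  </group>
  <group name='DEV'>
    <user name='name3'>name3@email</user>
    <user name='name2'>name2@email</user>
  </group>
</list>"""


def string_value(element):
    # XPath compares elements by string value: all descendant text joined.
    return "".join(element.itertext())


def count_users_not_repeated_later(xml_text):
    root = ET.fromstring(xml_text)      # plays the role of /*
    groups = root.findall("group")      # /*/group
    kept = 0
    for index, group in enumerate(groups):
        # ../following-sibling::group/user -> the users of every later group
        later = {
            string_value(u)
            for g in groups[index + 1:]
            for u in g.findall("user")
        }
        for user in group.findall("user"):
            # [not(. = ...)] keeps a user only if no later group repeats its value
            if string_value(user) not in later:
                kept += 1
    return kept


print(count_users_not_repeated_later(SAMPLE))  # prints 3
```

Each repeated value survives exactly once, in the last group that contains it, which is why the users do not need to be sorted. One consequence of comparing only against later groups: a value repeated inside the same group would be counted twice.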

Dimitre Novatchev
  • this will only work if the usernames are in order, won't it? If this is the case (it's late and I've been drinking...) you could add [(not(. = ../following-sibling::group/user)) and (not(. = ../preceding-sibling::group/user))] – gingerbreadboy May 11 '10 at 22:21
  • @runrunraygun why test for not equal with everything? testing for not having equal following siblings takes just one value from a group of equal values -- exactly what is wanted. :) Thanks for your appreciation. – Dimitre Novatchev May 11 '10 at 23:09
  • 1
    @runrunraygun: what you propose: `[(not(. = ../following-sibling::group/user)) and (not(. = ../preceding-sibling::group/user))]` -- does not solve the problem at all. This will select only *unique values*, not *distinct values*. – Dimitre Novatchev May 12 '10 at 02:36
  • 1
    That seems to be working, now I have to understand what it does. Thanks, Dimitre. – Antoine May 12 '10 at 07:46
  • What's the /*/ for? I found this solution over here, which is a bit more descriptive regarding checks on attributes: http://are.ehibou.com/xslt-xpath-selecting-distinct-nodes/ – riezebosch Oct 10 '11 at 11:46
  • @riezebosch: `/*` means *the top element* of the XML document -- regardless of its name. This is a convenient abbreviation for (in this case) `/list` – Dimitre Novatchev Oct 10 '11 at 12:31
  • *Sir*, can this be shortened? http://stackoverflow.com/questions/18601214/xpath-getting-multiple-p-tags-in-one-nodevalue/18601758#18601758 – Arup Rakshit Sep 03 '13 at 21:27
  • Hi, here is the shortest and most efficient way: //user[not(. = following::user/.)] – Raghavendra Oct 29 '13 at 13:37
  • 1
    @raghavendra, using `//` is the reverse of an efficient solution! – Dimitre Novatchev Oct 29 '13 at 14:15
  • @Dimitre calculate how many times it will work. It is a good one for this problem – Raghavendra Oct 29 '13 at 14:18
  • @raghavendra, for this problem it's OK, but not in general -- following a direct path can be orders of magnitude faster than traversing the whole document – Dimitre Novatchev Oct 29 '13 at 15:54
  • @Dimitre :) draw a rough diagram for that and see which one will work better – Raghavendra Oct 30 '13 at 17:38
24

Using the functions namespace http://www.w3.org/2005/xpath-functions (XPath 2.0), you can use

distinct-values(//list/group/user)

UPDATE:

At the top of your XSL/XSLT file you should have a stylesheet element; map the URL above to the prefix fn as below:

<xsl:stylesheet version="2.0"
 xmlns:fn="http://www.w3.org/2005/xpath-functions"
 >

Then you can use

select="fn:distinct-values(//list/group/user)"

This assumes you are doing this in templates and not on some XPathDocument object, in which case you need to use a namespace manager class.

Links:

XSLT: Add namespace to root element

http://www.xqueryfunctions.com/xq/fn_distinct-values.html

http://msdn.microsoft.com/en-us/library/d6730bwt(VS.80).aspx

Otherwise try Dimitre Novatchev's answer.

Community
gingerbreadboy
  • 2
    The `distinct-values()` function is implemented only in XPath 2.0 and this means only in XSLT 2.0. The code in your answer doesn't work in XSLT 1.0. Please, correct. – Dimitre Novatchev May 12 '10 at 02:42
  • 2
    @Dimitre: OP didn't specify XSLT 1.0 so I think it's a bit rude to say "please, correct". This answer is correct already. – Matti Virkkunen Apr 25 '12 at 13:03
  • @MattiVirkkunen: The OP said in a comment that he was using "older version of .NET framework" -- this is synonymous (two years ago and at present) with XSLT 1.0. So, the answer isn't one that the OP can use to solve his problem. – Dimitre Novatchev Apr 25 '12 at 13:35
  • 4
    @Dimitre: He should've put it in the post, not some comment buried down the page nobody looks at. – Matti Virkkunen Apr 26 '12 at 10:43
  • @MattiVirkkunen: Agreed -- in an ideal world, yes. In the real world many askers have difficulty defining a problem, which just shows why they are having the problem, in the first place ... :) – Dimitre Novatchev Apr 26 '12 at 11:42
  • .NET never supported XPath 2.0 and, it seems, never will. But all this is not relevant; an XML question requires an XML answer. Neither .NET, Java, Python, you name it, nor their corresponding versions would qualify as a good answer/comment. – Visar Apr 08 '15 at 11:31
4

I have a better answer

count(//user[not(. = following::user/.)])
Raghavendra
  • Note to readers: be careful; while this answers the question, it is possible you want uniqueness within a group, for that case you need to use one of the `sibling` axes. – TWiStErRob Oct 06 '18 at 19:12
1

Not sure if you could do it in XPath, but it could be done easily using System.Linq:

using System.Linq;
using System.Xml.Linq;

string xml = "<list><group name='QA'><user name='name1'>name1@email</user><user name='name2'>name2@email</user></group><group name='DEV'><user name='name3'>name3@email</user><user name='name2'>name2@email</user></group></list>";
XElement xe = XElement.Parse(xml);
// Take every <user> under every <group>, then count the distinct text values.
int distinctCount = xe.Elements().Elements().Select(n => n.Value).Distinct().Count();

In this example, distinctCount will equal 3.

JSprang
0

You will need to use two functions like this.

count(distinct-values(//list/group/user))

First get the distinct values, then count them
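
As the comments on the earlier `distinct-values()` answer point out, that function needs an XPath 2.0 engine. Where none is available, the same two steps can be sketched in a host language. The Python below is only an illustration using the standard library; `SAMPLE` and `count_distinct_user_values` are hypothetical names.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
<list>
  <group name='QA'>
    <user name='name1'>name1@email</user>
    <user name='name2'>name2@email</user>
  </group>
  <group name='DEV'>
    <user name='name3'>name3@email</user>
    <user name='name2'>name2@email</user>
  </group>
</list>"""


def count_distinct_user_values(xml_text):
    root = ET.fromstring(xml_text)  # the <list> element
    # Step 1: collect the distinct text values of list/group/user.
    # The path is relative to the root element, hence "./group/user".
    distinct = {"".join(u.itertext()) for u in root.findall("./group/user")}
    # Step 2: count them.
    return len(distinct)


print(count_distinct_user_values(SAMPLE))  # prints 3
```

Unlike the sibling-axis predicates elsewhere on this page, a set ignores where the duplicates sit, so repeats inside a single group are also collapsed.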

Satish Sharma
Mitchel Sellers
  • 1
    Good, but have you tested this? Please, correct -- this is just a "minor" problem. :) – Dimitre Novatchev May 11 '10 at 17:01
  • 2
    Sellers: If "this works" then you have a non-compliant XPath 2.0 engine! Obviously, you didn't test your proposed XPath expression at all! *Hint1*: There is no `COUNT()` function in XPath. *Hint2*: XPath is case-sensitive. – Dimitre Novatchev May 12 '10 at 02:40
0
count(//user[not(./@name = preceding::user/@name)])

I think the best way is to draw your XML data on paper to see how you can solve it easily.
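
This expression differs from the others on the page in two ways: it compares the `name` attribute rather than the element text, and it keeps the first occurrence in document order instead of the last. A rough Python sketch of that reading follows, using the standard library; `SAMPLE` and `count_first_occurrences_by_name` are illustrative names, not part of the answer.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
<list>
  <group name='QA'>
    <user name='name1'>name1@email</user>
    <user name='name2'>name2@email</user>
  </group>
  <group name='DEV'>
    <user name='name3'>name3@email</user>
    <user name='name2'>name2@email</user>
  </group>
</list>"""


def count_first_occurrences_by_name(xml_text):
    root = ET.fromstring(xml_text)
    seen = set()   # @name values of the users already visited (preceding::user/@name)
    kept = 0
    for user in root.iter("user"):  # document order, like //user
        name = user.get("name")
        # A user without @name can never match, so the predicate keeps it.
        if name is None or name not in seen:
            kept += 1
        if name is not None:
            seen.add(name)
    return kept


print(count_first_occurrences_by_name(SAMPLE))  # prints 3
```

On the sample data the attribute and the text identify the same users, so the result is still 3; the two approaches diverge only when one name is listed with different e-mail addresses.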

xmen-5
  • Note to readers: be careful; while this answers the question, it is possible you want uniqueness within a group, for that case you need to use one of the `sibling` axes. – TWiStErRob Oct 06 '18 at 19:12