5

I am trying to get indexes of xml nodes while selecting values using sql:

This is the code :

declare @myxml xml ='
<Departments>
    <Department>
        <Employee>
            A
        </Employee>
        <Employee>
            B
        </Employee>
    </Department>

    <Department>
        <Employee>
            C
        </Employee>
        <Employee>
            D
        </Employee>
    </Department>
</Departments>'

Select Emp = m.value('.','varchar(30)') 

from @myxml.nodes('Departments/Department/Employee') X(m)

Output of above query :

Emp 
A
B
C
D

Expected Output :

Emp  Department_Index

A         1
B         1
C         2
D         2

i.e. I want index of department corresponding to every employee under department. here Employee A and B belong to 1st department of departments and employee C and D belong to 2nd department of departments.

So I want this to join complex XML childs to that with parent where no unique key is present.

Shnugo
  • 66,100
  • 9
  • 53
  • 114

1 Answers1

5

This is a solution, where I'd claim it will work in all scenarios - although it is not guaranteed that the number bound to a <Department> node via ROW_NUMBER() will reflect its real position in each and any situation (see remarks and links below):

declare @myxml xml ='
<Departments>
    <Department>
        <Employee>
            A
        </Employee>
        <Employee>
            B
        </Employee>
    </Department>

    <Department>
        <Employee>
            C
        </Employee>
        <Employee>
            D
        </Employee>
    </Department>
</Departments>';

--The query will use a CTE to bind numbers to the first level and pass the inner node in whole. The final SELECT will use the passed over <Department> node and pick its employees:

WITH NumberedDepartment AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS DepInx
          ,d.query(N'.') AS DepartmentNode
    FROM @myxml.nodes(N'/Departments/Department') AS A(d)
)
SELECT DepInx 
      ,e.value(N'text()[1]','nvarchar(max)') AS Employee
FROM NumberedDepartment
CROSS APPLY DepartmentNode.nodes(N'Department/Employee') AS B(e);

If you want to read about the guarantee of sort order you might read this thread. Especially readworthy: The chat below John Cappellettis answer. Over there I provide an approach using XQuery, and one more approach using a tally / numbers table to pick the elements at their position. But this is rather complicated and slow.

UPDATE: An approach with a guaranteed sort order

This approach will create a tally table on-the-fly. If you have a numbers table this was even better...

The TOP clause will limit this tally to the actual count of <Department> nodes. Be sure to use a source table (I use master..spt_values) which has at least as many rows as you might need.

The first apply will use .query() together with sql:column() to get the right department node for each number. The second apply will read the related employees.

WITH Tally(Nmbr) AS
(
    SELECT TOP (SELECT @myxml.value(N'count(/Departments/Department)','int'))
           ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) 
    FROM master..spt_values --just a pre-filled table with many rows
)
SELECT Nmbr
      ,e.value(N'text()[1]','nvarchar(max)') AS Employee
FROM Tally
OUTER APPLY(SELECT @myxml.query(N'/Departments/Department[sql:column("Nmbr")]')) AS A(ds)
OUTER APPLY ds.nodes(N'Department/Employee') AS B(e);
Community
  • 1
  • 1
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • In what scenarios this will not work ? Because in my case , the XML consists of about 200k lines of code. So as you mentioned that other approaches are slow , this might give performance issue . So if I can be sure that this will not fail for our scenarios , we can use this only. By the way .... Thanks for an awesome answer . –  Jul 05 '18 at 14:47
  • @JFI I'd claim it will work in all scenarios. But `ROW_NUMBER() OVER(ORDER BY(SELECT NULL))` could, might, should not, ought to - uhm - however... The sort order is not guaranteed by the official specs, so it *might* be, that the first node gets the "2" while the second node gets the "1". In the other thread I've presented approaches with a guaranteed order, but - as written - this is rather complicated... – Shnugo Jul 05 '18 at 14:52
  • @JFI see my update and let me know how fast this is in your scenario. – Shnugo Jul 05 '18 at 15:06
  • I am using same script in our code , but somehow it is giving wrong results :( I am opening another question , Please refer to new question : https://stackoverflow.com/questions/51196366/xml-to-sql-issue-how-to-get-position-of-an-element –  Jul 05 '18 at 16:45