And one more approach, which should be a little faster...
DECLARE @xml2 as XML ;
SET @xml2 = '<Student>
<Marks>
<Subject>Science</Subject>
<Score>89</Score>
<Subject>Maths</Subject>
<Score>90</Score>
</Marks>
</Student>';
WITH tally(Nmbr) AS(SELECT TOP(@xml2.value('count(/Student/Marks/Subject)','int')) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values)
SELECT tally.Nmbr
,@xml2.value('(/Student/Marks/Subject[sql:column("tally.Nmbr")]/text())[1]','nvarchar(max)') AS [Subject]
,@xml2.value('(/Student/Marks/Score[sql:column("tally.Nmbr")]/text())[1]','int') AS Score
FROM tally;
The idea in short:
- We create a tally on the fly by using a computed TOP clause together with ROW_NUMBER() against any table with a larger row count (I use master..spt_values here; a physical numbers table would be best).
- Now we can grab each value by its position, using sql:column() to get the tally's current value into the XQuery.
- This means: we pick the first Subject with the first Score, then the second Subject with the second Score, and so on.
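To see the tally part in isolation, here is a minimal sketch. The variable @n is only an assumption for illustration; in the query above, the count comes from the XML itself. It should simply return the numbers 1 to 5:

```sql
-- @n is a hypothetical stand-in for the computed element count
DECLARE @n INT = 5;

WITH tally(Nmbr) AS
(
    -- ROW_NUMBER() over an arbitrary order, cut off by the computed TOP
    SELECT TOP(@n) ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
    FROM master..spt_values
)
SELECT Nmbr
FROM tally
ORDER BY Nmbr;
```

The source table only has to deliver enough rows; its content is never read.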
Hint: This format is very error-prone. If it is under your control, you really should change it. You are relying completely on the elements' order and position. A missing element, a mix-up, or other elements in between could break it entirely.
I'd use something like
<Student>
<Marks Subject="Science" Score="80"/>
<Marks Subject="Maths" Score="90"/>
</Student>
or
<Student>
<Marks>
<Subject name="Science">80</Subject>
<Subject name="Maths">90</Subject>
</Marks>
</Student>
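With either of these formats, each pair lives in one node, so no positional matching is needed. A small sketch of how they could be read (the variable names @xmlAttr and @xmlName are made up for this example):

```sql
-- First suggestion: both values as attributes of one <Marks> element
DECLARE @xmlAttr XML = N'<Student>
<Marks Subject="Science" Score="80"/>
<Marks Subject="Maths" Score="90"/>
</Student>';

SELECT m.value('@Subject','nvarchar(100)') AS [Subject]
      ,m.value('@Score','int') AS Score
FROM @xmlAttr.nodes('/Student/Marks') AS A(m);

-- Second suggestion: the name as attribute, the score as element content
DECLARE @xmlName XML = N'<Student>
<Marks>
<Subject name="Science">80</Subject>
<Subject name="Maths">90</Subject>
</Marks>
</Student>';

SELECT s.value('@name','nvarchar(100)') AS [Subject]
      ,s.value('text()[1]','int') AS Score
FROM @xmlName.nodes('/Student/Marks/Subject') AS A(s);
```

Both queries should return Science/80 and Maths/90, and a missing or reordered element can no longer shift a score to the wrong subject.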
UPDATE: Benchmark
The following compares XMLs with 10 / 100 / 1000 pairs in the odd/even structure:
--Make sure to use a database, where this table returns at least 1000 rows (or use any other table)
SELECT COUNT(*) FROM master..spt_values
--Filling a table with dummy data
DECLARE @tbl TABLE(ID INT IDENTITY,[Subject] VARCHAR(30),Score VARCHAR(30));
INSERT INTO @tbl
SELECT TOP 1000 LEFT(CAST(NEWID() AS varchar(50)),30),CAST(CAST(NEWID() AS binary(4)) AS INT)
FROM master..spt_values;
SELECT * FROM @tbl;
--using three XMLs with different count of pairs
DECLARE @xml10 XML;
DECLARE @xml100 XML;
DECLARE @xml1000 XML;
SET @xml10=(
SELECT TOP 10
(SELECT [Subject] FOR XML PATH(''),TYPE) AS [*]
,(SELECT [Score] FOR XML PATH(''),TYPE) AS [*]
FROM @tbl t
ORDER BY t.ID
FOR XML PATH(''),ROOT('root')
);
SET @xml100=(
SELECT TOP 100
(SELECT [Subject] FOR XML PATH(''),TYPE) AS [*]
,(SELECT [Score] FOR XML PATH(''),TYPE) AS [*]
FROM @tbl t
ORDER BY t.ID
FOR XML PATH(''),ROOT('root')
);
SET @xml1000=(
SELECT TOP 1000
(SELECT [Subject] FOR XML PATH(''),TYPE) AS [*]
,(SELECT [Score] FOR XML PATH(''),TYPE) AS [*]
FROM @tbl t
ORDER BY t.ID
FOR XML PATH(''),ROOT('root')
);
--test for 10
DECLARE @d DATETIME2=SYSUTCDATETIME();
WITH tally(Nmbr) AS(SELECT TOP(@xml10.value('count(/root/Subject)','int')) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values)
SELECT tally.Nmbr
,@xml10.value('(/root/Subject[sql:column("tally.Nmbr")]/text())[1]','nvarchar(max)') AS [Subject]
,@xml10.value('(/root/Score[sql:column("tally.Nmbr")]/text())[1]','nvarchar(max)') AS Score
INTO #t10a
FROM tally;
SELECT 'xml10 a',DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME());
SET @d=SYSUTCDATETIME();
SELECT c.value('(./text())[1]', 'nvarchar(max)') AS [Subject]
, c.value('(/root/*[sql:column("w.r")]/text())[1]', 'nvarchar(max)') AS [Score]
INTO #t10b
FROM @xml10.nodes('/root/*[position() mod 2 = 1]') AS t(c)
CROSS APPLY (SELECT t.c.value('let $n := . return count(/root/*[. << $n[1]]) + 2','INT') AS r
) AS w;
SELECT 'xml10 b',DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME());
--test for 100
SET @d =SYSUTCDATETIME();
WITH tally(Nmbr) AS(SELECT TOP(@xml100.value('count(/root/Subject)','int')) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values)
SELECT tally.Nmbr
,@xml100.value('(/root/Subject[sql:column("tally.Nmbr")]/text())[1]','nvarchar(max)') AS [Subject]
,@xml100.value('(/root/Score[sql:column("tally.Nmbr")]/text())[1]','nvarchar(max)') AS Score
INTO #t100a
FROM tally;
SELECT 'xml100 a',DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME());
SET @d=SYSUTCDATETIME();
SELECT c.value('(./text())[1]', 'nvarchar(max)') AS [Subject]
, c.value('(/root/*[sql:column("w.r")]/text())[1]', 'nvarchar(max)') AS [Score]
INTO #t100b
FROM @xml100.nodes('/root/*[position() mod 2 = 1]') AS t(c)
CROSS APPLY (SELECT t.c.value('let $n := . return count(/root/*[. << $n[1]]) + 2','INT') AS r
) AS w;
SELECT 'xml100 b',DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME());
--test for 1000
SET @d =SYSUTCDATETIME();
WITH tally(Nmbr) AS(SELECT TOP(@xml1000.value('count(/root/Subject)','int')) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values)
SELECT tally.Nmbr
,@xml1000.value('(/root/Subject[sql:column("tally.Nmbr")]/text())[1]','nvarchar(max)') AS [Subject]
,@xml1000.value('(/root/Score[sql:column("tally.Nmbr")]/text())[1]','nvarchar(max)') AS Score
INTO #t1000a
FROM tally;
SELECT 'xml1000 a',DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME());
SET @d=SYSUTCDATETIME();
SELECT c.value('(./text())[1]', 'nvarchar(max)') AS [Subject]
, c.value('(/root/*[sql:column("w.r")]/text())[1]', 'nvarchar(max)') AS [Score]
INTO #t1000b
FROM @xml1000.nodes('/root/*[position() mod 2 = 1]') AS t(c)
CROSS APPLY (SELECT t.c.value('let $n := . return count(/root/*[. << $n[1]]) + 2','INT') AS r
) AS w;
SELECT 'xml1000 b',DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME());
Method a is my approach using a tally; method b is Yitzhak's approach using XQuery.
The difference between these two approaches is rather small:
10 Elements a=7ms / b=6ms
100 Elements a=83ms / b=79ms
1000 Elements a=8942ms / b=8721ms
Some general differences:
- The tally approach would work with triples or more elements per series as well.
- The tally approach would still work with other elements in between.
- The XQuery approach would deal better with unexpectedly missing elements, but neither approach would return correct results if just one expected element were missing.
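To illustrate the first point, here is a sketch of the tally approach with triples. The third element, Grade, and the variable @xmlTriple are assumptions added only for this example:

```sql
-- Hypothetical sample: each series consists of Subject, Score and Grade
DECLARE @xmlTriple XML = N'<Student>
<Marks>
<Subject>Science</Subject>
<Score>89</Score>
<Grade>B</Grade>
<Subject>Maths</Subject>
<Score>90</Score>
<Grade>A</Grade>
</Marks>
</Student>';

WITH tally(Nmbr) AS
(
    SELECT TOP(@xmlTriple.value('count(/Student/Marks/Subject)','int'))
           ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
    FROM master..spt_values
)
SELECT tally.Nmbr
      ,@xmlTriple.value('(/Student/Marks/Subject[sql:column("tally.Nmbr")]/text())[1]','nvarchar(max)') AS [Subject]
      ,@xmlTriple.value('(/Student/Marks/Score[sql:column("tally.Nmbr")]/text())[1]','int') AS Score
      -- one more positional lookup per additional element in the series
      ,@xmlTriple.value('(/Student/Marks/Grade[sql:column("tally.Nmbr")]/text())[1]','nvarchar(10)') AS Grade
FROM tally;
```

Because each element name is addressed by its own position, adding a further element per series only needs one more column in the SELECT.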