0

I have to shred an XML document every two weeks. It is compliant with the NEMSIS 2 standard and I have a compliant database built to hold the info. I've successfully built a query to import all of the one-to-one data, but I'm having trouble shredding the one-to-many data.

Using this demo data:

declare @x xml  
set @x =  
'<EMSDataSet>  
    <Header>  
    <D01_01>abc</D01_01>  
        <Record>  
            <E01>  
                <E01_01>12345</E01_01>  
                <E01_02>678</E01_02>  
            </E01>  
            <E02>  
                <E02_01>123</E02_01>  
                <E02_09>295</E02_09>  
                <E02_09>296</E02_09>  
            </E02>  
        </Record>
        <Record>
            <E01>
                <E01_01>67890</E01_01>
                <E01_02>678</E01_02>
            </E01>
            <E02>
                <E02_01>123</E02_01>
                <E02_09>295</E02_09>                
            </E02>
        </Record>  
    </Header>  
</EMSDataSet>'  

This query properly returns the correct value 01_01 and first value 02_09. However, since Value() can only return a singleton, I can only get the first value:

Query:  
select  
    t.c.value('(E01/E01_01)[1]','varchar(max)') e01_01  
   ,t.c.value('(E02/E02_09)[1]','varchar(max)') e02_09  
from @x.nodes('EMSDataSet/Header/Record') t(c)  

Returns:  
e01_01     e02_09  
------     ------  
 12345        295  

This query returns both value 02_09 values, in their own rows, but no corresponding value 01_01 values:

Query:  
select  
    t2.c.value('.','varchar(max)') e02_09  
from @x.nodes('EMSDataSet/Header/Record/E02/E02_09') t2(c)  

Returns:  
e02_09  
 -----  
   295  
   296  

What I need to accomplish is combine the results and pull both values of value 02_09 with value 01_01 repeated on each row. The value 01_01 will act as a foreign key in the database, referencing a specific, unique incident.

e01_01     e02_09  
------     ------  
 12345        295  
 12345        296  

It looks like it probably requires the use of the Nodes() method and possibly a Join, but I cannot figure out the syntax to get the single value 01_01 value repeated for each value 02_09 value.

These questions/answers have gotten me this far:

Retrieve-all-child-nodes-from-a-parent-node-xml-sql-server
Returning multiple rows from querying XML column in SQL Server 2008
T-SQL Shred Second Level XML Nodes into Multiple Rows
Shred XML For Each Row in SQL Table

EDIT: I attempted to following code after early success. However, instead of returning the correct 36 rows (35 records, with repeated for a double value of e02_09) from my test data set, it returned 1260 records (36 rows * 35 rows).

select *
from
    (
      select 
          t.c.value('(E01/E01_01)[1]','varchar(max)') e01_01
      from @x.nodes('EMSDataSet/Header/Record') t(c)
    ) a
    ,
    (
      select
          t2.c.value('.','varchar(max)') e02_09
      from @x.nodes('EMSDataSet/Header/Record/E02/E02_09') t2(c)
    ) b
dav
  • 190
  • 1
  • 8
  • 20
  • 1
    I usually use a programming language like c# to do the parsing. – jdweng Apr 17 '19 at 01:47
  • Unfortunately, that's not an option for me. I need to move the data straight from a file to SQL. Any thoughts are appreciated. – dav Apr 17 '19 at 19:58

1 Answers1

0

The best answer I've been able to achieve is based upon this blog post by Adam Machanic:

select 
    e01_01
    ,e02_09
from
    (select 
        dense_rank() over (order by b_node) unique_b_node
        ,c_node.value('./text()[1]','varchar(max)') e02_09

    from @x.nodes('/EMSDataSet/Header/Record') b(b_node)
    cross apply b.b_node.nodes('./E02/E02_09') c(c_node)
    ) a
join
    (select
        dense_rank() over (order by b2.b_node) unique_b_node
        ,c_node.value('text()[1]','varchar(max)') e01_01
    from @x.nodes('/EMSDataSet/Header/Record') b2(b_node)
    cross apply b2.b_node.nodes('./E01/E01_01') c2(c_node)
    ) b
on a.unique_b_node = b.unique_b_node
dav
  • 190
  • 1
  • 8
  • 20