1

Given this XML:

DECLARE @input XML = '<Data>
  <Tab Name="UserData">
    <Row>
      <BeneficialOwnerName>Fred Flintstone</BeneficialOwnerName>
      <TaxID>111-11-1111</TaxID>
    </Row>
    <Row>
      <BeneficialOwnerName>Barney Rubble</BeneficialOwnerName>
      <TaxID>222-22-2222</TaxID>
    </Row>
  </Tab>
  <Tab Name="OtherData">
    <Row>
      <BeneficialOwnerName>Bugs Bunny</BeneficialOwnerName>
      <TaxID>333-33-3333</TaxID>
    </Row>
    <Row>
      <BeneficialOwnerName>Road Runner</BeneficialOwnerName>
      <TaxID>444-44-4444</TaxID>
    </Row>
  </Tab>  
</Data>'

I'd like to return this:

Name     BeneficialOwnerName    TaxID
UserData  Fred Flintstone       111-11-1111
UserData  Barney Rubble         222-22-2222
OtherData Bugs Bunny            333-33-3333
OtherData Road Runner           444-44-4444

My SQL looks like this:

SELECT
  Name = XCol.value('@Name','varchar(25)'),   
  BeneficialOwnerName = XCol.value('BeneficialOwnerName[1]','varchar(25)'),
  TaxID = XCol.value('TaxID[1]','varchar(25)')
FROM 
  @input.nodes('/Data/Tab/Row') AS XTbl(XCol)

But depending on how I tweak the FROM clause I get either two rows of Tab names or 4 rows of data with the tab name Null

What do I need to do to get the data shown?

Thanks

Carl

CarlGanz
  • 189
  • 11
  • 1
    This is a good question: *copy-n_paste* test code, expected output, own effort... +1 from my side – Shnugo Jun 29 '17 at 07:52

2 Answers2

2

From the performance viewpoint, accessing parent element tends to be more expensive than selecting each level separately. The code below is slightly faster for your example, but will make a noticeable difference on a bigger XML:

SELECT
  Name = t.c.value('@Name','varchar(25)'),   
  BeneficialOwnerName = r.c.value('BeneficialOwnerName[1]','varchar(25)'),
  TaxID = r.c.value('TaxID[1]','varchar(25)')
FROM @input.nodes('/Data/Tab') t(c)
    cross apply t.c.nodes('./Row') r(c);
Roger Wolf
  • 7,307
  • 2
  • 24
  • 33
  • 1
    Good answer, +1 from my side, just one tiny hint: If you want the best performance, it is better to use `(TaxID/text())[1]` or - even slightly faster - `TaxID[1]/text()[1]`. You might [read this answer](https://stackoverflow.com/a/43242238/5089204) why it is faster to use `text()`. – Shnugo Jun 29 '17 at 07:50
  • 1
    @Shnugo, technically yes, and there is a difference between the two. However, even with the plans being different, the performance difference is negligible on this sample. One would need either a much bigger or a much more convoluted XML to notice this. – Roger Wolf Jun 29 '17 at 08:45
1

Everything was fine except for Name column. Try this:

SELECT
  Name = XCol.value('../@Name','varchar(25)'),   
  BeneficialOwnerName = XCol.value('BeneficialOwnerName[1]','varchar(25)'),
  TaxID = XCol.value('TaxID[1]','varchar(25)')
FROM 
  @input.nodes('/Data/Tab/Row') AS XTbl(XCol)
Dean Savović
  • 739
  • 3
  • 7