XML Shredding Dynamic SQL

Question

Using the instructions from the 1st answer here, I'm trying to shred some XML in a SQL Server 2012 table that looks like this:

TableA

+------+--------+
|  ID  | ColXML |
+------+--------+
| 0001 | <xml1> |
| 0002 | <xml2> |
| ...  | ...    |
+------+--------+

xml1 looks like this:

<Attributes>
    <Attribute name="address1">301 Main St</Attribute>
    <Attribute name="city">Austin</Attribute>
</Attributes>

xml2 looks like this:

<Attributes>
    <Attribute name="address1">501 State St</Attribute>
    <Attribute name="address2">Suite 301</Attribute>
    <Attribute name="state">Texas</Attribute>
</Attributes>

There are varying numbers of attributes in any given row.

I'm trying to flatten it out into a relational table that looks like this:

+------+--------------+-----------+--------+-------+
|  ID  |   address1   | address2  |  city  | state |
+------+--------------+-----------+--------+-------+
| 0001 | 301 Main St  | NULL      | Austin | NULL  |
| 0002 | 501 State St | Suite 301 | NULL   | Texas |
+------+--------------+-----------+--------+-------+

Here's the code that I've tried that returns 0 rows in table #T:

select dense_rank() over(order by ID, I.N) as ID,
       F.N.value('(*:Name/text())[1]', 'varchar(max)') as Name,
       F.N.value('(*:Values/text())[1]', 'varchar(max)') as Value
into #T
from TableA as T
  cross apply T.Attributes.nodes('/ColXML') as I(N)
  cross apply I.N.nodes('ColXML') as F(N);

declare @SQL nvarchar(max)
declare @Col nvarchar(max);

select @Col = 
  (
  select distinct ','+quotename(Name)
  from #T
  for xml path(''), type
  ).value('substring(text()[1], 2)', 'nvarchar(max)');

set @SQL = 'select '+@Col+'
            from #T
            pivot (max(Value) for Name in ('+@Col+')) as P';

exec (@SQL);

Any help would be greatly appreciated.

based on your code you are looking for node ColXML which doesn't exist on your sample xml — Ed Bangga, Nov 06 '19 at 00:15
Thanks for the good question with sample data, own code and expectd output (+1 from my side). You might [refer to this very related question](https://stackoverflow.com/a/58706398/5089204). I'd suggest to read your values into an EAV-staging table and proceed from there. Otherwise you will have to shred your XLMs twice, once to create the dynamic statement and again to read it. This will be slow... — Shnugo, Nov 06 '19 at 07:36
Thank you for the link and for recommending an EAV-staging table, that's definitely helpful! — Dominic Armstrong, Nov 06 '19 at 15:28

Yitzhak Khabinsky · Answer 1 · 2019-11-06T01:16:35.270

Here is a DDL and XQuery to shred the XML in the table. No need for any dynamic SQL. MS SQL Server supports a subset of XQuery 1.0 standard. Microsoft needs to implement XQuery 3.1 to make its database even more powerful.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID VARCHAR(10) PRIMARY KEY, ColXML XML);
INSERT INTO @tbl (ID, ColXML)
VALUES
('0001', N'<Attributes>
    <Attribute name="address1">301 Main St</Attribute>
    <Attribute name="city">Austin</Attribute>
</Attributes>'),
('0002', N'<Attributes>
    <Attribute name="address1">501 State St</Attribute>
    <Attribute name="address2">Suite 301</Attribute>
    <Attribute name="state">Texas</Attribute>
</Attributes>');
-- DDL and sample data population, end

SELECT ID
    , col.value('(Attribute[@name="address1"]/text())[1]','VARCHAR(30)') AS [address1]
    , col.value('(Attribute[@name="address2"]/text())[1]','VARCHAR(30)') AS [address2]
    , col.value('(Attribute[@name="city"]/text())[1]','VARCHAR(30)') AS [city]
    , col.value('(Attribute[@name="state"]/text())[1]','VARCHAR(30)') AS [state]
FROM @tbl AS tbl
    CROSS APPLY tbl.ColXML.nodes('/Attributes') AS tab(col);

Output

+------+--------------+-----------+--------+-------+
|  ID  |   address1   | address2  |  city  | state |
+------+--------------+-----------+--------+-------+
| 0001 | 301 Main St  | NULL      | Austin | NULL  |
| 0002 | 501 State St | Suite 301 | NULL   | Texas |
+------+--------------+-----------+--------+-------+

Thanks a lot for that thorough reply! That code would work except that there are 100s of attributes/columns so it’s not practical to specify each one. — Dominic Armstrong, Nov 06 '19 at 00:28
Glad to hear that the proposed solution is working for you. Please don't forget to mark is as Answer. IMHO, it is still better than dynamic SQL. Easy to create and easy to maintain and support. — Yitzhak Khabinsky, Nov 06 '19 at 00:32

score 1 · Accepted Answer · answered Nov 06 '19 at 01:20

Here is a dynamic SQL solution based on my previous working proposal above. It emits the same output.

SQL

USE tempdb;
GO

DROP TABLE IF EXISTS #tbl;

-- DDL and sample data population, start
CREATE TABLE #tbl (ID VARCHAR(10) PRIMARY KEY, ColXML XML);
INSERT INTO #tbl (ID, ColXML)
VALUES
('0001', N'<Attributes>
    <Attribute name="address1">301 Main St</Attribute>
    <Attribute name="city">Austin</Attribute>
</Attributes>'),
('0002', N'<Attributes>
    <Attribute name="address1">501 State St</Attribute>
    <Attribute name="address2">Suite 301</Attribute>
    <Attribute name="state">Texas</Attribute>
</Attributes>');
-- DDL and sample data population, end

DECLARE @CrLf CHAR(2) = CHAR(13) + CHAR(10)
    , @sql VARCHAR(MAX) = 'SELECT ID ';

SET @sql += @CrLf;

;WITH rs AS
(
    SELECT DISTINCT col.value('@name','VARCHAR(30)') AS colName
    FROM #tbl AS tbl
        CROSS APPLY tbl.ColXML.nodes('/Attributes/Attribute') AS tab(col)
)
SELECT @sql += ', col.value(''(Attribute[@name="' + colName + '"]/text())[1]'',''VARCHAR(30)'') AS [' + colName + ']' + @CrLf 
FROM rs;

SET @sql += 'FROM #tbl AS tbl
    CROSS APPLY tbl.ColXML.nodes(''/Attributes'') AS tab(col);';

PRINT @sql;
EXEC(@sql);

Good answer, +1 from my side, although I'd avoid the *quirky update*... — Shnugo, Nov 06 '19 at 07:32
This one worked perfectly although the other answer is useful as well. Thank you so much for the help, I really appreciate it! — Dominic Armstrong, Nov 06 '19 at 15:29

XML Shredding Dynamic SQL

2 Answers2