1

Using SQL Server 2012. I have a table containing a group identifier and small snippets of XML. Here is a sample:

CREATE TABLE temp (
    GroupId [int] NOT NULL,
    RawXml [xml] NULL
)

INSERT INTO temp VALUES (1, '<Item><Criteria Type="State" Values="CA"/><Criteria Type="State" Values="TX"/><Criteria Type="State" Values="FL"/></Item>')
INSERT INTO temp VALUES (1, '<Item><Criteria Type="Manager" Values="Tim"/></Item>')
INSERT INTO temp VALUES (2, '<Item><Criteria Type="Name" Values="Walters"/></Item>')
INSERT INTO temp VALUES (2, '<Item><Criteria Type="Manager" Values="Tim"/></Item>')

What I want to do is group the snippets of XML together by the GroupId to form a larger XML document so the end result is structured like this:

<Parent>
    <Group GroupId="1">
        <Item>
            <Criteria Type="State" Values="CA"/>
            <Criteria Type="State" Values="TX"/>
            <Criteria Type="State" Values="FL"/>
        </Item>
        <Item>
            <Criteria Type="Manager" Values="Tim"/>
        </Item>
    </Group>
    <Group GroupId="2">
        <Item>
            <Criteria Type="Name" Values="Walters"/>
        </Item>
        <Item>
            <Criteria Type="Manager" Values="Tim"/>
        </Item>
    </Group>
</Parent>

I fear the only solution is to concatenate the contents of the XML column first using some type of string concatenation method, and then using a FOR XML select statement, but I feel this will be too slow for my application. Am I missing something and there is an easy way to group xml data in this manner, or is string concatenation my only choice.

Community
  • 1
  • 1
Stigz
  • 53
  • 5
  • Why are splitting the xml put the entire xml into one single row. It will be easy for parsing – Pரதீப் Aug 24 '16 at 02:00
  • You're preaching to the choir. Truthfully, I'm bound by some legacy application code/data and these are the cards I was dealt. Doing a conversion of how the data is already being saved is an option, but I'm trying to avoid it at the moment. – Stigz Aug 24 '16 at 02:05
  • This is a good question! Especially the test code, copy'n'pasteable and the expected output are really helpful! +1 – Shnugo Aug 24 '16 at 07:47

1 Answers1

2

Of course there is a simpler way:

declare @t table (
    GroupId [int] NOT NULL,
    RawXml [xml] NULL
);

INSERT INTO @t
VALUES (1, '<Item><Criteria Type="State" Values="CA"/><Criteria Type="State" Values="TX"/><Criteria Type="State" Values="FL"/></Item>'),
(1, '<Item><Criteria Type="Manager" Values="Tim"/></Item>'),
(2, '<Item><Criteria Type="Name" Values="Walters"/></Item>'),
(2, '<Item><Criteria Type="Manager" Values="Tim"/></Item>');

select sq.GroupId as [@GroupId],
    (
        select t.RawXml as [node()]
        from @t t
        where t.GroupId = sq.GroupId
        for xml path(''), type
    )
from (select distinct g.GroupId from @t g) sq
for xml path('Group'), type, root('Parent');

(I have replaced your original table with the equivalent TV because, well, I don't like to clean up databases of such tables afterwards. It doesn't introduce any difference with regards to the question.)

Roger Wolf
  • 7,307
  • 2
  • 24
  • 33
  • Good answer, +1, just one tiny hint: I'm not sure, but I think it was a little bit better to use `AS [*]` or `AS [node()]` better than `.query()` to avoid the additional nesting level – Shnugo Aug 24 '16 at 07:57
  • @Shnugo, I forgot how to specify this particular output, exactly :) Indeed, according to execution plan, the difference is huge. Corrected, thanks for suggestion. – Roger Wolf Aug 24 '16 at 08:27
  • This is great. Thank you. Yes, `t.RawXml as [node()]` makes it much faster. – Stigz Aug 24 '16 at 14:48