I am attempting to extract some information from an XML column in SQL Server 2008. Each row in the table contains an XML document representing a report with controls for multiple people. Each person entry has a GroupStartControl, a PersonControl, a TextControl and a GroupEndControl. The GroupStartControl has a Guid attribute, the other 3 controls have a ParentId attribute that is the same as the Guid of the StartControl to which they belong. The issue is that the controls are matched to the start control using this ID in the front end of our system, but in the XML they are not grouped hierarchically so I'm struggling to extract the PersonId and Comment information for each individual person from it. I have tried using outer and cross apply and managed to get as far as the example below.
I've created a fiddle at http://sqlfiddle.com/#!3/25d9b/1 and the code to reproduce the issue I'm facing in SSMS is as follows:
DECLARE @SampleTable TABLE
(
ReportId INT,
PersonXml XML
)
INSERT @SampleTable (ReportId, PersonXml)
VALUES (1, '<Report>
<Controls>
<Control Guid="d77b7329-7974-4580-971c-d8c3fcdf5909" ControlType="GroupStartControl" />
<Control Guid="88ad2a34-89a6-46da-b62a-7b4ac4bbf68a" ParentId="d77b7329-7974-4580-971c-d8c3fcdf5909" ControlType="PersonControl">
<Fields>
<Field Label="PersonId" Value="12345" UniqueIdentifier="PersonId" />
<Field Label="Another field we do not care about" Value="" UniqueIdentifier="AnotherField" />
</Fields>
</Control>
<Control Guid="b54679d4-1cc1-4718-bec7-9f37ea00abe1" ParentId="d77b7329-7974-4580-971c-d8c3fcdf5909" ControlType="TextControl">
<Fields>
<Field Label="Comment" Value="Comments regarding person 12345" UniqueIdentifier="PersonComment" />
<Field Label="Some other field we do not need to examine" Value="" UniqueIdentifier="SomeOtherField" />
</Fields>
</Control>
<Control Guid="ec96950a-5bcc-455e-a0e4-40986f95ff37" ParentId="d77b7329-7974-4580-971c-d8c3fcdf5909" ControlType="GroupEndControl" />
<Control Guid="560a3d55-21c6-4f36-9430-f4a66d2affe3" ControlType="GroupStartControl" />
<Control Guid="acdf3332-d401-451c-aeaa-0547f752f0f0" ParentId="560a3d55-21c6-4f36-9430-f4a66d2affe3" ControlType="PersonControl">
<Fields>
<Field Label="PersonId" Value="7890" UniqueIdentifier="PersonId" />
<Field Label="Another field we do not care about" Value="" UniqueIdentifier="AnotherField" />
</Fields>
</Control>
<Control Guid="90b53f55-eb0b-498c-aa61-524e39db5634" ParentId="560a3d55-21c6-4f36-9430-f4a66d2affe3" ControlType="TextControl">
<Fields>
<Field Label="Comment" Value="Comments regarding person 7890" UniqueIdentifier="PersonComment" />
<Field Label="Some other field we do not need to examine" Value="" UniqueIdentifier="SomeOtherField" />
</Fields>
</Control>
<Control Guid="ec96950a-5bcc-455e-a0e4-40986f95ff37" ParentId="560a3d55-21c6-4f36-9430-f4a66d2affe3" ControlType="GroupEndControl" />
</Controls>
</Report>'
)
SELECT
m.c.value('@Value', 'varchar(max)') AS PersonId,
p.w.value('@Value', 'varchar(max)') AS Comment
FROM
@SampleTable XS
OUTER APPLY
XS.PersonXml.nodes('Report/Controls/Control[@ControlType="PersonControl"]/Fields/Field') as m(c)
OUTER APPLY
XS.PersonXml.nodes('Report/Controls/Control[@ControlType="TextControl"]/Fields/Field') as p(w)
WHERE
m.c.value('@UniqueIdentifier', 'varchar(max)') = 'PersonId'
AND m.c.value('@Value', 'varchar(max)') > 0
AND p.w.value('@UniqueIdentifier', 'varchar(max)') = 'PersonComment'
Running the above results in each comment being applied to each person, so I don't know which comment applies to which person in the real dataset (for the purposes of the example I've included the PersonId in the comment):
----------------------------------------------
- PersonId | Comment -
----------------------------------------------
- 12345 | Comments regarding person 12345 -
- 12345 | Comments regarding person 7890 -
- 7890 | Comments regarding person 12345 -
- 7890 | Comments regarding person 7890 -
----------------------------------------------
I'm trying to extract the information so that I get the person and their associated comment:
----------------------------------------------
- PersonId | Comment -
----------------------------------------------
- 12345 | Comments regarding person 12345 -
- 7890 | Comments regarding person 7890 -
----------------------------------------------
I'm not sure if this is even possible as I haven't done much work manipulating and extracting XML from within SQL.
I've read a few articles on MSDN, such as:
- https://technet.microsoft.com/en-us/library/ms175156(v=sql.100).aspx and
- https://msdn.microsoft.com/en-us/library/ms178030(v=sql.100).aspx
I have also looked at a few other questions on SO about extracting XML in SQL but most seem to pertain to hierarchical data or don't involve matching attributes to group sets of elements. e.g:
- SQL Server: Two-level GROUP BY with XML output
- Extracting Attributes from XML Fields in SQL Server 2008 Table
I'm not even sure if it is possible to query the XML in the way I want but I need to extract the information from within a SQL query, otherwise I would have written a C# application to extract what I need.