0

I have the following XML data, which I have zero control over. Note that it's basically a collection of property groups. I need to select the value of one property where the value of another property is 'true'. The problem is, there's nothing to group on, and I cannot figure out how to associate things correctly.

Here's the XML Data, and the query I came up with so far:

DECLARE @xml xml = '
<Container>
  <Collection>
    <ItemName>SomeItem</ItemName>
    <IsDeletable>true</IsDeletable>
    <IsPersisted>false</IsPersisted>
  </Collection>
  <Collection>
    <ItemName>AnotherItem</ItemName>
    <IsDeletable>true</IsDeletable>
    <IsPersisted>true</IsPersisted>
    <ExistsInDB>true</ExistsInDB>
  </Collection>
  <Collection>
    <ItemName>ItemFoo</ItemName>
    <IsDeletable>true</IsDeletable>
    <IsPersisted>true</IsPersisted>
    <ExistsInDB>true</ExistsInDB>
  </Collection>
  <Collection>
    <ItemName>BarBazItem</ItemName>
    <IsDeletable>true</IsDeletable>
    <IsPersisted>true</IsPersisted>
    <ExistsInDB>false</ExistsInDB>
  </Collection>
</Container>
'

;WITH XmlStuff AS (
    SELECT CAST(xmlShredded.colXmlItem.query('local-name(.)') AS nvarchar(4000)) as XmlNodeName,
        xmlShredded.colXmlItem.value('.', 'nvarchar(4000)') AS XmlNodeValue
    FROM @xml.nodes('/*/Collection/child::node()') as xmlShredded(colXmlItem) 
)
SELECT *
FROM XmlStuff 

Now, what I need to do, is get the "ItemName" value for each grouping where "ExistsInDB" is 'true'. Note that the "ExistsInDB" property doesn't exist in the first property collection (i.e. it should be considered NULL/false).

So in this case, I need to query this xml data, and get back the following set:

AnotherItem ItemFoo

I should NOT get "SomeItem" or "BarBazItem".

I've been beating my head against the desk trying to figure out how to formulate the query for "Get all ItemName values where the associated ExistsInDB value is both present, and true".

Is this even possible?

pmbAustin
  • 3,890
  • 1
  • 22
  • 33

3 Answers3

2

Hi perhaps try a sibling match

/Container/Collection/ItemName[../ExistsInDB='true']

That gets ItemName elements whose parents contain an ExistsInDb child equal to "true".

DECLARE @xml xml = '
<Container>
  <Collection>
    <ItemName>SomeItem</ItemName>
    <IsDeletable>true</IsDeletable>
    <IsPersisted>false</IsPersisted>
  </Collection>
  <Collection>
    <ItemName>AnotherItem</ItemName>
    <IsDeletable>true</IsDeletable>
    <IsPersisted>true</IsPersisted>
    <ExistsInDB>true</ExistsInDB>
  </Collection>
  <Collection>
    <ItemName>ItemFoo</ItemName>
    <IsDeletable>true</IsDeletable>
    <IsPersisted>true</IsPersisted>
    <ExistsInDB>true</ExistsInDB>
  </Collection>
  <Collection>
    <ItemName>BarBazItem</ItemName>
    <IsDeletable>true</IsDeletable>
    <IsPersisted>true</IsPersisted>
    <ExistsInDB>false</ExistsInDB>
  </Collection>
</Container>
';

SELECT node.value('.', 'nvarchar(100)')
FROM @xml.nodes('/Container/Collection/ItemName[../ExistsInDB="true"]') AS x(node) 
Salman A
  • 262,204
  • 82
  • 430
  • 521
jspcal
  • 50,847
  • 7
  • 72
  • 76
2

Your expected output is a bit irritating, as you mention BarBazItem twice

So in this case, I need to query this xml data, and get back the following set:

AnotherItem ItemFoo BarBazItem

I should NOT get "SomeItem" or "BarBazItem".

Both existing answers are using backward-navigation (ther parent-axis via ../). This is very bad performing (find details here) and not the best approach for you. Try this:

SELECT nd.value('(ItemName/text())[1]','nvarchar(100)') AS ItemName
FROM @xml.nodes('/Container/Collection[ExistsInDB/text()="true"]') A(nd);

The idea is, to place the predicate one level up behind <Collection>. You can read this as

  • Dive into <Container>
  • Dive deeper into <Collection>, but we want only nodes, where there is a sub-node <ExistsInDB> with a text() of "true".
  • The list returned by A will include one row per <Collection> fullfilling the predicate (=filter)
  • the .value()-method will now take at the first level below <Collection> the <ItemName> and return the text()-node.

My query returns AnotherItem and ItemFoo

SomeItem is not returned as there is no <ExistsInDB> and BarBazItem is not returned as there is "false" in <ExistsInDB>.

UPDATE

Lukasz Szoda added a fiddle with some enhancements. I think the easiest to read the whole everything was this:

SELECT nd.value('(ItemName/text())[1]','nvarchar(100)') AS ItemName
      ,nd.value('(IsDeletable/text())[1]','bit') AS IsDeletable
      ,nd.value('(IsPersisted/text())[1]','bit') AS IsPersisted
FROM @xml.nodes('/Container/Collection[ExistsInDB/text()="true"]') A(nd);
Community
  • 1
  • 1
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • I extended it a bit to match OP's requirements: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=7abfb3c1254f79078efffdff3ba4b12d – Lukasz Szozda May 04 '19 at 17:35
  • @LukaszSzozda Thx for adding to this! Your approach will read the sub nodes to *key-value-pairs*. I do not see the need for this. The OP seems to know the element's names in advance. Furthermore, the OP does not state, that the other values are of interest too. However, `A.nd` carries the full XML-node `` with all internal values. it seems easier to read them directly. – Shnugo May 05 '19 at 12:16
  • @Shungo I was thinking about the usage of `child::node()` by OP. That is why I added my demo instead of explicitly naming all columns. Nevertheless, your answer should be accepted one :) – Lukasz Szozda May 05 '19 at 12:17
  • Apologies, the "BarBazItem" was in there twice from a previous edit. When I updated to make something more clear, I forgot to remove the other instance. I have updated the main question. This extra info is very helpful to me personally in understanding, and I'm very thankful for it, even if it's not directly applicable to the exact problem I was having. It's very much appreciated, and probably saved me from having to post another question soon. – pmbAustin May 06 '19 at 15:44
  • One additional question while I'm here: What if the "/Container" node has different names in different situations, and I just want to say "whatever here"... everything blow that is fixed, so I'd like to say something like "/*/Collection", can I do that easily? – pmbAustin May 06 '19 at 18:19
  • 1
    @pmbAustin You can use `/*/Collection` or `//Collection`. The first means, an element Collection in the second level, the second searches for an element Collection **anywhere** in the xml. – Shnugo May 06 '19 at 18:26
  • Thank you! Very helpful. – pmbAustin May 06 '19 at 18:39
1

You could use:

;WITH XmlStuff AS (
    SELECT CAST(xmlShredded.colXmlItem.query('local-name(.)') AS nvarchar(4000)) as XmlNodeName,
        xmlShredded.colXmlItem.value('.', 'nvarchar(4000)') AS XmlNodeValue,
        xmlShredded.colXmlItem.value('(../ExistsInDB)[1]', 'nvarchar(4000)') AS ExistsInDB
    FROM @xml.nodes('/*/Collection/child::node()') as xmlShredded(colXmlItem) 
)
SELECT *
FROM XmlStuff 
WHERE ExistsInDB = 'true';

db<>fiddle demo

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • 1
    Almost... need to add "XmlNodeName = 'ItemName' AND " to the beginning of that WHERE clause, but yes, this is it! Thank you. I'm going to have to study that a bit to be sure I really understand what's going on though, but it does seem to work just fine. – pmbAustin May 03 '19 at 21:19
  • @pmbAustin Sure when you use `../` you move one level up – Lukasz Szozda May 03 '19 at 21:20
  • 1
    This is a very bad performing approach. One the one hand, you use backward navigation `../`, on the other hand you will have to return the full set of all nodes below ``, just to throw most of them away in the end. You might want to look into my answer... – Shnugo May 04 '19 at 11:26
  • @Shnugo Thanks for pointing performance perspective. I agree with your answer completely :) – Lukasz Szozda May 04 '19 at 17:25