3

I have implemented a tree structure query mechanism using SQL and an entity-attribute-value (EAV) database design. I wanted to see the performance of the same functionality with an XQuery-based approach, assuming it would be possible to use XQuery for the task. The simplified form of my tree (XML document) is as follows: [image: simplified tree diagram]

There are different types of nodes, but the only attribute that I'm using in the query is the archetype_node_id attribute of the node. The test query I've attempted to write aims to select the Evaluation node (on the right) with 2 element nodes. The query implementation requires two key capabilities from the language used: the ability to express structural definitions (with boolean operators), and the ability to define constraints on attributes of nodes (XML attributes in this case).

With XQuery, I have two problems: 1) I can't seem to declare references to all nodes of interest, that is, any node in the graph that I'm interested in; 2) I can't figure out how to return the matches, since the matches for the right-hand side of this tree would have one composition with an evaluation, which in turn has two elements.

Here is my first, naive attempt using FLWOR:

for    $composition in doc("composition-visit.xml")//element()
let    $evaluation := (
                        for $evalsneeded in $composition//element() 
                        let $elementat02 := 
                                            (for $el02 in $evalsneeded//element() 

                                             where $el02/@archetype_node_id = 'at0002'
                                                  and exists($evalsneeded//$el02)
                                             return  $el02
                                            ),
                            $elementat03 := 
                                        (for $el03 in $evalsneeded//element() 

                                         where $el03/@archetype_node_id = 'at0003'
                                                and exists($evalsneeded//$el03)
                                         return  $el03
                                        )
                        where $evalsneeded/@archetype_node_id = 'openEHR-EHR-EVALUATION.goal.v1'
                                 and 
                                    exists ($evalsneeded//$elementat02)
                                     and
                                     exists ($evalsneeded//$elementat03)

                        return $evalsneeded)
where $composition/@archetype_node_id = 'openEHR-EHR-COMPOSITION.encounter.v1'                
        and exists($composition//$evaluation)


return $evaluation/@archetype_node_id/string(.)

My problem is that I end up pushing evaluation and element nodes into subqueries, since filtering based on their attribute values and location does not work if I introduce them as global variables in the main FLWOR body.

I am even more clueless when it comes to returning the results, but I did not want to ask a separate question for that.

Ideally, when I enforce an AND constraint for an Evaluation having elements both with at0002 and at0003 codes, I should get the right hand side of the tree, and if I use an OR constraint for the same elements, I should get the whole tree.

Is this doable with XQuery? It works as a test of the existence of the structure I'm looking for in the tree, but I also want to access individual nodes.

Update: my second attempt is below. This one actually opens the door to what I've been trying to do, but I'm not sure if this is the right way of doing this in XQuery. Should I ask another question about improving this approach?

<result>
{
    for     $composition in doc("composition-visit.xml")//element() 

    where $composition/@archetype_node_id = 'openEHR-EHR-COMPOSITION.encounter.v1'                


    return <composition>
                <name>{$composition/name/value/string(.)}</name>
                <evaluation>{for $eval in $composition//element()
                             let $el1 := (for $el1_in_eval in $eval//element()
                                            where $el1_in_eval/@archetype_node_id = 'at0002'
                                            return $el1_in_eval ),
                                 $el2 := (for $el2_in_eval in $eval//element()
                                            where $el2_in_eval/@archetype_node_id = 'at0003'
                                            return $el2_in_eval )

                                     where $eval/@archetype_node_id = 'openEHR-EHR-EVALUATION.goal.v1'
                                            and
                                            (exists($el1)
                                            and
                                            exists($el2)
                                            )
                                     return <eval>
                                                   <name>{$eval/name/value/string(.)}</name>
                                                   <element1>{for $element1 in $eval//element()
                                                             where $element1/@archetype_node_id = 'at0002'

                                                             return $element1}</element1>
                                                     <element2>{for $element2 in $eval//element()
                                                     where $element2/@archetype_node_id = 'at0003'

                                                     return $element2}</element2>
                                           </eval>
                            }</evaluation>
            </composition>
}
</result>

Basically, I enforce the parent/child relations with let statements, and use return to get values of corresponding matches for let, which in turn may do the same down the tree.

GavinBrelstaff
mahonya
  • 1
    If the tree is a binary search tree, this has been implemented using XQuery. See this post: http://dnovatchev.wordpress.com/2012/01/09/the-binary-search-tree-data-structurehaving-fun-with-xpath-3-0/ – Dimitre Novatchev Dec 24 '12 at 01:48
  • 1
    This may be too late to be of use... but... it looks like you're implementing abstract trees on top of concrete trees that are already defined in terms of abstract trees. Just use element trees directly instead of implementing trees with XML elements... – barefootliam Apr 23 '13 at 18:27

2 Answers

1

It looks like your use case is querying "archetyped" openEHR data.

Feel free to have a look at the open source project at https://github.com/LiU-IMT/EEE, which uses XQuery for requests similar to your use case, but with the data modelled a bit differently.

It is used e.g. in the paper http://www.ep.liu.se/ecp/070/009/ecp1270009.pdf where you can find a query example that returns all record ids that had a histological exam result indicating neoplastic lesions between 2006-01-01 and 2006-05-01.

In AQL (Archetype Query Language) it is expressed as...

SELECT e/ehr_id/value as ehr_id
FROM Ehr e
CONTAINS VERSION v
CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.histologic_exam.v1]
CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.histological_exam_result.v1]
WHERE (EXISTS obs/data[at0001]/events[at0002]/data[at0003]/items[at0085]/items[at0033]/items[at0034] 
OR
EXISTS obs/data[at0001]/events[at0002]/data[at0003]/items[at0085]/items[at0033]/items[at0035])
AND c/context/start_time/value >= '2006-01-01T00:00:00,000+01:00'
AND c/context/start_time/value < '2006-05-01T00:00:00,000+01:00'

...which, when automatically parsed and translated to XQuery, looks like this:

declare namespace v1 = "http://schemas.openehr.org/v1";
declare default element namespace "http://schemas.openehr.org/v1";
declare namespace xsi = "http://www.w3.org/2001/XMLSchema-instance";
declare namespace eee = "http://www.imt.liu.se/mi/ehr/2010/EEE-v1.xsd";
declare namespace res = "http://www.imt.liu.se/mi/ehr/2010/xml-result-v1#";
<res:xml-results>
<res:head><res:variable name="ehr_id"/></res:head>
<res:results>
 {let $ehrRoot := //eee:EHR
  for $e in $ehrRoot
  for $v in $e/eee:versioned_objects/eee:versions
  for $c in $v//*[@xsi:type='v1:COMPOSITION' and @archetype_node_id="openEHR-EHR-COMPOSITION.histologic_exam.v1"]
  for $obs in $c//*[@xsi:type='v1:OBSERVATION' and @archetype_node_id= "openEHR-EHR-OBSERVATION.histological_exam_result.v1"]
  where
   (
    exists($obs/data[@archetype_node_id = 'at0001']/events[@archetype_node_id = 'at0002']/data[@archetype_node_id='at0003']/items[@archetype_node_id = 'at0085']/items[@archetype_node_id = 'at0033']/items[@archetype_node_id = 'at0034'])
   or
    exists($obs/data[@archetype_node_id = 'at0001']/events[@archetype_node_id = 'at0002']/data[@archetype_node_id = 'at0003']/items[@archetype_node_id = 'at0085']/items[@archetype_node_id = 'at0033']/items[@archetype_node_id = 'at0035'])
   )
   and
    $c/context/start_time/value >= '2006-01-01T00:00:00,000+01:00' 
   and 
    $c/context/start_time/value < '2006-05-01T00:00:00,000+01:00'
return
<res:result><res:binding name="ehr_id">{$e/eee:ehr_id/value}</res:binding></res:result>}
</res:results>
</res:xml-results>

This pattern might be worth trying in your use case too. More details about the solution and its context are available in the paper http://www.biomedcentral.com/1472-6947/13/57
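As a rough illustration of how that pattern could map onto the question, here is a minimal sketch. It binds each node of interest with its own `for` or `let` clause and a path predicate on `archetype_node_id`, then tests the structure in a single `where`. The inline `$doc` is a hypothetical stand-in for composition-visit.xml: the element names `composition`, `content` and `items`, and the `match` wrapper in the result, are invented for the example. Only the `archetype_node_id` values come from the question.

```xquery
(: Sketch only. $doc is a hypothetical stand-in for composition-visit.xml;
   the element names are invented, the archetype_node_id values are from
   the question. :)
let $doc :=
  <composition archetype_node_id="openEHR-EHR-COMPOSITION.encounter.v1">
    <content archetype_node_id="openEHR-EHR-EVALUATION.goal.v1">
      <items archetype_node_id="at0002"/>
    </content>
    <content archetype_node_id="openEHR-EHR-EVALUATION.goal.v1">
      <items archetype_node_id="at0002"/>
      <items archetype_node_id="at0003"/>
    </content>
  </composition>
(: One binding per node of interest, each constrained by a predicate. :)
for $c in $doc/descendant-or-self::*[@archetype_node_id = 'openEHR-EHR-COMPOSITION.encounter.v1']
for $eval in $c//*[@archetype_node_id = 'openEHR-EHR-EVALUATION.goal.v1']
let $el02 := $eval//*[@archetype_node_id = 'at0002']
let $el03 := $eval//*[@archetype_node_id = 'at0003']
(: AND constraint; replace "and" with "or" for the OR constraint. :)
where exists($el02) and exists($el03)
return
  <match>
    <evaluation>{string($eval/@archetype_node_id)}</evaluation>
    <element1>{$el02}</element1>
    <element2>{$el03}</element2>
  </match>
```

With this sample data, the AND version should keep only the second evaluation, while the OR version should keep both. Because `$c`, `$eval`, `$el02` and `$el03` are all in scope in the `return` clause, the individual nodes can be returned directly rather than being selected a second time.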

0

If the tree is a binary search tree, this has been implemented using XQuery. See this post:

http://dnovatchev.wordpress.com/2012/01/09/the-binary-search-tree-data-structurehaving-fun-with-xpath-3-0/

This may be too late to be of use... but... it looks like you're implementing abstract trees on top of concrete trees that are already defined in terms of abstract trees. Just use element trees directly instead of implementing trees with XML elements...

Paul Sweatte