2

How can I express to PostgreSQL that I want values simultaneously from several hierarchical levels in an XPath query?

I have a document (in a PostgreSQL XML value) with a multi-level hierarchy. For this question, an example can be created with:

SELECT XMLPARSE(DOCUMENT '
    <parrots>
        <parrot name="Fred">
            <descriptor>Beautiful plumage</descriptor>
            <descriptor>Resting</descriptor>
        </parrot>
        <parrot name="Ethel">
            <descriptor>Pining for the fjords</descriptor>
            <descriptor>Stunned</descriptor>
        </parrot>
    </parrots>
    ') AS document
INTO TEMPORARY TABLE parrot_xml;

I can get different levels of information from that document.

=> SELECT
        (XPATH('./@name', parrot.node))[1] AS name
    FROM (             
        SELECT
            UNNEST(XPATH('./parrot', parrot_xml.document))
                AS node
        FROM parrot_xml
        ) AS parrot
    ;
 name  
-------
 Fred
 Ethel
(2 rows)

=> SELECT
        (XPATH('./text()', descriptor.node))[1] AS descriptor
    FROM (
        SELECT
            UNNEST(XPATH('./parrot/descriptor', parrot_xml.document))
                AS node
        FROM parrot_xml
        ) AS descriptor
    ;
      descriptor       
-----------------------
 Beautiful plumage
 Resting
 Pining for the fjords
 Stunned
(4 rows)

What I can't figure out, though, is how to get multiple levels joined, so that the query returns each descriptor related with the parrot to which it applies.

=> SELECT
        ??? AS name,
        ??? AS descriptor
    FROM
        ???
    ;
 name         descriptor       
------- -----------------------
 Fred    Beautiful plumage     
 Fred    Resting               
 Ethel   Pining for the fjords 
 Ethel   Stunned               
(4 rows)

How can this be done? What should go in place of the “???”s?

A single complex XPath query – but how to refer to multiple levels at once? Several XPath queries – but then how is the ancestor–descendant information preserved for the resulting relation? Something else?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
bignose
  • 30,281
  • 14
  • 77
  • 110

1 Answers1

4

Try this:

SELECT (xpath('./@name', parrot.node))[1] AS name
     , unnest(xpath('./descriptor/text()', parrot.node)) AS descriptor
FROM  (             
   SELECT unnest(xpath('./parrot', parrot_xml.document)) AS node
   FROM   parrot_xml
   ) parrot;

Produces exactly the requested output.

First, in the subquery, I retrieve whole parrot-nodes. One node per row.

Next, I get the name and the descriptors with xpath(). Both are arrays. I take the first (and only) element of name and split the descriptor array with `unnest(), thereby arriving at the desired result.

I wrote a comprehensive answer to a related question recently. May be of interest to you.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Very nice, but it also shows the limitations of handling XML with PostgreSQL. Hmmm. An `xslt_process()` with an `` option would be perfect :-). – Tim Landscheidt Feb 26 '12 at 15:30