I would like to have a select statement on an XML document and one column should return me the path of each node.
For example, given the data
SELECT *
FROM TABLE(XMLSequence(
XMLTYPE('<?xml version="1.0"?>
<users><user><name>user1</name></user>
<user><name>user2</name></user>
<group>
<user><name>user3</name></user>
</group>
<user><name>user4</name></user>
</users>').extract('/*//*[text()]'))) t;
Which results in
column_value
--------
<user><name>user1</name></user>
<user><name>user2</name></user>
<user><name>user3</name></user>
<user><name>user4</name></user>
I'd like to have a result like this:
path value
------------------------ --------------
/users/user/name user1
/users/user/name user2
/users/group/user/name user3
/users/user/name user4
I can not see how to get to this. I figure there are two thing that have to work together properly:
- Can I extract the
path
from anXMLType
with a single operation or method, or do I have to do this with string-magic? - What is the correct XPath expression so that I do get the whole element path (if thats possible), eg.
<users><group><user><name>user3</name></user></group></user>
insead of<user><name>user3</name></user>
?
Maybe I am not understanding XMLType
fully, yet. It could be I need a different approach, but I can not see it.
Sidenotes:
- In the final version the XML document will be coming from CLOBs of a table, not a static document.
- The
path
column can of course also use dots or whatever and the initial slash is not the issue, any representation would do. - Also I would not mind if every inner node also gets a result row (possibly with
null
asvalue
), not only the ones withtext()
in it (which is what I am really interested in). - In the end I will need the tail element of
path
separate (always"name"
in the example here, but this will vary later), i.e.('/users/groups/user', 'name', 'user3')
, I can deal with that separately.