access ElementTree node parent node

Question

I am using the builtin Python ElementTree module. It is straightforward to access children, but what about parent or sibling nodes? - can this be done efficiently without traversing the entire tree?

See http://stackoverflow.com/questions/374245/how-to-retrieve-the-parent-node-using-celementtree — kennytm, Jan 31 '10 at 05:53

score 68 · Accepted Answer · edited Sep 16 '20 at 12:44

68

There's no direct support in the form of a parent attribute, but you can perhaps use the patterns described here to achieve the desired effect. The following one-liner is suggested (updated from the linked-to post to Python 3.8) to create a child-to-parent mapping for a whole tree, using the method xml.etree.ElementTree.Element.iter:

parent_map = {c: p for p in tree.iter() for c in p}

edited Sep 16 '20 at 12:44

0 _

10,524
11
77
109

answered Jan 31 '10 at 08:04

Vinay Sajip

95,872
14
179
191

5

Syntax update, 2017 / python3 `parent_map = {(c,p) for p in tree.iter( ) for c in p}` – gerardw Sep 07 '17 at 22:02
8

Correction: `parent_map = {c:p for p in root.iter( ) for c in p}` – gerardw Sep 07 '17 at 22:10
1

What about if you cannot read the whole XML file in one go, but must iterate over a file with iter()? – GettingItDone Aug 03 '20 at 13:10

score 32 · Answer 2 · edited May 23 '17 at 11:54

32

Vinay's answer should still work, but for Python 2.7+ and 3.2+ the following is recommended:

parent_map = {c:p for p in tree.iter() for c in p}

getiterator() is deprecated in favor of iter(), and it's nice to use the new dict list comprehension constructor.

Secondly, while constructing an XML document, it is possible that a child will have multiple parents, although this gets removed once you serialize the document. If that matters, you might try this:

parent_map = {}
for p in tree.iter():
    for c in p:
        if c in parent_map:
            parent_map[c].append(p)
            # Or raise, if you don't want to allow this.
        else:
            parent_map[c] = [p]
            # Or parent_map[c] = p if you don't want to allow this

edited May 23 '17 at 11:54

Community

1
1

answered Nov 21 '13 at 21:31

supergra

1,578
13
19

2

What if you don't have access to the tree? Like after a .find() – Brett Jul 27 '15 at 19:38
1

I don't know of any way to get the root node (and thus parents/ancestors) if you didn't save a reference to it. But I don't understand how `.find()` has anything to do with that. – supergra Jul 28 '15 at 20:20
i just used `.find()` as an example function that just returns an element – Brett Jul 29 '15 at 21:39

score 27 · Answer 3 · answered Oct 22 '15 at 12:18

27

You can use xpath ... notation in ElementTree.

<parent>
     <child id="123">data1</child>
</parent>

xml.findall('.//child[@id="123"]...')
>> [<Element 'parent'>]

answered Oct 22 '15 at 12:18

josven

399
4
7

This is fantastic solution, works with find() also if you know there's just a single element that you are looking for. Like so: `root.find(".//*[@testname='generated_sql']...")` – Bostone Sep 08 '17 at 17:24
2

I could not find anything about this `...` XPath syntax. What does it do? Are there docs on it? – raphinesse May 23 '18 at 17:29
1

@raphinesse `...` expression comes from XPath 1.0. Python Std Library have limited support for XPath expressions, lxml have more support. – josven Aug 22 '18 at 13:05
The code in the answer does work, but I cannot find any reference to this "triple dot" syntax anywhere. It is not mentioned in the XPath 1.0 recommendation. – mzjn Apr 09 '20 at 08:07
What about elements that do not have an `id` attribute? – 0 _ Sep 16 '20 at 12:37
1

@ioannis-filippidis Oh, you just need a valid XPath followed by an ... You can use any attribute All children: `xml.findall('.//child...')` Some other attribute: `xml.findall('.//child[@other="123"]...')` – josven Sep 17 '20 at 13:06
4

**Attention:** This code works perfectly with only **two** dots. There is no such thing as "triple dot syntax". It's not [in the docs](https://docs.python.org/3/library/xml.etree.elementtree.html#supported-xpath-syntax), as others mentioned. It's just a combination of `.` (select current node) and `..` (get parent). – Nat Riddle May 19 '21 at 21:55

score 12 · Answer 4 · edited May 23 '17 at 12:26

12

As mentioned in Get parent element after using find method (xml.etree.ElementTree) you would have to do an indirect search for parent. Having xml:

<a>
 <b>
  <c>data</c>
  <d>data</d>    
 </b>
</a>

Assuming you have created etree element into xml variable, you can use:

 In[1] parent = xml.find('.//c/..')
 In[2] child = parent.find('./c')

Resulting in:

Out[1]: <Element 'b' at 0x00XXXXXX> 
Out[2]: <Element 'c' at 0x00XXXXXX>

Higher parent would be found as:secondparent=xml.find('.//c/../..') being <Element 'a' at 0x00XXXXXX>

edited May 23 '17 at 12:26

Community

1
1

answered Nov 24 '15 at 22:19

Vaasha

881
1
10
19

this is brilliant, would upvote more if I could – aclong Jul 06 '22 at 14:51

sashoalm · Answer 5 · 2022-05-08T16:09:28.883

Pasting here my answer from https://stackoverflow.com/a/54943960/492336:

I had a similar problem and I got a bit creative. Turns out nothing prevents us from adding the parent info ourselves. We can later strip it once we no longer need it.

def addParentInfo(et):
    for child in et:
        child.attrib['__my_parent__'] = et
        addParentInfo(child)

def stripParentInfo(et):
    for child in et:
        child.attrib.pop('__my_parent__', 'None')
        stripParentInfo(child)

def getParent(et):
    if '__my_parent__' in et.attrib:
        return et.attrib['__my_parent__']
    else:
        return None

# Example usage

tree = ...
addParentInfo(tree.getroot())
el = tree.findall(...)[0]
parent = getParent(el)
while parent:
    doSomethingWith(parent)
    parent = getParent(parent)
stripParentInfo(tree.getroot())

score 6 · Answer 6 · answered Jul 04 '18 at 07:56

The XPath '..' selector cannot be used to retrieve the parent node on 3.5.3 nor 3.6.1 (at least on OSX), eg in interactive mode:

import xml.etree.ElementTree as ET
root = ET.fromstring('<parent><child></child></parent>')
child = root.find('child')
parent = child.find('..') # retrieve the parent
parent is None # unexpected answer True

The last answer breaks all hopes...

score 3 · Answer 7 · answered Jan 03 '21 at 21:27

3

Got an answer from

https://towardsdatascience.com/processing-xml-in-python-elementtree-c8992941efd2

Tip: use '...' inside of XPath to return the parent element of the current element.


for object_book in root.findall('.//*[@name="The Hunger Games"]...'):
    print(object_book)

answered Jan 03 '21 at 21:27

Skanda

51
6

This is the same answer as https://stackoverflow.com/a/33280875/407651. – mzjn Jan 06 '21 at 17:10
stackoverflow.com/a/33280875/407651 doesn't say what ... is, This answer does. – Ted Shaneyfelt Jul 05 '21 at 06:47

Shadow · Answer 8 · 2018-03-08T04:13:38.287

0

If you are using lxml, I was able to get the parent element with the following:

parent_node = next(child_node.iterancestors())

This will raise a StopIteration exception if the element doesn't have ancestors - so be prepared to catch that if you may run into that scenario.

edited Mar 08 '18 at 04:13

answered Dec 04 '14 at 04:04

Shadow

8,749
4
47
57

4

element object in lxml has getparent() method – Alexey Shrub Feb 05 '22 at 16:49

score 0 · Answer 9 · answered Jul 01 '23 at 22:46

Most solutions posted so far

either use XPath… but Python does not support finding ancestors with XPath in general (see comment),
or post-process the whole tree after it is built (e.g. this answer or that one)… but this requires parsing and building the whole tree, which might be undesirable with large XML data (e.g. Wikipedia dumps).

If you are parsing XML incrementally, say with xml.etree.ElementTree.iterparse or xml.etree.ElementTree.XMLPullParser, you can keep track of the current path (up from the root node down to the current node) by tracking the opening and closing of tags (start and end events). Example:

import xml.etree.ElementTree as ET

current_path = [ ]

for event, elem in ET.iterparse('test.xml', events=['start', 'end']):
    # opening tag:
    if event == 'start':
        current_path.append(elem)
    # closing tag:
    else:
        assert event == 'end'
        assert len(current_path) > 0 and current_path[-1] is elem
        current_path.pop()
        parent = current_path[-1] if len(current_path) > 0 else None
        # `elem` is the current element (fully built),
        # `parent` is its parent (some of its children after `elem`
        # might not have been parsed yet)
        #
        # ... do something ...

score -1 · Answer 10 · answered Feb 23 '14 at 02:41

-1

Another way if just want a single subElement's parent and also known the subElement's xpath.

parentElement = subElement.find(xpath+"/..")

answered Feb 23 '14 at 02:41

MK at Soho

322
1
3
10

11

Doesn't work for me, I get 'None' - same if i just use `subElement.find('..')`. – damian Jan 21 '15 at 14:44
2

Assumes that a variable called `xpath` already exists, so it's not helpful for most people. – ArtOfWarfare May 22 '20 at 18:20

score -1 · Answer 11 · edited Sep 21 '22 at 15:02

-1

import xml.etree.ElementTree as ET

f1 = "yourFile"

xmlTree = ET.parse(f1)

for root in xmlTree.getroot():
    print(root.tag)

edited Sep 21 '22 at 15:02

Michael M.

10,486
9
18
34

answered Sep 16 '22 at 06:42

adnan

1

1

Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Sep 20 '22 at 21:26

score -2 · Answer 12 · answered Dec 13 '17 at 23:29

-2

Look at the 19.7.2.2. section: Supported XPath syntax ...

Find node's parent using the path:

parent_node = node.find('..')

answered Dec 13 '17 at 23:29

Alf

13
1

5

Did you test this? If you were able to make it work, please post a complete code example that demonstrates it. See this comment: https://stackoverflow.com/questions/2170610/access-elementtree-node-parent-node#comment44519212_21963494 – mzjn Dec 14 '17 at 07:59
8

The Python 3 documentation says: "Returns `None` if the path attempts to reach the ancestors of the start element (the element `find` was called on)." (https://docs.python.org/3/library/xml.etree.elementtree.html#supported-xpath-syntax). – mzjn Dec 14 '17 at 08:36
Works for me. The best and most consise answer. – ToTenMilan Feb 06 '18 at 16:48

access ElementTree node parent node

12 Answers12

Linked

Related