5

I am recursing through an XML file using ElementTree.

import xml.etree.ElementTree as etree
tree = etree.parse('x.xml')
root = tree.getroot()
for child in root[0]:
    for child in child.getchildren():
        for child in child.getchildren():
            for child in child.getchildren():
                print(child.attrib)

What is the idiomatic way in Python to avoid these nested for loops?

  getchildren() ⇒ list of Element instances
    Returns all subelements. The elements are returned in document order.
    Returns: A list of subelements.

I saw some posts on SO, such as "Avoiding nested for loops", but they don't directly translate to my use case.

Thanks.

bsr
  • `itertools.product` is a nice way to avoid nested loops. Why doesn't that translate to your use? – David Cain Feb 08 '13 at 20:13
  • Do you specifically want attributes for elements 4 children deep? – bogatron Feb 08 '13 at 20:15
  • Sorry, I didn't mean that itertools.product doesn't suit me; I just couldn't translate that example to arrays like the ones in my case. I haven't done much Python, but I will try. – bsr Feb 08 '13 at 20:30

2 Answers

3

If you want to get the children that are n levels deep in the tree, and then iterate through them, you can do:

def childrenAtLevel(tree, n):
    # Yield every element exactly n levels below tree (n >= 1).
    if n == 1:
        # Iterating an Element yields its direct children.
        yield from tree
    else:
        for child in tree:
            yield from childrenAtLevel(child, n - 1)

Then, to get the elements four levels deep, you would simply say:

for e in childrenAtLevel(root, 4):
    print(e.attrib)  # do something with e
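
To make the depth counting concrete, here is a minimal, self-contained sketch. The XML string and its tag names are invented for illustration, and the generator is repeated (under the snake_case name children_at_level, which is my naming, not the answer's) so the block runs on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; tag and attribute names are made up.
SAMPLE = """
<root>
  <a id="a1">
    <b id="b1">
      <c id="c1"><d id="d1"/><d id="d2"/></c>
    </b>
    <b id="b2">
      <c id="c2"><d id="d3"/></c>
    </b>
  </a>
</root>
"""


def children_at_level(elem, n):
    """Yield every element exactly n levels below elem (n >= 1)."""
    if n == 1:
        # Iterating an Element yields its direct children.
        yield from elem
    else:
        for child in elem:
            yield from children_at_level(child, n - 1)


root = ET.fromstring(SAMPLE)

# Level 1 is <a>, level 2 is <b>, level 3 is <c>, level 4 is <d>.
ids_at_level_2 = [e.get("id") for e in children_at_level(root, 2)]
ids_at_level_4 = [e.get("id") for e in children_at_level(root, 4)]
print(ids_at_level_2)
print(ids_at_level_4)
```

Note that the loop in the question starts from root[0] rather than root, so passing root[0] as the first argument would start the descent from the first child only.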

Or, if you want to get all of the leaf nodes (i.e. the nodes that don't have any children themselves), you can do:

def getLeafNodes(tree):
    # An element with no children is a leaf.
    if len(tree) == 0:
        yield tree
    else:
        for child in tree:
            yield from getLeafNodes(child)
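
As a rough check of the leaf-node generator, the sketch below runs it on a small, made-up document and compares it with a non-recursive variant built on Element.iter(), which walks the whole subtree in document order. The sample XML and the snake_case name get_leaf_nodes are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; tag and attribute names are made up.
SAMPLE = (
    '<root>'
    '<a id="a1"><b id="b1"/><b id="b2"><c id="c1"/></b></a>'
    '<a id="a2"/>'
    '</root>'
)


def get_leaf_nodes(elem):
    """Yield every descendant (or elem itself) that has no children."""
    if len(elem) == 0:
        yield elem
    else:
        for child in elem:
            yield from get_leaf_nodes(child)


root = ET.fromstring(SAMPLE)

# Recursive generator.
leaf_ids = [e.get("id") for e in get_leaf_nodes(root)]

# Same result without explicit recursion: iter() visits root and all
# descendants in document order, and we keep the childless ones.
iter_leaf_ids = [e.get("id") for e in root.iter() if len(e) == 0]

print(leaf_ids)
print(iter_leaf_ids)
```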
Ord
2

itertools.chain.from_iterable flattens one level of nesting; you can use functools.reduce to apply it n times (see the related question, Compressing "n"-time object member call):

from itertools import chain
from functools import reduce

for child in reduce(lambda x, _: chain.from_iterable(x), range(3), root):
    print(child.attrib)

Note that getchildren() is deprecated (and was removed in Python 3.9); iterating over an element yields its children directly.
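
For comparison, here is a minimal sketch that wraps the reduce/chain idea in a helper and checks it against ElementTree's built-in path support, where each `*` step in findall() selects all children of the previous step. The helper name flatten_levels and the sample XML are assumptions for illustration, not part of the answer above.

```python
import xml.etree.ElementTree as ET
from functools import reduce
from itertools import chain

# Hypothetical sample document; tag and attribute names are made up.
SAMPLE = """
<root>
  <a id="a1">
    <b id="b1"><c id="c1"><d id="d1"/><d id="d2"/></c></b>
    <b id="b2"><c id="c2"><d id="d3"/></c></b>
  </a>
</root>
"""


def flatten_levels(elem, depth):
    """Return an iterable over the elements depth levels below elem.

    Iterating elem already yields level 1, so chain.from_iterable is
    applied depth - 1 times to descend the remaining levels.
    """
    return reduce(lambda it, _: chain.from_iterable(it), range(depth - 1), elem)


root = ET.fromstring(SAMPLE)

via_reduce = [e.get("id") for e in flatten_levels(root, 4)]

# ElementTree's limited XPath: four "*" steps means four levels down.
via_findall = [e.get("id") for e in root.findall("*/*/*/*")]

print(via_reduce)
print(via_findall)
```

If the depth is fixed and known, the findall() form is arguably the shortest; the reduce form is handy when the depth is a variable. As in the question, passing root[0] instead of root would start the descent from the first child only.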

ecatmur