
I would like to parse nested elements. I do not mind using XPath or Element. For example, a few of the values I would like to print are at:

>>> root[0][0][0][0][0].tag
'{http://www.domain.com/somepath/Schema}element'
>>> root[0][0][0][0][0].text
'findme'

What would be the ideal method to iterate through the XML document, parse, and print the element values? Here is an example of the schema I am working with.

<?xml version="1.0" encoding="UTF-8"?>
<data xsi:schemaLocation="http://www.domain.com/somepath/Schema file.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.domain.com/somepath/Schema">
    <one stuff0="" stuff1="">
        <two stuff0="" stuff1="">
            <three>
                <four stuff0="234234" stuff1="234324">
                    <element>findme</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme2</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme3</element>
                </four>
            </three>
        </two>  
    </one>
    <one stuff0="" stuff1="">
        <two stuff0="" stuff1="">
            <three>
                <four stuff0="234234" stuff1="234324">
                    <element>findme4</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme5</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme6</element>
                </four>
            </three>
        </two>  
    </one>
</data>

I have tried the following, but no results are returned. Even if this did work, it would not see the elements under root1[0]1[0][0] and so on:

>>> for tagname in root[0][0][1][0][0].findall('element'):
...   name = tree.get('element')
...   print name
...
>>>

Per this question, I have also tried the following without success:

>>> for elem in doc.findall('one/two/three/four'):
...     print value.get('stuff1'), elem.text
...
>>>

Problem found:

The element was not being read because I had not specified the namespace, which I learned after reading "Need Help using XPath in ElementTree". So the following example works:

>>> import xml.etree.cElementTree as ET
>>> for event, element in ET.iterparse("schema.xml"):
...     if element.tag == "{http://www.domain.com/somepath/Schema}element":
...         print element.text
...
findme
findme2
findme3
findme4
findme5
findme6
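
For comparison, here is a minimal sketch of how the earlier `findall` attempt might be made to work once every path step is namespace-qualified. It assumes Python 3 and the standard `xml.etree.ElementTree` module; the prefix `s`, the names `XML`, `NS` and `texts`, and the trimmed sample document are illustrative choices, not part of the original file.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the document from the question (assumed sample data).
XML = """<data xmlns="http://www.domain.com/somepath/Schema">
    <one><two><three>
        <four stuff0="234234" stuff1="234324"><element>findme</element></four>
        <four stuff0="234234" stuff1="234324"><element>findme2</element></four>
    </three></two></one>
    <one><two><three>
        <four stuff0="234234" stuff1="234324"><element>findme4</element></four>
    </three></two></one>
</data>"""

# The prefix "s" is an arbitrary local alias; only the URI has to match
# the xmlns declared on <data>.
NS = {"s": "http://www.domain.com/somepath/Schema"}

root = ET.fromstring(XML)

# Explicit path from the root, with every step namespace-qualified.
for four in root.findall("s:one/s:two/s:three/s:four", NS):
    print(four.get("stuff1"), four.find("s:element", NS).text)

# Or search for <element> at any depth below the root.
texts = [e.text for e in root.findall(".//s:element", NS)]
print(texts)
```

The unqualified path `one/two/three/four` matches nothing because the default `xmlns` puts every element in the document into that namespace, so each step has to carry it as well.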
  • Either of the libraries you link to is fine. – Patashu Mar 28 '13 at 03:36
  • I recommend [`cElementTree`](http://effbot.org/zone/celementtree.htm) over the `elementtree` module. It's compiled C code, so it runs a little faster and uses less memory, but it has a very similar interface to `elementtree`. –  Mar 28 '13 at 03:45

1 Answer


Without seeing your XML document I can't be sure, but I think what you want to do is:

test.xml

<?xml version="1.0"?>
<root>
  <group>
    <element>This is the first text</element>
  </group>
  <group>
    <element>This is the second text</element>
  </group>
  <group>
    <element>This is the third text</element>
  </group>
</root>

test.py

import xml.etree.cElementTree as ET

for event, element in ET.iterparse("test.xml"):
    if element.tag == "element":
        print element.text

Running those files in a terminal I get:

mike@tester:~$ python test.py
This is the first text
This is the second text
This is the third text
  • This is what I am looking for though it did not print output. I have updated the question with an example of the schema. – Astron Mar 28 '13 at 04:25
  • My issue was namespace usage (updated in question). How might I handle a namespace in your example? – Astron Mar 29 '13 at 01:13
  • To handle a namespace, use `"{namespace-uri}element"` instead of `"element"`, or alternatively `ET.QName("namespace-uri", "element")`. –  Mar 29 '13 at 01:21
  • That did the trick, question updated with your answer/namespace usage. Thanks – Astron Mar 29 '13 at 02:31
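
The namespace handling discussed in the comments above could look roughly like this with `iterparse`. This is a sketch assuming Python 3 and `xml.etree.ElementTree` (the separate `cElementTree` module no longer exists in recent Python versions); `NS_URI`, `XML`, `wanted` and `found` are illustrative names, and the inline sample stands in for the real file.

```python
import io
import xml.etree.ElementTree as ET

NS_URI = "http://www.domain.com/somepath/Schema"

# Small stand-in for schema.xml (assumed sample data).
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<data xmlns="http://www.domain.com/somepath/Schema">
  <one><two><three>
    <four><element>findme</element></four>
    <four><element>findme2</element></four>
  </three></two></one>
</data>"""

# QName builds the "{namespace-uri}element" string, so it does not
# have to be assembled by hand.
wanted = ET.QName(NS_URI, "element").text

found = []
# iterparse accepts a filename or a file object; by default it yields
# an "end" event once each element has been fully read.
for event, element in ET.iterparse(io.BytesIO(XML)):
    if element.tag == wanted:
        found.append(element.text)

print(found)
```

Comparing against the fully qualified tag is the same check as in the question's working example; `QName` only saves writing the braces by hand.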