
Please note: I am a novice Python user.

Hi,

I am working with an XML file larger than 1 GB, using Python 2.7. Initially I was using 'iter' to parse the XML. It worked fine with small files, but with a file this big I was getting a memory error. I then read the documentation, found that 'iter' loads the whole file into memory at once, and learned that I should use iterparse instead. With iterparse I am able to load the XML file and modify it while parsing.
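
For reference, here is a minimal sketch of the streaming pattern described above. It assumes the `<users>`/`<user>` layout shown further down, and `count_users` is a hypothetical helper name, not part of any library. As far as I understand, iterparse only keeps memory low if finished elements are cleared, which the sketch does through the root element. It uses `xml.etree.ElementTree`; on Python 2.7 `cElementTree` should be usable the same way.

```python
import io
import xml.etree.ElementTree as ET


def count_users(source):
    """Count <user> elements without keeping the whole tree in memory."""
    count = 0
    # Request "start" events as well, so the first event hands us the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == "user":
            count += 1
            # Drop the finished <user> from the root so the tree stops growing.
            root.clear()
    return count


# Small in-memory stand-in for the real 1 GB file.
sample = io.BytesIO(b"<users><user pin='1'/><user pin='2'><a/></user></users>")
print(count_users(sample))
```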

The problem I am facing now is how to write the parsed element tree to a file. The methods I found on Google suggest the 'write' method of ElementTree, which applies to a tree parsed using 'iter', but mine is parsed using iterparse.

Below is my code snippet. I have left out some lines because the inner logic of the code is pretty big. The only part where I am struggling is writing the updated tree into the 'output_pre' file.

The structure of my xml file is like this:

<users>

<user pin=''>
</user>

<user pin=''>
</user>

</users>

Code (inner logic has been removed):

----------------Parser---------------------------

import xml.etree.cElementTree as ET2
import xml.etree.ElementTree as ET
from xml.etree.ElementTree import Element

output_pre = open("pre_ouput.xml", 'w')
tree = ET2.iterparse("temp-output-preliminary.xml")
for event, elem in tree:
    if elem.tag == "users":
        pass
    if elem.tag == "user":
        userContent = list(elem)
        #Number of children will help filter dummy users in user-state file.
        numberOfChildren = len(userContent)
        #assert numberOfChildren != 3
        PIN = elem.get('pin')
        assert PIN is not None
        analysing += 1
        logger.info ("Analysing user number: %d", analysing)
        if numberOfChildren <= 2:
            pass  # inner logic removed
        if numberOfChildren >= 4:
            pass  # inner logic removed
        if numberOfChildren == 3:
            for e in ids:
                node = ET2.Element("property", {eid: PROV_DATA})
                elem.append(node)
                container_id_set.add(e)
tree.write(output_pre, encoding='unicode')
output_pre.write("\n</perk-users>")
output_pre.close()
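
One possible way around the missing 'write' method, sketched under assumptions rather than offered as a confirmed fix: serialize each finished `<user>` with `ET.tostring` and write it out immediately, emitting the root's opening and closing tags by hand. The names `rewrite_users` and `extra_attrib` are hypothetical, the appended `<property>` attributes are placeholders for the real values, and the sketch assumes the root element has no attributes that need to be preserved and that `<user>` elements are not nested.

```python
import io
import xml.etree.ElementTree as ET


def rewrite_users(source, dest, extra_attrib):
    """Stream <user> elements from source to dest, adding a <property> child to each."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    # Write the root's opening tag by hand (root attributes are not copied here).
    dest.write(b"<" + root.tag.encode("utf-8") + b">\n")
    for event, elem in context:
        if event == "end" and elem.tag == "user":
            elem.append(ET.Element("property", extra_attrib))
            elem.tail = "\n"  # one <user> per output line
            # Default tostring() output is ASCII bytes with no XML declaration.
            dest.write(ET.tostring(elem))
            # Release the element that was just written.
            root.clear()
    dest.write(b"</" + root.tag.encode("utf-8") + b">\n")


# Small in-memory stand-ins for the real input and output files.
src = io.BytesIO(b"<users><user pin='1'><a/><b/><c/></user><user pin='2'/></users>")
out = io.BytesIO()
rewrite_users(src, out, {"eid": "placeholder"})
print(out.getvalue().decode("utf-8"))
```

With real files, the destination would need to be opened in binary mode (`'wb'`), since `ET.tostring` returns bytes by default.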

Thanks!

rapport89
  • Possible duplicate of [Using python ElementTree's itertree function and writing modified tree to output file](http://stackoverflow.com/questions/15399904/using-python-elementtrees-itertree-function-and-writing-modified-tree-to-output) – wwii Apr 02 '17 at 17:42
