I am trying to write to an XML file. I have changed a specific element in my code and can print the new value successfully. I need to write it back to the file without changing the structure of the file.
My code:

import os
from lxml import etree


directory = '/Users/eeamesX/work/data/expert/EFTlogs/20160725/IT'
XMLParser = etree.XMLParser(remove_blank_text=True)
for f in os.listdir(directory):
    if f.endswith(".xml"):

        xmlfile = directory + '/' + f



        tree = etree.parse(xmlfile, parser=XMLParser)
        root = tree.getroot()

        hardwareRevisionNode = root.find(".//hardwareRevision")



        if hardwareRevisionNode.text == "5":
            print " "
            print "Old Tag: " + hardwareRevisionNode.text

            x = hardwareRevisionNode.text = "DVT2"

            print "New Tag " +  hardwareRevisionNode.text

When I try various methods of opening and closing the file, it just deletes all the data in the XML file. Using this method:

outfile = open(xmlfile, 'w')
outfile.write(etree.tostring(tree))
outfile.close()

changed the structure of my file to be one long line.

Anekdotin
  • There are pretty printers or indenters in lxml. One long string is perfectly valid xml. See this question http://stackoverflow.com/questions/749796/pretty-printing-xml-in-python – WombatPM Aug 26 '16 at 18:15

2 Answers

To get newlines in the output file, it looks like you need to pass pretty_print=True to your serialization (write or tostring) call.

Side note: normally, when you open files in Python, you open them like this:

with open('filename.ext', 'mode') as myfile:
    myfile.write(mydata)

This way there is less risk of file descriptor leaks. The tree.write("filename.xml") method looks like a nice and easy way to avoid dealing with the file entirely.
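As a rough illustration of the pretty_print suggestion, here is a minimal sketch. It assumes lxml is installed and uses a made-up sample document (the `device` and `info` element names are hypothetical). It is written with Python 3 syntax, where `etree.tostring` returns bytes, so a file receiving that output would need to be opened in `'wb'` mode.

```python
from lxml import etree

# Hypothetical sample document; only hardwareRevision comes from the question.
xml_text = "<device><info><hardwareRevision>5</hardwareRevision></info></device>"

# remove_blank_text=True drops ignorable whitespace, which lets
# pretty_print re-indent the tree consistently on output.
parser = etree.XMLParser(remove_blank_text=True)
root = etree.fromstring(xml_text, parser)

# Default serialization: everything ends up on a single line.
compact = etree.tostring(root)

# With pretty_print=True, lxml adds newlines and indentation.
pretty = etree.tostring(root, pretty_print=True)

print(compact.decode())
print(pretty.decode())
```

The same keyword is accepted by `tree.write(path, pretty_print=True)`. Note that parsing with `remove_blank_text=True` discards the file's original indentation, which is a likely reason the output collapsed to one line when written without `pretty_print`.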

pnovotnak
  • tree = etree.parse(xmlfile, parser=XMLParser) root = tree.getroot() tree.write(xmlfile) hardwareRevisionNode = root.find(".//hardwareRevision") – Anekdotin Aug 26 '16 at 18:01
  • Changing to that results in the file being made into one line, and the text not being replaced – Anekdotin Aug 26 '16 at 18:02
  • In the above snippet you are re-saving the original tree. See my edited answer for the newline problem. – pnovotnak Aug 26 '16 at 18:06
  • It corrected the one line, but it doesn't write to the actual file. It still says 5 instead of DVT3 – Anekdotin Aug 26 '16 at 18:14
  • tree = etree.parse(xmlfile, parser=XMLParser) root = tree.getroot() tree.write(xmlfile, pretty_print=True) hardwareRevisionNode = root.find(".//hardwareRevision") – Anekdotin Aug 26 '16 at 18:16

If you want to replace a value within an existing XML file, use:

tree.write(xmlfile) 

Currently you are overwriting your file entirely by using open(), which is the wrong tool here. tree.write() is normally what you would want to use. It might look something like this:

tree = etree.parse(xmlfile, parser=XMLParser)
root = tree.getroot()
hardwareRevisionNode = root.find(".//hardwareRevision")
if hardwareRevisionNode.text == "5":
    print "Old Tag: " + hardwareRevisionNode.text
    hardwareRevisionNode.text = "DVT2"
    print "New Tag: " +  hardwareRevisionNode.text
    tree.write(xmlfile)

https://docs.python.org/2/library/xml.etree.elementtree.html
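To make the order of operations concrete (modify the node first, then write the tree), here is a self-contained sketch. It swaps in the standard-library `xml.etree.ElementTree` from the link above instead of lxml, uses Python 3 syntax, and runs against a throwaway temporary directory. The `update_revision` helper and the sample file contents are hypothetical, not part of the original code.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def update_revision(directory, old="5", new="DVT2"):
    """Rewrite <hardwareRevision> in each .xml file; return the changed paths."""
    changed = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".xml"):
            continue
        path = os.path.join(directory, name)
        tree = ET.parse(path)
        node = tree.getroot().find(".//hardwareRevision")
        # find() returns None when the element is missing, so guard first.
        if node is not None and node.text == old:
            node.text = new
            # Write only after the change; writing earlier re-saves the old value.
            tree.write(path)
            changed.append(path)
    return changed


# Demo on a temporary directory with one hypothetical sample file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "sample.xml")
with open(demo_path, "w") as fh:
    fh.write("<device>\n  <hardwareRevision>5</hardwareRevision>\n</device>\n")

changed_files = update_revision(demo_dir)
with open(demo_path) as fh:
    result = fh.read()
print(result)
```

Because no whitespace-stripping parser option is used here, the indentation inside the root element is carried through the parse and written back unchanged.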

l'L'l