1

Is there any way other than creating a method myself to write XML using python which are easily readable? xMLFile.write(xmlDom.toxml()) does create proper xml but reading them is pretty difficult. I tried toprettyxml but doesn't seem like it does much. e.g. following is what I would consider a human readable xml:

<myroot>
    <ent1
        att1="val1"
        att2="val2"
        att3="val3"
    />
    <ent2>
        <ent3
            att1="val1"
            att2="val2"
        />
    </ent2>
</myroot>

Solutions summary from the below responses:
1) If you just need to occassionaly read it for debugging then instead of formatting the xml while writing use xmllint or an xml editor to read it. Or
2) Use a library like Beautiful Soup. Or
3) Write your own method

dutch
  • 105
  • 1
  • 2
  • 5
  • What, in your opinion, makes an XML file "readable"? The term is very ambiguous. – Oded Apr 22 '10 at 08:00
  • Apologies. Am trying to paste an example. Just need to sort out how to put in tab indentation using the markup. – dutch Apr 22 '10 at 08:05
  • 1
    Do you just want this for easier debugging? I find myself using the "xmllint" program, with the "--format" option to throw my output through when I want to read it. – Matt Gibson Apr 22 '10 at 08:08
  • Thanks for the formatting. I need the xml to be parseable as well. Basically the python script reads the xml and after modifying / adding a few nodes writes it back. The script is supposed to do this daily. – dutch Apr 22 '10 at 08:12
  • Matt, Yes just to read occassionally if there is an issue with the application. xmllint while taking care of newline does not quite do the job if there are say 10 attributes to a node. – dutch Apr 22 '10 at 08:18

5 Answers5

6
print lxml.etree.tostring(root_node, pretty_print=True)
Walter
  • 7,809
  • 1
  • 30
  • 30
1

If you are using elementalTree this code will work.

XMLtext= ET.tostringlist(root)
OutputStr=''
level=-1

for t in XMLtext:

    if t[0]=='<':
            level=level+1
    if t[0:2]=='</':
            #remove the added level because of '<'
        level=level-1

    if t[-1]=='"' or t[0]=='<' or t[-2:]=='/>':
        t="\n"+"    "*level+t 

    if t[-2:]=='/>' or t[level*4+1:level*4+3]=='</': #end of block 
        level=level-1

    if t[-1:]=='>':    
        t=t+'\n'


    OutputStr=OutputStr+t

f = open("test.xml", "w")
f.write(OutputStr)
f.close()
PMN
  • 493
  • 5
  • 11
0

Unfortunately, XML readability by humans was not a design goal. Contrariwise, BeautifulStoneSoup makes an effort to prettify XML. It isn't great, but it is better than nothing.

msw
  • 42,753
  • 9
  • 87
  • 112
0

Write your own method for formatting the xml doc. For example this will produce the format of your example XML string:

rep={" ":" \n",
     "/>":"\n/>"}

outStr=xmlDom.toxml()

for k in rep:
        outStr=outStr.replace(k, rep[k])
zoli2k
  • 3,388
  • 4
  • 26
  • 36
0

I think the best way to do it, using python included batteries is the given in this answer: https://stackoverflow.com/a/1206856 .

import xml.dom.minidom

xml = xml.dom.minidom.parse(xml_fname) # or xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = xml.toprettyxml()
Community
  • 1
  • 1
tenuki
  • 73
  • 5