I have an XML file which looks like this:

<Main>

    <Stuff author="Jojo" name="Thing 1">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description" />
        <Attr name="version" value="1.0.0" />
        <Attr name="software" value="Misrocoft Ociffe" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

    <Stuff author="Toto" name="Thing 2">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

</Main>

I update it and then write it back out. The problem is that if I write it with toprettyxml(), I get extra blank lines between the existing lines, like this:

<Main>


    <Stuff author="Jojo" name="Thing 1">


        <Attr name="annotation" value="Short description" />


        <Attr name="description" value="Long description" />


        <Attr name="version" value="1.0.0" />


        <Attr name="software" value="Misrocoft Ociffe" />


        <Attr name="language" value="Python" />


        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />


        <Attr name="command" value="doSomething()" />


    </Stuff>

    <Stuff author="Toto" name="Thing 2">


        <Attr name="annotation" value="Short description" />


        <Attr name="description" value="Long description"/>


        <Attr name="version" value="4.3.9" />


        <Attr name="software" value="Tophoshop" />


        <Attr name="language" value="Python" />


        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />


        <Attr name="command" value="doSomething()" />


    </Stuff>

    <Stuff author="Titi" name="New thing">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

</Main>

And if I write it with toxml(), the existing lines are untouched but the new element gets no indentation or line breaks at all, like this:

<Main>

    <Stuff author="Jojo" name="Thing 1">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description" />
        <Attr name="version" value="1.0.0" />
        <Attr name="software" value="Misrocoft Ociffe" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

    <Stuff author="Toto" name="Thing 2">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

<Stuff author="Titi" name="New thing"><Attr name="annotation" value="Short description" /><Attr name="description" value="Long description"/><Attr name="version" value="4.3.9" /><Attr name="software" value="Tophoshop" /><Attr name="language" value="Python" /><Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" /><Attr name="command" value="doSomething()" /></Stuff></Main>

Is there a way to output pretty XML without modifying the existing formatting of the file?
I was thinking of collapsing the XML into a one-line string and then re-writing it with toprettyxml(), but I don't know how to do that or whether it is possible (I'm using etree and minidom, for info).


Update (answer):

Here is the code I ended up with. Note that rootXml is an ElementTree element.

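from xml.dom import minidom
import xml.etree.ElementTree as ET

def writeXml(rootXml, xmlFile):
    # Serialize the ElementTree root; encoding='unicode' returns str rather than bytes.
    roughString = ET.tostring(rootXml, encoding='unicode')
    # Strip every line and join them, so no old indentation or blank lines remain.
    oneLineString = ''.join(s.strip() for s in roughString.splitlines())

    # Let minidom re-indent the collapsed document from scratch.
    minidomXml = minidom.parseString(oneLineString)
    rootMinidom = minidomXml.documentElement

    prettyXmlString = rootMinidom.toprettyxml()
    prettyXml = ET.fromstring(prettyXmlString)

    with open(xmlFile, "w", encoding="utf-8") as f:
        f.write(ET.tostring(prettyXml, encoding='unicode'))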

It writes the following XML:

<Main>
    <Stuff author="Jojo" name="Thing 1">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description" />
        <Attr name="version" value="1.0.0" />
        <Attr name="software" value="Misrocoft Ociffe" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>
    <Stuff author="Toto" name="Thing 2">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>
    <Stuff author="Titi" name="New thing">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>
</Main>
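
A variation on the same idea, offered only as a sketch: instead of collapsing the serialized string line by line, the whitespace-only text nodes can be removed from the minidom tree before calling toprettyxml(). The helper names strip_whitespace_nodes and pretty_without_blank_lines and the shortened sample XML are illustrative assumptions, not part of the code above.

```python
from xml.dom import minidom

def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    for child in list(node.childNodes):  # copy: the child list is mutated below
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)

def pretty_without_blank_lines(xml_text, indent="    "):
    """Return xml_text re-indented by minidom, without stray blank lines."""
    doc = minidom.parseString(xml_text)
    strip_whitespace_nodes(doc)
    # Serializing the root element (not the document) omits the XML declaration.
    return doc.documentElement.toprettyxml(indent=indent)

# Hypothetical input: one already-indented element plus one appended on a single line.
sample = """<Main>

    <Stuff author="Jojo" name="Thing 1">
        <Attr name="version" value="1.0.0" />
    </Stuff>

<Stuff author="Titi" name="New thing"><Attr name="version" value="4.3.9" /></Stuff></Main>"""

result = pretty_without_blank_lines(sample)
print(result)
```

Unlike stripping every line of the serialized string, this leaves text nodes with real content untouched, although toprettyxml() may still add whitespace around such text.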
UKDP
  • 2
    you are wasting your time. White spaces between tags aren't important. Just ignore them – e4c5 May 13 '16 at 12:28
  • 1
    It's for readability mainly. ^^ – UKDP May 13 '16 at 12:33
  • 2
    @e4c5 So then why do pretty XML at all? Just put it all into a file without whitespace between tags. That'll make it easy to read by a human... – Tom Myddeltyn May 13 '16 at 12:35
  • @busfault Oh I am sorry I didn't realize that there were people who actually read xml when others read comics – e4c5 May 13 '16 at 12:37
  • @e4c5 I guess you have never had the pleasure of trying to debug someone else's messed up XML. Your code must be wonderfully easy to debug. – Tom Myddeltyn May 13 '16 at 12:39
  • @busfault use an XML viewer for that. For them it doesn't matter whether you have no space or 100 spaces. And they give you nice syntax highlighting too. – e4c5 May 13 '16 at 12:41
  • @busfault I am sorry if it was interpreted as rude but that was just stating the facts. – e4c5 May 13 '16 at 12:43
  • @UKDP Sorry for hijacking your post. Anyway. can you post some of your code that you are doing the writing with? – Tom Myddeltyn May 13 '16 at 12:48
  • OK, I actually found a way to do it which seems fast and neat. I'd like to answer my own question but can't find where to do that... If someone can tell me, great; otherwise I'll just update my question. – UKDP May 13 '16 at 13:35

1 Answer

As far as I can tell, there is no clean way to fix minidom's toprettyxml() *. Probably the easiest alternative is BeautifulSoup's prettify(). For example, prettify() splits your single-line Stuff element onto separate, indented lines:

>>> from bs4 import BeautifulSoup
>>> raw = '''<Stuff author="Titi" name="New thing"><Attr name="annotation" value="Short description" /><Attr name="description" value="Long description"/><Attr name="version" value="4.3.9" /><Attr name="software" value="Tophoshop" /><Attr name="language" value="Python" /><Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" /><Attr name="command" value="doSomething()" /></Stuff>'''
>>> soup = BeautifulSoup(raw, "xml")
>>> print(soup.prettify())
<?xml version="1.0" encoding="utf-8"?>
<Stuff author="Titi" name="New thing">
 <Attr name="annotation" value="Short description"/>
 <Attr name="description" value="Long description"/>
 <Attr name="version" value="4.3.9"/>
 <Attr name="software" value="Tophoshop"/>
 <Attr name="language" value="Python"/>
 <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext"/>
 <Attr name="command" value="doSomething()"/>
</Stuff>
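
If the file is already being edited with ElementTree and Python 3.9 or later is available, the standard library's xml.etree.ElementTree.indent() may be another option, since it overwrites the whitespace between elements in place rather than adding to it. A minimal sketch under that version assumption; the shortened sample string and the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one already-indented element plus one appended on a single line.
raw = (
    '<Main>\n\n'
    '    <Stuff author="Jojo" name="Thing 1">\n'
    '        <Attr name="version" value="1.0.0" />\n'
    '    </Stuff>\n\n'
    '<Stuff author="Titi" name="New thing">'
    '<Attr name="version" value="4.3.9" /></Stuff></Main>'
)

root = ET.fromstring(raw)
# indent() (Python 3.9+) replaces whitespace-only text and tails between elements,
# so old indentation and blank lines are overwritten instead of doubled.
ET.indent(root, space="    ")
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

This normalizes the whole tree to one indentation style, so blank lines that separated elements in the original file are not kept.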

*) References :

har07