
Yesterday I asked how to replace text on a node with children using minidom.

Today I'm also trying to replace <node/> with <node>text</node>

Unfortunately, my result feels like a horrible hack:

import xml.dom.minidom
from   xml.dom.minidom import Node

def makenode(text):
    n = xml.dom.minidom.parseString(text)
    return n.childNodes[0]

def setText(node, newText):
    if node.firstChild is None:
        xml_text = node.toxml()   # serialized as '<name/>'; avoid shadowing the built-in str
        n = len(xml_text)
        xml_text = xml_text[0:n-2]+'>'+newText+'</'+node.nodeName+'>'   #DISGUSTINGHACK!
        node.parentNode.replaceChild(makenode(xml_text), node)
        return
    if node.firstChild.nodeType != node.TEXT_NODE:
        raise Exception("setText: node "+node.toxml()+" does not contain text")
    node.firstChild.replaceWholeText(newText)

def test():
    olddoc = '<test><test2/></test>'
    doc=xml.dom.minidom.parseString(olddoc)
    node = doc.firstChild.firstChild  # <test2/>
    print("before:", olddoc)
    setText(node,"textinsidetest2")
    newdoc =  doc.firstChild.toxml()
    print("after: ", newdoc)


 #  desired result:
 # newdoc='<test><test2>textinsidetest2</test2></test>'

test()

While the above code works, I feel it's a colossal hack. I've been poring over the xml.dom.minidom documentation, and I can't see how else to handle the empty-element case without the hack marked #DISGUSTINGHACK! above.

Warren P
  • Do you *have* to use minidom? The [ElementTree API](http://docs.python.org/2/library/xml.etree.elementtree.html) is recommended over using the (verbose and cumbersome) DOM API. – Martijn Pieters Nov 28 '12 at 20:41
  • Well, I keep hearing that the minidom api is verbose and cumbersome, and it seems I'm not alone in pretty much hating it, but it's also a W3C standard, and I'm trying to figure out how to do it this way. I may drop minidom forever, but as a learning exercise, and as the documentation for minidom sucks, an SO question isn't so bad. It's unfair (I think) to leave minidom in and not deprecate it in Python and yet have so little documentation on the internet on how to use it. I feel that the above question ought to be so easy to answer it must just be that I'm not understanding `minidom` yet. – Warren P Nov 28 '12 at 20:53
  • The minidom is mainly there to support people *already* familiar with the W3C DOM API; the [minidom documentation](http://docs.python.org/2/library/xml.dom.minidom.html) directs everyone else to the ElementTree API instead. I once was part of a team that wrote a full DOM level 2 implementation in Python, and I had a jolly old time figuring out all the edge-cases that the standard fails to cover correctly. – Martijn Pieters Nov 28 '12 at 21:14
  • That's great. I missed that provision in the docs. Maybe it should be red, and blinky? (Just kidding.) – Warren P Nov 29 '12 at 15:01

1 Answer


You'll need to create a Text node, using Document.createTextNode(), then add it to the desired parent node using Node.appendChild() or similar method:

def setText(doc, node, newText):
    textnode = doc.createTextNode(newText)
    node.appendChild(textnode)

I've added a doc argument here for ease of use; call it with:

setText(doc, node, "textinsidetest2")

Your makenode function can be dropped altogether. With these modifications, your test() function prints:

before: <test><test2/></test>
after:  <test><test2>textinsidetest2</test2></test>
Martijn Pieters
  • Now that's much better. I feel better knowing that even if it's not the most intuitive API, the most simple set of basic operations for it are now documented on StackOverflow. – Warren P Nov 29 '12 at 15:00
  • @WarrenP: Note that I linked to the Python.org documentation for the methods; the whole DOM API supported is documented there. – Martijn Pieters Nov 29 '12 at 15:02
  • I guess what I mean by documented is that the set of basic operations that people would do using this API requires some mental gymnastics (thanks W3C!) that many people found difficult. The docs tell you what the methods in the API do. The part docs often don't do well is tell you how to do X with the methods available. – Warren P Nov 29 '12 at 15:09