24

I have an XML Node that I want to add children to over time:

val root: Node = <model></model>

But I cannot see methods such as addChild(), as I would like to write something along the lines of:

def addToModel() = {
    root.addChild(<subsection>content</subsection>)
}

So after a single call to this method the root xml would be:

<model><subsection>content</subsection></model>

The only class I can see that has the ability to append a Node is the NodeBuffer. Am I missing something fundamental here?

BefittingTheorem
  • I see you haven't accepted an answer yet. You may want to take a step back and see if you even need to use the rule transformer or the zipper or even "adding a child to a node". It may depend on whether the target XML document substantially mirrors linear data that has some parent/child relationships in it, and on how large it is. For instance, you could maintain a buffer of objects and at the end do one pass to convert your buffer of objects into one XML document. Or you could create the `children` XML first and do `{children}` when the parent processing is done. – huynhjl Feb 05 '10 at 14:32
  • @huynhjl Thanks for the advice. I need to translate a spreadsheet into xml. So that rules out the Rule Transformer as it operates on mapping XML to XML. The spreadsheet has the tag name info but it's all linear, so I need to reinsert the nesting info. The way I would usually do this is by using a stack and adding to it when I know the current tag is a child and popping off when I know the current node is complete. So this is why I wanted to add children to an anonymous node. – BefittingTheorem Feb 06 '10 at 19:25

9 Answers

30

Well, start with this:

def addChild(n: Node, newChild: Node) = n match {
  case Elem(prefix, label, attribs, scope, child @ _*) =>
    Elem(prefix, label, attribs, scope, child ++ newChild : _*)
  case _ => error("Can only add children to elements!")
}

The method ++ works here because child is a Seq[Node], and newChild is a Node, which extends NodeSeq, which extends Seq[Node].

Now, this doesn't change anything in place, because XML in Scala is immutable. It produces a new node with the required changes. The only cost is that of creating a new Elem object, as well as creating a new Seq of children. The child nodes themselves are not copied, just referred to, which doesn't cause problems because they are immutable.
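To make the point about sharing concrete, here is a minimal sketch. It assumes Scala 2.10 or later with the scala-xml module on the classpath, and it uses `Elem.copy` rather than the `Elem(...)` constructor so it does not depend on the constructor's arity. The names `original` and `updated` are only for illustration.

```scala
import scala.xml.{Elem, Node}

// Same idea as addChild above, written with Elem.copy (assumes Scala 2.10+).
def addChild(n: Node, newChild: Node): Node = n match {
  case e: Elem => e.copy(child = e.child ++ newChild)
  case _       => sys.error("Can only add children to elements!")
}

val original = <model><a/></model>
val updated  = addChild(original, <b/>)

// The original element is untouched; only the new one has the extra child.
println(original)
println(updated)

// The pre-existing child is the same object in both trees, not a copy.
println(original.child.head eq updated.child.head)
```

Only the parent `Elem` and its child sequence are allocated again; the `<a/>` node is referenced from both trees.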

However, if you are adding children to a node way down on the XML hierarchy, things get complicated. One way would be to use zippers, such as described in this blog.

You can, however, use scala.xml.transform, with a rule that will change a specific node to add the new child. First, write a new transformer class:

class AddChildrenTo(label: String, newChild: Node) extends RewriteRule {
  override def transform(n: Node) = n match {
    case n @ Elem(_, `label`, _, _, _*) => addChild(n, newChild)
    case other => other
  }
}

Then, use it like this:

val newXML = new RuleTransformer(new AddChildrenTo(parentName, newChild)).transform(oldXML).head

On Scala 2.7, replace head with first.

Example on Scala 2.7:

scala> val oldXML = <root><parent/></root>
oldXML: scala.xml.Elem = <root><parent></parent></root>

scala> val parentName = "parent"
parentName: java.lang.String = parent

scala> val newChild = <child/>
newChild: scala.xml.Elem = <child></child>

scala>     val newXML = new RuleTransformer(new AddChildrenTo(parentName, newChild)).transform(oldXML).first
newXML: scala.xml.Node = <root><parent><child></child></parent></root>

You could make it more complex to get the right element, if just the parent isn't enough. However, if you need to add the child to a parent with a common name of a specific index, then you probably need to go the way of zippers.

For instance, if you have <books><book/><book/></books>, and you want to add <author/> to the second, that would be difficult to do with rule transformer. You'd need a RewriteRule against books, which would then get its child (which really should have been named children), find the nth book in them, add the new child to that, and then recompose the children and build the new node. Doable, but zippers might be easier if you have to do that too much.
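The steps just described could be sketched as follows. This is only an illustration, assuming Scala 2.10+ with scala-xml; the class name `AddChildToNth` and its parameters are hypothetical, and the index is 0-based.

```scala
import scala.xml.{Elem, Node}
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Hypothetical rule: add `newChild` to the index-th (0-based) <childLabel>
// element found directly under every <parentLabel> element.
class AddChildToNth(parentLabel: String, childLabel: String, index: Int, newChild: Node)
    extends RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case parent: Elem if parent.label == parentLabel =>
      var seen = -1 // counts matching children seen so far within this parent
      val rebuilt = parent.child.map {
        case e: Elem if e.label == childLabel =>
          seen += 1
          if (seen == index) e.copy(child = e.child ++ newChild) else e
        case other => other
      }
      // Recompose the parent around the rebuilt children.
      parent.copy(child = rebuilt)
    case other => other
  }
}

val books   = <books><book/><book/></books>
val rule    = new AddChildToNth("books", "book", 1, <author/>)
val updated = new RuleTransformer(rule).transform(books).head
```

The rule matches on the parent rather than on the book itself, because a rule only sees one node at a time and cannot know a node's position among its siblings.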

Daniel C. Sobral
  • Thanks Daniel, a very complete answer (as per usual). The Elem pattern matching I understand. But I'll have to figure out the RuleTransformer example. – BefittingTheorem Feb 04 '10 at 15:47
  • 1
    @Brian `RuleTransformer` applies its list of `RewriteRule`s recursively on all nodes of an XML document. A `RewriteRule` receives a `Node` or `Seq[Node]`, and produces a modified version of it, or returns the original. I have often used it to answer XML questions on Scala, so you may search for me, Scala and XML to see further examples. – Daniel C. Sobral Feb 04 '10 at 16:52
  • `child ++ newChild` should work with `Node` being a `NodeSeq`. Should work in 2.7 and 2.8. – huynhjl Feb 05 '10 at 00:27
  • @huynhjl True. I'll revise the answer to make it simpler. – Daniel C. Sobral Feb 05 '10 at 10:49
  • That works like a treat :)) - if you were to save it to an XML file, is there any way to influence the way it gets indented, etc.? – Galder Zamarreño Nov 11 '11 at 16:48
  • @DanielC.Sobral can u explain what : _* does? I've tried without it in scala 2.9.1 and it does not compile due it expecting Node* but the ++ produces Seq[Node]. – Dzhu Jan 02 '12 at 23:12
  • 1
    @Dzhu Check the questions in this search: http://symbolhound.com/?q=_* (make sure you copy the "*", since the link being auto-generated doesn't include it). – Daniel C. Sobral Jan 03 '12 at 13:19
9

In Scala, XML nodes are immutable, but you can do this:

var root = <model/>

def addToModel(child:Node) = {
  root = root match {
    case <model>{children@ _*}</model> => <model>{children ++ child}</model>
    case other => other
  }
}

addToModel(<subsection>content</subsection>)

It builds a new XML element, reusing the children of the old one and adding your node as a child.

Edit: Brian provided more info and I figured out a different way to match.

To add a child to an arbitrary node in 2.8 you can do:

def add(n:Node,c:Node):Node = n match { case e:Elem => e.copy(child=e.child++c) }

That will return a new copy of parent node with the child added. Assuming you've stacked your children nodes as they became available:

scala> val stack = new Stack[Node]()
stack: scala.collection.mutable.Stack[scala.xml.Node] = Stack()

Once you've figured you're done with retrieving children, you can make a call on the parent to add all children in the stack like this:

stack.foldRight(<parent/>:Node){(c:Node,n:Node) => add(n,c)}

I have no idea about the performance implications of using Stack and foldRight, so depending on how many children you've stacked, you may have to tinker... You may also need to call stack.clear afterwards. Hopefully this takes care of the immutable nature of Node as well as your need to process as you go.
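Putting the pieces above together, a minimal end-to-end sketch might look like this. It assumes Scala 2.10+ with scala-xml; the fallback case in `add` and the sample elements `<first/>` and `<second/>` are assumptions added for illustration.

```scala
import scala.collection.mutable.Stack
import scala.xml.{Elem, Node}

def add(n: Node, c: Node): Node = n match {
  case e: Elem => e.copy(child = e.child ++ c)
  case other   => other // assumption: non-elements are returned unchanged
}

// Children are pushed as they become available.
val stack = Stack[Node]()
stack.push(<first/>)
stack.push(<second/>)

// foldRight starts from the bottom of the stack, so children end up
// in the order in which they were pushed.
val parent = stack.foldRight(<parent/>: Node)((c, n) => add(n, c))
stack.clear()
```

Each step of the fold allocates a new parent element, so for many children it may be cheaper to build the child sequence first and construct the parent once.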

huynhjl
  • I'm not sure `++` is applicable here, as you are adding a single node to a sequence of them. – Daniel C. Sobral Feb 04 '10 at 15:11
  • That works though! Must be the multiple personality disorder that Node has with also being a NodeSeq? – huynhjl Feb 05 '10 at 00:13
  • If this is used in a loop, all the temporary object instantiation will not be very performant and will use unnecessary memory. Think of immutable String concats in a for loop. If you're building a large document this way and doing this type of activity with lots of concurrent threads, this sounds like a fun production issue in the making :) – Andrew Norman Apr 05 '16 at 22:33
  • I think the only viable way here would be to work "with" (and not "around") the immutable nature of Elem. That would mean building the entire children node seq first and then constructing the parent element last once all the children have been constructed. – Andrew Norman Apr 05 '16 at 22:35
6

Since Scala 2.10.0 the constructor of Elem has changed. If you want to use the naive solution written by @Daniel C. Sobral, it should be:

xmlSrc match {
  case xml.Elem(prefix, label, attribs, scope, child @ _*) =>
       xml.Elem(prefix, label, attribs, scope, child.isEmpty, child ++ ballot : _*)
  case _ => throw new RuntimeException
}

It works very well for me.

Murdix
3

Since XML nodes are immutable, you have to create a new one each time you want to append a node. You can use pattern matching to add your new node:

    var root: Node = <model></model>
    def addToModel(newNode: Node) = root match {
       //match all the node from your model
       // and make a new one, appending old nodes and the new one
        case <model>{oldNodes@_*}</model> => root = <model>{oldNodes}{newNode}</model>
    }
    addToModel(<subsection>content</subsection>)
Patrick
  • This is way more verbose than I was expecting :) But I'll go along with it. I presume I can use the Elem class to perform a generic match so that i can add elements to any node. As the node that I want to add to is not always called "model"? – BefittingTheorem Feb 04 '10 at 15:02
2

In the usual Scala fashion, all Node, Elem, etc. instances are immutable. You can work it the other way around:

  scala> val child = <child>foo</child>
  child: scala.xml.Elem = <child>foo</child>

  scala> val root = <root>{child}</root>
  root: scala.xml.Elem = <root><child>foo</child></root>

See http://sites.google.com/site/burakemir/scalaxbook.docbk.html for more information.

Confusion
  • 1
    This is true, but, I do not know all of the children ahead of time. I need to add the children as information becomes available. To be precise I'm converting a linear spreadsheet to XML. I do not know where a tag ends, until I run out of children – BefittingTheorem Feb 04 '10 at 12:32
2

I agree that you have to work with XML "the other way around". Keep in mind you don't have to have the entire XML document available when information becomes available, you only need to compose the XML when the application needs to read it.

Keep your subsection state however you want to; when you need the XML, wrap it all together.

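  import scala.xml.{Elem, Node}

  // Collected over time; a single entry here for illustration
  val subsections: List[Elem] = List(<subsection>content</subsection>)

  // Accepts any sequence of nodes, so both a List[Elem] and a single Elem work
  def wrapInModel(f: => Seq[Node]) = {
    <model>{f}</model>
  }

  wrapInModel(subsections)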

or

  def wrapInModel(f : => Elem) = {
    <model>{f}</model>
  }
  wrapInModel(<subsection>content</subsection>)
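For the "add children as information becomes available" case from the question, one way to apply this idea is to accumulate children in a mutable buffer and build the element only on demand. This is a sketch under assumptions, not part of the answer's code: it assumes Scala 2.10+ with scala-xml, and the `ListBuffer` plus the `model` method are illustrative choices.

```scala
import scala.collection.mutable.ListBuffer
import scala.xml.{Elem, Node}

// Mutable state lives in the buffer, not in the XML tree.
val subsections = ListBuffer.empty[Node]

def addToModel(child: Node): Unit = { subsections += child }

// The immutable element is composed only when someone asks for it.
def model: Elem = <model>{subsections.toList}</model>

addToModel(<subsection>content</subsection>)
addToModel(<subsection>more</subsection>)
println(model)
```

The parent element is constructed once per read instead of once per added child.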
Travis Stevens
1

I implement my 'appendChild' method in the following way:

  def appendChild(elem: Node, child: Node, names: String) = {
    appendChild(elem, child, names.split("/"))
  }

  private def appendChild(elem: Node, child: Node, names: Array[String]) = {
    var seq = elem.child.diff(elem \ names.head)
    if (names.length == 1)
      for (re <- elem \ names.head)
        seq = seq ++ re.asInstanceOf[Elem].copy(child = re.child ++ child)
    else
      for (subElem <- elem \ names.head)
        seq = seq ++ appendChild(subElem, child, names.tail)
    elem.asInstanceOf[Elem].copy(child = seq)
  }

The method appends the child recursively along the given path. In the 'if' branch (the last path segment) it simply calls the 'copy' method of the Elem class to produce new instances of the affected children (there may be several). In the 'else' branch, recursive calls to 'appendChild' rebuild the intermediate elements on the way down. Before the 'if-else', seq is initialized with the unaffected children; the rebuilt ones are appended to it, and at the end the sequence is copied into the original element. Because rebuilt children are appended after the unaffected ones, sibling order can change, as the output below shows.

val baz = <a><z x="1"/><b><z x="2"/><c><z x="3"/></c><z x="4"/></b></a>
println("Before: \n" + XmlPrettyPrinter.format(baz.toString()))

val res = appendChild(baz, <y x="5"/>, "b/c/z")
println("After: \n" + XmlPrettyPrinter.format(res.toString()))

Results:

Before: 
<a>
  <z x="1"/>
  <b>
    <z x="2"/>
    <c>
      <z x="3"/>
    </c>
    <z x="4"/>
  </b>
</a>

After: 
<a>
  <z x="1"/>
  <b>
    <z x="2"/>
    <z x="4"/>
    <c>
      <z x="3">
        <y x="5"/>
      </z>
    </c>
  </b>
</a>
1

Your root definition is actually an Elem object, a subclass of Node, so if you drop the unnecessary Node type annotation (which hides the implementation), you can call copy on it and use ++ on its child sequence to append:

val root = <model/>
val myChild = <myChild/>
root.copy(child = root.child ++ myChild)

scala ev:

root: scala.xml.Elem = <model/>
myChild: scala.xml.Elem = <myChild/>
res2: scala.xml.Elem = <model><myChild/></model>

Since every Elem and every Node is a NodeSeq you can add these pretty effectively even if what you are appending is an unknown sequence:

val root = <model/>
//some node sequence of unknown subtype or structure
val children: scala.xml.NodeSeq = <node1><node2/></node1><node3/> 
root.copy(child = root.child ++ children)

scala ev:

root: scala.xml.Elem = <model/>
children: scala.xml.NodeSeq = NodeSeq(<node1><node2/></node1>, <node3/>)
res6: scala.xml.Elem = <model><node1><node2/></node1><node3/></model>
Andrew Norman
1

Scales Xml allows for simple in-place changes via folding over XPaths. Adding children to a particular sub-node fits right into this approach.

See In-Place Transformations for more details.

Chris