Convert String XML fragment to Document Node in Java

Question

In Java, how can you convert a String that represents a fragment of XML into a node that can be inserted into an XML document?

e.g.

String newNode =  "<node>value</node>"; // Convert this to XML

Then insert this node into an org.w3c.dom.Document as the child of a given node?

Does this answer your question? [How can I parse a HTML string in Java?](https://stackoverflow.com/questions/1497946/how-can-i-parse-a-html-string-in-java) — Suma, Apr 29 '21 at 06:50

score 67 · Answer 1 · edited Aug 17 '11 at 13:43

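// Requires java.nio.charset.StandardCharsets; an explicit charset avoids
// depending on the platform default encoding.
Element node = DocumentBuilderFactory
    .newInstance()
    .newDocumentBuilder()
    .parse(new ByteArrayInputStream("<node>value</node>".getBytes(StandardCharsets.UTF_8)))
    .getDocumentElement();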

edited Aug 17 '11 at 13:43

user882611

answered Apr 08 '09 at 12:02

izb

3

the .parse( new StringInputStream(.... should read .parse( new ByteArrayInputStream( new String( "xml" ).getBytes() ) ); – Steen Nov 19 '09 at 13:45
5

I just hate these commentboxes and their lack of markup ( or markdown, for that matter) – Steen Nov 19 '09 at 13:46
4

but this does not copy the children... for example if you do this in case of "blah blah It only gets without its children – grobartn Jun 07 '12 at 20:49
1

This didn't work for me because it didn't copy children as noted by grobartn. @McDowell's solution did work. – Upgradingdave Nov 09 '12 at 14:53
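
As the comments above point out, the element returned by the parser still belongs to the temporary document it was parsed into, so it has to be moved or copied into the target document before it can be appended. Here is a minimal sketch of the "adopt" route using only the JDK DOM API. The class name AdoptFragmentSketch and the method adoptInto are made up for illustration, and the sketch assumes parent is not itself a Document (getOwnerDocument() would return null in that case).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AdoptFragmentSketch {

    /** Parses the fragment and moves its root element, with all descendants, under parent. */
    public static Node adoptInto(Node parent, String fragment) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The parsed element is owned by a temporary document, not by parent's document.
        Element parsed = builder.parse(new InputSource(new StringReader(fragment)))
                .getDocumentElement();
        // Assumption: parent is an element (or similar), so it has an owner document.
        Document target = parent.getOwnerDocument();
        // adoptNode moves the whole subtree into the target document.
        Node adopted = target.adoptNode(parsed);
        if (adopted == null) {
            // adoptNode may fail across DOM implementations; fall back to a deep copy.
            adopted = target.importNode(parsed, true);
        }
        return parent.appendChild(adopted);
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);
        Node added = adoptInto(root, "<node><child>value</child></node>");
        // The nested <child> element travels along with <node>.
        System.out.println(added.getFirstChild().getTextContent());
    }
}
```

adoptNode moves the subtree rather than copying it, which is fine here because the temporary document is thrown away anyway. importNode with deep set to true gives the same result at the cost of a copy.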

McDowell · Answer 2 · 2011-04-03T22:04:04.820

33

You can use the document's import (or adopt) method to add XML fragments:

  /**
   * @param docBuilder
   *          the parser
   * @param parent
   *          node to add fragment to
   * @param fragment
   *          a well formed XML fragment
   */
  public static void appendXmlFragment(
      DocumentBuilder docBuilder, Node parent,
      String fragment) throws IOException, SAXException {
    Document doc = parent.getOwnerDocument();
    Node fragmentNode = docBuilder.parse(
        new InputSource(new StringReader(fragment)))
        .getDocumentElement();
    fragmentNode = doc.importNode(fragmentNode, true);
    parent.appendChild(fragmentNode);
  }
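
For anyone who wants to see the method in action, here is a small self-contained sketch that calls it on a test document and prints the result. The class name AppendFragmentDemo and the helpers toXml and demo are made up for illustration; appendXmlFragment follows the same logic as the method above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class AppendFragmentDemo {

    // Same logic as appendXmlFragment in the answer above.
    public static void appendXmlFragment(DocumentBuilder docBuilder, Node parent, String fragment)
            throws IOException, SAXException {
        Document doc = parent.getOwnerDocument();
        Node fragmentNode = docBuilder.parse(new InputSource(new StringReader(fragment)))
                .getDocumentElement();
        // deep = true copies the element together with all of its descendants
        parent.appendChild(doc.importNode(fragmentNode, true));
    }

    /** Hypothetical helper: serializes a node without the XML declaration. */
    public static String toXml(Node node) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    /** Parses a small test document, appends the fragment under <given>, returns the XML. */
    public static String demo(String fragment) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new InputSource(new StringReader("<root><given></given></root>")));
        Node given = doc.getDocumentElement().getFirstChild();
        appendXmlFragment(builder, given, fragment);
        return toXml(doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo("<node><child>value</child></node>"));
    }
}
```

Because importNode is called with deep set to true, nested children of the fragment end up in the target document as well.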

edited Apr 03 '11 at 22:04

answered Apr 08 '09 at 12:05

McDowell

7

Hmm. If this is the simplest solution, I must say it's rather complicated for such a small problem. – Jonik Apr 08 '09 at 12:10
I've pared it down to the minimum - it still uses what you get in the JRE API, though, so a bit of verbosity is unavoidable. – McDowell Jun 10 '09 at 11:28
3

That is exactly what I was looking for. I didn't realize that I had to import the fragment into the dom before appending it to the parent node! – Tony Eichelberger Jul 24 '09 at 15:43
If verbosity you not want, you must not use Java, Luke. Thanks for the answer, no chance for anyone to figure that out. – Akku Oct 09 '12 at 12:41
Although the selected answer is correct given what the user asked, this answer is 'more' correct. – chessofnerd Jul 25 '13 at 18:49
Cheers @McDowell! Your solution is still saving the day, even in 2019... – dbaltor May 23 '19 at 15:38

score 15 · Answer 3 · edited May 23 '17 at 10:31

For what it's worth, here's a solution I came up with using the dom4j library. (I did check that it works.)

Read the XML fragment into an org.dom4j.Document (note: all the XML classes used below are from org.dom4j; see Appendix):

  String newNode = "<node>value</node>"; // Convert this to XML
  SAXReader reader = new SAXReader();
  Document newNodeDocument = reader.read(new StringReader(newNode));

Then get the Document into which the new node is inserted, and the parent Element (to be) from it. (Your org.w3c.dom.Document would need to be converted to org.dom4j.Document here.) For testing purposes, I created one like this:

    Document originalDoc = 
      new SAXReader().read(new StringReader("<root><given></given></root>"));
    Element givenNode = originalDoc.getRootElement().element("given");

Adding the new child element is very simple:

    givenNode.add(newNodeDocument.getRootElement());

Done. Outputting originalDoc now yields:

<?xml version="1.0" encoding="utf-8"?>

<root>
    <given>
        <node>value</node>
    </given>
</root>

Appendix: Because your question talks about org.w3c.dom.Document, here's how to convert between that and org.dom4j.Document.

// dom4j -> w3c
DOMWriter writer = new DOMWriter();
org.w3c.dom.Document w3cDoc = writer.write(dom4jDoc);

// w3c -> dom4j
DOMReader reader = new DOMReader();
Document dom4jDoc = reader.read(w3cDoc);

(If you need both kinds of Documents regularly, it might make sense to put these in neat utility methods, maybe in a class called XMLUtils or something like that.)

Maybe there are better ways to do this, even without any 3rd party libraries. But out of the solutions presented so far, in my view this is the easiest way, even if you need to do the dom4j <-> w3c conversions.

Update (2011): before adding a dom4j dependency to your code, note that it is not an actively maintained project and has some other problems too. An improved version 2.0 has been in the works for ages, but only an alpha version is available. You may want to consider an alternative, like XOM, instead; read more in the question linked above.

If dom4j is a NO-GO, try this solution: http://stackoverflow.com/a/7607435/363573 — Stephan, Nov 15 '16 at 17:30
Latest dom4j release is April 2020 at time of writing. The project is still getting development. — Lucas Holt, Apr 08 '22 at 20:48

score 6 · Answer 4 · edited Apr 11 '16 at 10:20

/**
*
* Convert a string to a Document Object
*
* @param xml The xml to convert
* @return A document Object
* @throws IOException
* @throws SAXException
* @throws ParserConfigurationException
*/
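public static Document string2Document(String xml) throws IOException, SAXException, ParserConfigurationException {

    if (xml == null) {
        return null;
    }

    // Explicit charset (java.nio.charset.StandardCharsets) instead of the platform default
    return inputStream2Document(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

}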


/**
* Convert an inputStream to a Document Object
* @param inputStream The inputstream to convert
* @return a Document Object
* @throws IOException
* @throws SAXException
* @throws ParserConfigurationException
*/
public static Document inputStream2Document(InputStream inputStream) throws IOException, SAXException, ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    return factory.newDocumentBuilder().parse(inputStream);
}
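
To illustrate what setNamespaceAware(true) changes, here is a minimal sketch that parses the same string with and without namespace awareness. It reads from a StringReader, which sidesteps the byte-encoding question entirely. The class name NamespaceAwareParseSketch, its parse method, and the urn:example namespace are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NamespaceAwareParseSketch {

    /** Parses XML held in a String; a Reader avoids any String-to-bytes conversion. */
    public static Document parse(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // urn:example is a made-up namespace for this sketch.
        String xml = "<a:node xmlns:a=\"urn:example\">value</a:node>";

        // Namespace-aware: prefix and local name are resolved separately.
        Element aware = parse(xml, true).getDocumentElement();
        System.out.println(aware.getNamespaceURI() + " / " + aware.getLocalName());

        // Not namespace-aware: "a:node" is treated as one opaque name.
        Element unaware = parse(xml, false).getDocumentElement();
        System.out.println(unaware.getNodeName());
    }
}
```

Namespace awareness matters as soon as the fragment is queried with getElementsByTagNameNS or namespace-aware XPath after insertion.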

score 6 · Answer 5 · edited May 23 '17 at 12:34

Here's yet another solution, using the XOM library, that competes with my dom4j answer. (This is part of my quest to find a good dom4j replacement where XOM was suggested as one option.)

First read the XML fragment into a nu.xom.Document:

String newNode = "<node>value</node>"; // Convert this to XML
Document newNodeDocument = new Builder().build(newNode, "");

Then, get the Document and the Node under which the fragment is added. Again, for testing purposes I'll create the Document from a string:

Document originalDoc = new Builder().build("<root><given></given></root>", "");
Element givenNode = originalDoc.getRootElement().getFirstChildElement("given");

Now, adding the child node is simple, and similar to dom4j (except that XOM doesn't let you add the original root element, which already belongs to newNodeDocument):

givenNode.appendChild(newNodeDocument.getRootElement().copy());

Outputting the document yields the correct result XML (and is remarkably easy with XOM: just print the string returned by originalDoc.toXML()):

<?xml version="1.0"?>
<root><given><node>value</node></given></root>

(If you wanted to format the XML nicely (with indentations and linefeeds), use a Serializer; thanks to Peter Štibraný for pointing this out.)

So, admittedly this isn't very different from the dom4j solution. :) However, XOM may be a little nicer to work with, because the API is better documented, and because of its design philosophy that there's one canonical way for doing each thing.

Appendix: Again, here's how to convert between org.w3c.dom.Document and nu.xom.Document. Use the helper methods in XOM's DOMConverter class:

// w3c -> xom
Document xomDoc = DOMConverter.convert(w3cDoc);

// xom -> w3c
org.w3c.dom.Document w3cDoc = DOMConverter.convert(xomDoc, domImplementation);  
// You can get a DOMImplementation instance e.g. from DOMImplementationRegistry

Note that instead of new Builder().build(new StringReader("")); you can also use new Builder().build("", "test.xml"); (where "test.xml" is some random base URI) — Peter Štibraný, Jun 06 '09 at 21:46
"If you wanted to format the XML nicely (with indentations and linefeeds), I'm not sure how to do that with XOM." -- using Serializer class. Configure it using setIndent and setMaxLength, and call write(document). — Peter Štibraný, Jun 06 '09 at 21:50
Thanks! I didn't really understand what exactly is the meaning of the baseURI param; passing an empty string also works, so I'm using that. In any case, that does simplify the code somewhat. For formatting, Serializer indeed works fine. — Jonik, Jun 07 '09 at 00:47
I think baseURI would be used for resolving relative references to DTD or XInclude (http://lists.ibiblio.org/pipermail/xom-interest/2004-November/001498.html) — Peter Štibraný, Jun 07 '09 at 06:08

score 5 · Answer 6 · edited Jul 21 '15 at 15:18

If you're using dom4j, you can just do:

Document document = DocumentHelper.parseText(text);

(dom4j now found here: https://github.com/dom4j/dom4j)

edited Jul 21 '15 at 15:18

james.garriss

answered Jun 23 '09 at 11:34

ronz

Just went to their web site. They place Google Ads right into the typical Maven-generated navigation bar! Incredible! – Thilo Oct 12 '10 at 07:57
2

Apparently, the site is no longer operated by the dom4j guys, but some domain grabbers took over... – Thilo Oct 12 '10 at 07:58

score 1 · Answer 7 · answered Apr 04 '14 at 07:48

Try jcabi-xml, with a one liner:

Node node = new XMLDocument("<node>value</node>").node();

answered Apr 04 '14 at 07:48

yegor256

jcabi-xml build error ```Unresolved references to [com.jcabi.xml] by class(es) on the Bundle-Classpath[Jar:dot]``` – Ikenna Anthony Okafor May 02 '19 at 10:04

score 1 · Answer 8 · answered Sep 21 '10 at 11:33

...and if you're using purely XOM, something like this:

    // xml holds the fragment text; it may contain several top-level elements
    String wrapped = "<fakeRoot>" + xml + "</fakeRoot>";
    Document doc = new Builder( false ).build( wrapped, null );
    Nodes children = doc.getRootElement().removeChildren();
    for( int ix = 0; ix < children.size(); ix++ ) {
        otherDocumentElement.appendChild( children.get( ix ) );
    }

XOM uses fakeRoot internally to do pretty much the same, so it should be safe, if not exactly elegant.
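
The same wrapper trick can be sketched with the JDK DOM API alone, for fragments that have more than one top-level element. This is only an illustration under stated assumptions: the class name MultiRootFragmentSketch and the method appendAll are made up, the fragment must not start with an XML declaration, and parent must not itself be a Document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class MultiRootFragmentSketch {

    /** Appends every top-level node of a (possibly multi-root) fragment to parent. */
    public static void appendAll(Node parent, String fragment) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Wrap the fragment so the parser sees a single root element.
        String wrapped = "<fakeRoot>" + fragment + "</fakeRoot>";
        Node fakeRoot = builder.parse(new InputSource(new StringReader(wrapped)))
                .getDocumentElement();

        Document target = parent.getOwnerDocument();
        DocumentFragment holder = target.createDocumentFragment();
        // importNode copies, so walking the source siblings stays safe.
        for (Node child = fakeRoot.getFirstChild(); child != null; child = child.getNextSibling()) {
            holder.appendChild(target.importNode(child, true));
        }
        // Appending a DocumentFragment inserts its children, not the fragment itself.
        parent.appendChild(holder);
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);
        appendAll(root, "<a>1</a><b>2</b>");
        System.out.println(root.getChildNodes().getLength());
    }
}
```

The wrapper element never reaches the target document; only its children are imported.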
