
Since getting the text content of an XML element requires about 15 lines of code (see the official Oracle tutorial at http://java.sun.com/webservices/reference/tutorials/jaxp/html/dom.html), the tutorial itself suggests using third-party tools for many needs:

"As you can see, when you are using DOM, even a simple operation such as getting the text from a node can take a bit of programming. So if your programs handle simple data structures, then JDOM, dom4j, or even the 1.4 regular-expression package (java.util.regex) may be more appropriate for your needs"

I've tried the suggested tools, and they are reasonably easy to use and complete, BUT they require you to assess the "vitality" of their development, and that assessment is not always obvious.
So my questions are:

1) Is there a library that eases XML work and is built on top of the standard DOM? It would combine the robustness and up-to-dateness of the official library with better usability.
2) Is this lack of usability about to be addressed in some other way (perhaps a new JSR?) in Oracle's plans?

AgostinoX

3 Answers


If getting text content from XML elements is what you want, use XPath:

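    import java.io.StringReader;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;
    import org.xml.sax.InputSource;

    String xml = "<root><p>This is some text</p><p>And this is more text</p></root>";
    XPath xpath = XPathFactory.newInstance().newXPath();
    String text = xpath.evaluate("/root/p[1]", new InputSource(
            new StringReader(xml)));
    System.out.println(text);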
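
For completeness, here is a self-contained sketch of the same idea wrapped in a class. The names `XPathTextDemo`, `firstText`, and `allTexts` are made up for illustration, and wrapping the checked exception is just one possible choice. It also shows how one might collect the text of every matching node via `XPathConstants.NODESET`, which the snippet above does not do:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class XPathTextDemo {

    // Returns the string value of the first node matched by the expression.
    static String firstText(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed: " + expression, e);
        }
    }

    // Returns the text content of every node matched by the expression.
    static List<String> allTexts(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // A fresh InputSource is needed per evaluation: the reader is consumed once.
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><p>This is some text</p><p>And this is more text</p></root>";
        System.out.println(firstText(xml, "/root/p[1]"));
        System.out.println(allTexts(xml, "/root/p"));
    }
}
```

Each call here re-parses the XML string; if you query the same document many times, parsing it once into a DOM and evaluating against that is likely the better trade-off.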

If you want to map Java objects to XML and back, use JAXB.

Both XPath (1.0) and JAXB are part of the JDK (JAXB was bundled from Java SE 6 until its removal in Java 11), or they can be replaced with later versions.

However, if you try to parse XML with regular expressions, you are doomed!

Community
forty-two
  • +1 for XPath and the javax.xml.xpath APIs. A JAXB implementation is included with Java SE 6. Since JAXB is a specification (JSR-222) there are also other implementations available: Metro, EclipseLink MOXy, Apache JaxMe, etc. – bdoughan Sep 20 '11 at 13:15
  • @Blaise: we already demystified this in other questions (http://stackoverflow.com/questions/7430843/jaxb-and-class-instantiation): when you don't need binding, a binding technology forces you into a convoluted approach. In the binding arena there are good products (I'm thinking of MOXy) that are strengthened by having the JAXB standard at their core. This doesn't happen in the DOM arena, where the only standard is the W3C one, which is not productivity-oriented. In recent years there have been big steps in binding products; I was wondering whether the technologies under the JAXP umbrella were ripe for a "rethinking" too. – AgostinoX Sep 20 '11 at 13:28
  • @AgostinoX - There is definitely a role for DOM to play in Java today. In your use case you want to select text from an element. You can use the DOM APIs to do this, but the XPath (javax.xml.xpath) APIs are a better choice. These APIs can take DOM as input, but can also act on XML representations provided by SAX, StAX, and JAXB. – bdoughan Sep 20 '11 at 18:56
  • @forty-two: starting from your answer, I began using the Java XPath API (before that I used dom4j's integrated XPath support). I would like to read something official about the XPath API, but incredibly the standard Java (ex-Sun) tutorials don't cover the Java XPath API, only XPath itself! I posted a question: http://stackoverflow.com/questions/7512257/java-xpath-api-reference – AgostinoX Sep 22 '11 at 09:25
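
The point made in the comments above, that the javax.xml.xpath APIs can take a DOM as input, can be sketched roughly as follows. This is only an illustration: `DomTextDemo` and its method names are hypothetical, and the plain-DOM variant assumes `Node.getTextContent()` from DOM Level 3, which the JDK has shipped since Java 5:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class DomTextDemo {

    // Parses the XML once into a standard org.w3c.dom.Document.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Plain DOM: getTextContent() returns the concatenated text of the element's subtree.
    // Assumes an element with this tag name exists at the given index.
    static String textViaDom(Document doc, String tagName, int index) {
        return doc.getElementsByTagName(tagName).item(index).getTextContent();
    }

    // XPath evaluated against the already-parsed Document instead of raw XML input.
    static String textViaXPath(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root><p>This is some text</p><p>And this is more text</p></root>");
        System.out.println(textViaDom(doc, "p", 1));
        System.out.println(textViaXPath(doc, "/root/p[2]"));
    }
}
```

Both calls read the second `p` element of the same parsed document, so the two approaches can be mixed freely once the DOM exists.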

XStream is good and easy to use.

You can do almost everything with annotations.

http://x-stream.github.io/

facundofarias
pablosaraiva
  • It looks like XmlBeans but inside out. Or can it generate classes from an XSD too? – Andrei LED Sep 20 '11 at 11:26
  • It cannot. In fact, it recommends XmlBeans for the job. – pablosaraiva Sep 20 '11 at 11:40
  • I get the suggestion and I'll take a look at XStream. However, let me rephrase the question: can we count on reasonable longevity and timely updates for third-party tools? And are there attempts to make the standard way easier to use, since it would likely be maintained, updated promptly, and bug-fixed by Oracle? – AgostinoX Sep 20 '11 at 12:10
  • Nobody can guarantee updates for third-party tools, but I'm using XStream for a large project and it works fine. The last update was August 11, 2011, so it is being updated. – pablosaraiva Sep 20 '11 at 13:12
  • Check out my comparison of JAXB and XStream (http://blog.bdoughan.com/2010/10/how-does-jaxb-compare-to-xstream.html). Note I lead a JAXB implementation (EclipseLink MOXy), but I feel the comparison is fair. XStream did release 1.4/1.4.1 in August 2011, the previous release 1.3.1 was in December 2008: http://xstream.codehaus.org/changes.html. – bdoughan Sep 20 '11 at 13:25

jOOX may be precisely the library you're looking for:

  • It wraps standard Java DOM
  • It is quite lightweight
  • It is inspired by jQuery, which is a proven means of simplifying DOM manipulations

An example of getting the text content of an Element:

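    // Requires: import static org.joox.JOOX.$;
    // "document" is an org.w3c.dom.Document that has already been parsed.
    $(document).find("element").text();
    $(document).xpath("//element[3]").text();
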
Lukas Eder