4

I am parsing some XML in a Java application and am looking at the variety of XML parsing options. I don't intend to do anything more than traverse the structure and extract values from it. I need to use an API that is built into the Java API (1.5+) without any additional plugins. I don't need to create "events" or transform the XML into anything else. I am not producing XML, merely reading and extracting data. I am not enforcing a schema either.

Sun provide a list here, but it's not really obvious what I should use.

http://java.sun.com/developer/technicalArticles/xml/JavaTechandXML/

What would be the most appropriate XML API to use in this case? JAXP? JDOM? XPath?

angryITguy
  • 9,332
  • 8
  • 54
  • 82
  • Jaxp, JDOM and XPath have not been updated for a while. I think for ease of use, vtd-xml may be worth looking into. – vtd-xml-author Mar 01 '11 at 21:43
  • @vtd-xml-author - The JDK patch releases often contain updates to the JAXP implementations included. Don't confuse spec updates with implementation updates. – bdoughan Mar 01 '11 at 21:53
  • @All Thanks for all the rapid responses. I am getting it down to DOM or XPath. Sounds like DOM. Would XPath be any easier ? – angryITguy Mar 01 '11 at 22:36
  • 1
    Check out the following question. It has two answers one demonstrates DOM the other XPath: http://stackoverflow.com/questions/3717215/remove-xml-node-using-java-parser – bdoughan Mar 02 '11 at 00:19

9 Answers

5

I think using a DOM parser to parse the XML and load it into memory in a Document sounds sufficient for your needs.

You wouldn't use XPath in that case, just the Document API.
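As a rough illustration of that approach, here is a minimal sketch using only the JDK's DOM classes. The `<library>`/`<book>`/`<title>` document and the class name `DomTitles` are made up for the example; substitute your own element names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; the XML layout is an assumption for illustration.
class DomTitles {

    // Loads the whole document into memory, then collects the text of every <title> element.
    static List<String> titles(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList nodes = doc.getElementsByTagName("title");
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element title = (Element) nodes.item(i);
            result.add(title.getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book id=\"1\"><title>Dune</title></book>"
                   + "<book id=\"2\"><title>Emma</title></book></library>";
        System.out.println(titles(xml));
    }
}
```

To read from a file instead of a string, `builder.parse(new java.io.File(path))` should work the same way.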

JAXP is just a synonym for the XML parsing technology built into the JDK. The term JAXP (P is for Parsing) distinguishes it from JAXB (B is for binding).

Some 3rd party libraries built on top of DOM might make your life easier. Think about JDOM or DOM4J.

duffymo
  • 305,152
  • 44
  • 369
  • 561
  • DOM sounds like what's required. I need to keep it standard within the JDK. No 3rd party solutions/plugins/addons unfortunately. I need to extract the data from the XML, so would XPath be overkill for this purpose? – angryITguy Mar 01 '11 at 22:34
  • If you parse XML into a DOM tree, you should use the API provided for you to access data. – duffymo Mar 02 '11 at 00:32
  • Thanks... will give DOM a go. – angryITguy Mar 02 '11 at 00:37
  • A belated comment on this: the statement that "JAXP is just a synonym for the XML parsing technology build into the JDK" is absolutely not true. JAXP is an interface (API) supported by multiple XML parsers, and the parser built into the JDK is just one implementation of it. Moreover, the scope of JAXP extends beyond parsing - it also covers validation and transformation. – Michael Kay Mar 03 '20 at 13:47
  • Probably true nine years after the question was first answered, but at the time it might not have been. I hope this late comment helps somebody. – duffymo Mar 03 '20 at 13:51
2

The most classical way of doing things, IMO, would be a combination of JAXP and XPath. Java 5.0 includes JAXP 1.3, and this is standard stuff. Please see this answer to a similar question for a minimalist coding sample.
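For readers without access to the linked answer, a minimal sketch of the JAXP plus XPath combination might look like the following. It uses only `javax.xml.xpath` from the JDK; the sample document, the expression, and the class name `XPathValue` are assumptions made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example class; the XML layout is an assumption for illustration.
class XPathValue {

    // Evaluates an XPath expression against the XML and returns the result as a string.
    static String value(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book id=\"1\"><title>Dune</title></book>"
                   + "<book id=\"2\"><title>Emma</title></book></library>";
        // Select the title of the book whose id attribute is 2.
        System.out.println(value(xml, "/library/book[@id='2']/title"));
    }
}
```

Evaluating against an `InputSource` re-parses the input on every call. If you need several values from one document, it is probably better to parse it once into a DOM `Document` and evaluate each expression against that.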

Community
  • 1
  • 1
Alain Pannetier
  • 9,315
  • 3
  • 41
  • 46
1

Using the standard DOM Parser is good enough for your purpose. Try out this example.

user1931858
  • 10,518
  • 1
  • 17
  • 6
1

I think that the most practical tool to use is XStream, from ThoughtWorks. Some modern MVC frameworks like VRaptor use it to serve and consume XML. Take a look at: http://x-stream.github.io/

facundofarias
  • 2,973
  • 28
  • 27
Luciano Costa
  • 3,492
  • 2
  • 16
  • 8
  • Checkout: http://bdoughan.blogspot.com/2010/10/how-does-jaxb-compare-to-xstream.html – bdoughan Mar 01 '11 at 21:50
  • 2
    this has absolutely nothing to do with the question. the question is about xml parsing. your answer refers to xml serialisation... – Chris Mar 01 '11 at 21:56
  • 1
    "I am not intending to do anything more than traverse the structure and extract values out of it." Indeed, my bad! – Luciano Costa Mar 02 '11 at 15:29
1

I think a DOM parser is what you are looking for. It is easy to use, and it lets you search for nodes quickly.

masay
  • 923
  • 2
  • 17
  • 34
1

For the parsing strategy, you can use DOM, which has the advantage that the whole document is kept in memory and you can query it via XPath. I recommend this if you have small XML documents, or if you really NEED all the data to be present and accessible all the time, because it consumes a lot of heap space.

If you have bigger documents, or if you don't need access to all the data all the time, you should use either SAX or StAX (XML pull parsing), if the latter is available in your Java distribution. These methods are event based: they traverse the XML from start to end without building a tree. With SAX, the parser makes callbacks to a class defined by you, so you can react to events like "element xy starts" and "element xy ends". With StAX, your code asks the parser for the next event instead.
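A minimal sketch of the SAX callback style, using only JDK classes, could look like this. The `<title>` element and the class name `SaxTitles` are assumptions made up for illustration; the handler reacts to "element starts", character data, and "element ends" events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; the XML layout is an assumption for illustration.
class SaxTitles {

    // Streams through the XML and collects the text of every <title> element
    // without building a tree in memory.
    static List<String> titles(String xml) throws Exception {
        final List<String> result = new ArrayList<String>();

        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <title> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text node in several chunks, so append.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName) && text != null) {
                    result.add(text.toString().trim());
                    text = null;
                }
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book><title>Dune</title></book>"
                   + "<book><title>Emma</title></book></library>";
        System.out.println(titles(xml));
    }
}
```

The state tracking (the `text` field) is the price of streaming: the handler has to remember where it is in the document, which is why DOM tends to be simpler for small inputs.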

Chris
  • 7,675
  • 8
  • 51
  • 101
  • I think DOM Parser is the solution for the project. I found this example that demonstrates the complexity. http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/ – angryITguy Mar 01 '11 at 22:32
0

XOM.

Use xpath.

Stefan Kendall
  • 66,414
  • 68
  • 253
  • 406
0

If it is very trivial, do it with a SAX parser.

Shamik
  • 6,938
  • 11
  • 55
  • 72
0

It seems that SAX is the API you want.

Google "SAX Parsing" and you will find many examples.

DwB
  • 37,124
  • 11
  • 56
  • 82