11

I am getting an XML file from one of our other applications.

I want to read that XML file node by node and store the node values in a database for further use.

So, what is the best way/API to read an XML file and retrieve node values using Java?

Steve McLeod
Romani

8 Answers

7

There are various tools for that. Today, I prefer two: Simple and JAXB.

Here is a good comparison between Simple and JAXB: http://blog.bdoughan.com/2010/10/how-does-jaxb-compare-to-simple.html

Personally, I like Simple a bit better because the support by Niall is excellent, but JAXB (as explained in the blog post above) can produce better output with less code.

StAX is a more basic API which allows you to read XML documents that simply don't fit into RAM (neither Simple nor JAXB allow you to read an XML document "object by object" - they will always try to load everything into RAM at once).
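To illustrate the streaming approach, here is a minimal StAX sketch using the cursor API in the JDK's `javax.xml.stream` package. The class name `StaxItemReader`, the method `readItemValues` and the `<item>` element name are made up for illustration. The sketch collects the values in a list; real code would more likely write each value to the database as soon as it is read, so that memory use stays flat.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: the names are illustrative and not part of any library.
class StaxItemReader {

    // Streams through the document and returns the text of every <item> element.
    static List<String> readItemValues(Reader source) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption: the input needs no DTDs or external entities, so both are disabled.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);

        List<String> values = new ArrayList<>();
        XMLStreamReader reader = factory.createXMLStreamReader(source);
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag; it only
                    // works for elements that contain text and no child elements.
                    values.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return values;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><item>One</item><item>Two</item><item>Three</item></root>";
        System.out.println(readItemValues(new StringReader(xml))); // prints [One, Two, Three]
    }
}
```

Only the current parser event is held in memory, which is why this style copes with documents that are too large for an object-mapping or DOM approach.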

Aaron Digulla
  • JAXB is simple to use as well: http://blog.bdoughan.com/2010/10/how-does-jaxb-compare-to-simple.html – bdoughan Aug 25 '11 at 09:44
  • JAXB doesn't so much "have" a streaming API as use one: StAX. It works just fine stand-alone. As a matter of fact, unless the asker's XML has some complex structure, StAX would probably do just fine for his work. – G_H Aug 25 '11 at 11:44
  • Simple XML uses StAX, which is just as performant as SAX, so it can handle large files very well. – ng. Aug 30 '11 at 03:46
  • Thanks, I've improved my answer. – Aaron Digulla Aug 30 '11 at 08:01
4

I would advise using a simple XML tool if you can manage with that.

For example, my colleagues and I have introduced complex XML frameworks that worked like a charm at first. Then you forget about the framework, you have special build files just for mapping XML to beans, you have annotated beans, and you create a new barrier for new developers on your project. You lose much of your freedom to refactor.

In the end you will be sorry that you used the complex framework to save some time at the beginning. More than once I have seen such frameworks thrown out during refactoring because everybody had a negative feeling about them, although they look great on paper.

So think twice about introducing complex XML frameworks if you seldom use them. If you and your team use them rather frequently, then they are the way to go.

4

I suggest using XPath. Xalan is already included in the JDK (no external jars needed), and XPath fits your requirement, i.e. iterating through element nodes (I presume) and storing their text values. For example:

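    import java.io.StringReader;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    public class XPathExample {
        public static void main(String[] args) throws Exception {
            String xml = "<root> <item>One</item> <item>Two</item> <item>Three</item> </root>";
            XPathFactory xpf = XPathFactory.newInstance();
            InputSource is = new InputSource(new StringReader(xml));
            // "/*/*" selects every element that is a direct child of the root element
            NodeList nodes = (NodeList) xpf.newXPath().evaluate("/*/*", is,
                    XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); ++i) {
                Element e = (Element) nodes.item(i);
                System.out.println(e.getNodeName() + " -> " + e.getTextContent());
            }
        }
    }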

This example selects all direct children of the root element and prints each element's name and text content. Adapt the XPath expression to fit your needs.
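As a rough illustration of adapting the expression, the sketch below runs a few different XPath 1.0 expressions against one parsed document. The class `XPathVariants`, the method `select` and the `<orders>` sample data are invented for this example; only the `javax.xml.xpath` and `javax.xml.parsers` calls come from the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: the names and sample data are illustrative only.
class XPathVariants {

    // Parses the XML and returns the text content of every node the expression matches.
    static List<String> select(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orders>"
                + "<order id=\"1\"><customer>Ann</customer><total>10</total></order>"
                + "<order id=\"2\"><customer>Bob</customer><total>25</total></order>"
                + "</orders>";
        // Every customer name, selected by an absolute path.
        System.out.println(select(xml, "/orders/order/customer"));
        // The total of the order whose id attribute is 2.
        System.out.println(select(xml, "/orders/order[@id='2']/total"));
        // Customers of all orders whose total is greater than 20.
        System.out.println(select(xml, "//order[total > 20]/customer"));
    }
}
```

For simplicity `select` re-parses the XML on every call; when several expressions are run against the same file, parsing once and reusing the `Document` is the cheaper choice.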

forty-two
2

Try Apache Xerces. It is mature and robust. Any comparable alternative will also do; just be sure not to roll your own implementation.
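A minimal sketch of node-by-node DOM reading follows. It goes through the standard JAXP API (`javax.xml.parsers`), whose default implementation in the JDK is derived from Xerces; this assumes the bundled parser is sufficient, so the separate Apache Xerces jar is not used here. The class `DomWalker` and its methods are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: the names are illustrative and not part of Xerces or the JDK.
class DomWalker {

    // Recursively visits every element and records "path = text"
    // for each element that has no child elements.
    static void collectLeafValues(Element element, String parentPath, List<String> out) {
        String path = parentPath + "/" + element.getNodeName();
        boolean hasChildElements = false;
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElements = true;
                collectLeafValues((Element) child, path, out);
            }
        }
        if (!hasChildElements) {
            out.add(path + " = " + element.getTextContent().trim());
        }
    }

    static List<String> parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: DOCTYPE declarations are not needed, so they are rejected.
        // This feature URI is specific to Xerces-based parsers such as the JDK default.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        List<String> values = new ArrayList<>();
        collectLeafValues(doc.getDocumentElement(), "", values);
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person><name>Ann</name><address><city>Oslo</city></address></person>";
        parse(xml).forEach(System.out::println);
    }
}
```

Note that DOM loads the whole tree into memory, so this suits small to medium files; for very large inputs a streaming API is the safer choice.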

Oh Chin Boon
2

Bypassing altogether the question of parsing the XML and storing the values in a database, I'd like to question the need to do so. Most databases can handle XML nowadays, so it can be stored in a table without parsing the content, and the XML held in a column can typically be queried with 'xmlselect()' and similar functions.

Think about this for a second: if, in the near or distant future, the content of the XML that you get from the other application changes, you'll have plenty of changes to make. If it changes often, it'll become a nightmare.

Cheers, Wim

Wivani
  • No, I can't store the XML, because I can get the data in two ways: by filling in a form, or, if it is already present, in XML format. So to maintain uniformity I have to store the node values in the database. – Romani Aug 26 '11 at 09:01
  • Sorry but I don't understand. You say you 'get' the xml from the other application. What stops you from saving it in the database as such? – Wivani Aug 26 '11 at 09:07
1

dom4j and jdom are pretty easy to use (ignoring the requirement "best" for a moment ;) )

Andreas Dolk
  • 3
    Note that JDOM is pretty dead and doesn't even use Java 5 (Generics) to avoid a lot of casting. – Aaron Digulla Aug 25 '11 at 09:43
  • Voted up as I like JDOM; it's nice and easy, and I used it a lot long ago. Miss it :(. Now it's more "enterprise" XML frameworks; granted, I do much more automated and complex processing now. –  Aug 25 '11 at 09:43
  • @Aaron Digulla - you make me feel old, I really used it frequently - looks like it was years/decades ago ;-) – Andreas Dolk Aug 25 '11 at 09:46
  • 2
    JDOM dead? I didn't know this. I've been using it quite a bit, but have stayed away from it for about a year or so because I'm going for more memory-efficient solutions. Speaking of which, I'd always advise against any DOM-like solution unless you really need the entire XML tree in memory. – G_H Aug 25 '11 at 11:45
  • There is a new version of JDOM with generics support. JDOM 2.0.0 brings JDOM into the world of Generics and other Java language items introduced with Java 5. http://www.jdom.org/news/index.html – Christophe Roussy Oct 09 '14 at 07:35
0

Try XStream, this one's really simple.

Infeligo
  • JAXB is simple to use as well: http://blog.bdoughan.com/2010/10/how-does-jaxb-compare-to-xstream.html – bdoughan Aug 25 '11 at 09:48
0

Well, I used StAX to parse a huge number of XML nodes. In my experience it consumed less memory than DOM and SAX, because it works in a pull style: the application asks the parser for the next piece of XML data only when it is ready for it. StAX might be a good choice for large XML data.
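The pull style can be sketched with the StAX iterator API (`XMLEventReader`): the calling code drives the loop and can stop as soon as it has what it needs, without reading the rest of the document. The class `PullUntilFound`, the method `firstName` and the `<name>` element are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: the names are illustrative and not part of any library.
class PullUntilFound {

    // Pulls events one at a time and returns the text of the first <name> element.
    static Optional<String> firstName(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption: the input needs no DTD processing.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLEventReader events = factory.createXMLEventReader(new StringReader(xml));
        try {
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent();
                if (event.isStartElement()
                        && "name".equals(event.asStartElement().getName().getLocalPart())) {
                    // Stop here: the remaining events are never requested.
                    return Optional.of(events.getElementText());
                }
            }
            return Optional.empty();
        } finally {
            events.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<people><person><name>Ann</name></person>"
                + "<person><name>Bob</name></person></people>";
        System.out.println(firstName(xml).orElse("(none)")); // prints Ann
    }
}
```

With a push parser such as SAX the parser calls the handler for every event, so stopping early usually means throwing an exception out of a callback.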

David