3

I am wondering what's the best practice to parse XML like this:

<root>
    <MailNotification enable="true">
        <To>foo@bar.org</To>
        <From>foo@bar.org</From>
        <Server>smtp.bar.org</Server>
        <Port>465</Port>
        <Username>foo@bar.org</Username>
        <Password>fooo!</Password>
    </MailNotification>
</root>

I am using Java 7. The complete XML is longer, but it's not a really big file. I thought about using a StAX pull parser because it seemed easy, but there is one thing that makes me unsure whether it is really a good way:

When I reach a MailNotification element, I could create a new instance of, say, a mail class; I have no problem with that. But what if I reach a To element? How do I know whether it is really inside a MailNotification element and not directly below the root? In other words: what I am missing is a best practice for handling states like "now I am inside a MailNotification element".

Note: I know I could validate the XML first, but imagine that a To element were allowed both inside a MailNotification element and as a child of another, semantically different element. Same problem: I somehow need to keep track of state / context to make sure I interpret the To element correctly.

Thanks for any hint!

stefan.at.kotlin

6 Answers

4

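A StAX stream reader is the best* choice. Just use the Java call stack to keep track of your state, as in this example: each nested element gets its own parse method, so a method always knows which element it is inside. The constants come from XMLStreamConstants.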

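import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import static javax.xml.stream.XMLStreamConstants.*;

XMLStreamReader reader; // must already be positioned on the <root> start tag

void parseRoot() throws XMLStreamException {
    reader.require(START_ELEMENT, null, "root");

    // nextTag() skips whitespace and stops at the next start or end tag
    while (reader.nextTag() == START_ELEMENT) {
        switch (reader.getLocalName()) {
        case "MailNotification":
            MailNotification mail = parseMail();
            // do something with mail
            break;
        // more cases
        }
    }

    reader.require(END_ELEMENT, null, "root");
}

MailNotification parseMail() throws XMLStreamException {
    reader.require(START_ELEMENT, null, "MailNotification");
    MailNotification mail = new MailNotification();

    while (reader.nextTag() == START_ELEMENT) {
        switch (reader.getLocalName()) {
        case "To":
            mail.setTo(parseString());
            break;
        // more cases
        }
    }

    reader.require(END_ELEMENT, null, "MailNotification");
    return mail;
}

String parseString() throws XMLStreamException {
    // getElementText() returns the element's text and leaves the reader on
    // the matching END_ELEMENT, even if the text arrives in several chunks
    return reader.getElementText();
}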
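
For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. The class names MailConfigParser and Mail, the chosen fields, and the skipElement helper are assumptions made for illustration and are not part of the original answer; only standard javax.xml.stream calls are used.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import static javax.xml.stream.XMLStreamConstants.END_ELEMENT;
import static javax.xml.stream.XMLStreamConstants.START_ELEMENT;

public class MailConfigParser {
    // Hypothetical holder class; the question does not show the real one.
    public static class Mail {
        public boolean enabled;
        public String to;
        public int port;
    }

    private final XMLStreamReader reader;

    private MailConfigParser(XMLStreamReader reader) {
        this.reader = reader;
    }

    public static Mail parse(String xml) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            r.nextTag(); // advance from the document start to <root>
            return new MailConfigParser(r).parseRoot();
        } finally {
            r.close();
        }
    }

    private Mail parseRoot() throws XMLStreamException {
        reader.require(START_ELEMENT, null, "root");
        Mail mail = null;
        while (reader.nextTag() == START_ELEMENT) {
            if ("MailNotification".equals(reader.getLocalName())) {
                mail = parseMail();
            } else {
                skipElement(); // a <To> in here never reaches parseMail()
            }
        }
        reader.require(END_ELEMENT, null, "root");
        return mail;
    }

    private Mail parseMail() throws XMLStreamException {
        reader.require(START_ELEMENT, null, "MailNotification");
        Mail mail = new Mail();
        mail.enabled = Boolean.parseBoolean(reader.getAttributeValue(null, "enable"));
        while (reader.nextTag() == START_ELEMENT) {
            switch (reader.getLocalName()) {
            case "To":
                mail.to = reader.getElementText();
                break;
            case "Port":
                mail.port = Integer.parseInt(reader.getElementText().trim());
                break;
            default:
                skipElement();
            }
        }
        reader.require(END_ELEMENT, null, "MailNotification");
        return mail;
    }

    // Consumes the current element including all of its descendants.
    private void skipElement() throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == START_ELEMENT) {
                depth++;
            } else if (event == END_ELEMENT) {
                depth--;
            }
        }
    }
}
```

Calling MailConfigParser.parse(xml) on the document from the question should give a Mail whose to field comes only from the To inside MailNotification, because parseMail() is the only method that handles that element.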

(*) Just to clarify the "best choice": it depends on what you want to do.
JAXB is very good if your XML maps directly to the objects you want to create.
JDOM is useful if you want to navigate your XML in complex ways, e.g. if you implement something like XPath; but for simple parsing it's overkill. This is the approach that consumes the most memory.
SAX was the lightest and most efficient parser before StAX was around.

Cephalopod
  • Thank you very much, I somehow wasn't with it - using a second (3rd, 4th...) while loop inside the main while loop makes sense. A perfectly fitting reply for my case. Thanks to all the others, there are some interesting things in the links. But for now StAX does the job. – stefan.at.kotlin May 15 '12 at 21:45
  • +1 because it's very short, lightweight, works in streaming mode and is exactly what I was looking for at the moment. – blafasel Sep 04 '15 at 14:27
2

Take a look at Apache Commons Digester.

public static final String TEST_XML = "<root>\n" +
          "<MailNotification>\n" +
          " <to>foo@bar.org</to>\n" +
          " <from>foo@bar.org</from>\n" +
          " </MailNotification>\n" +
          "</root>";



Digester digester = new Digester();
digester.setValidating(false);

digester.addObjectCreate("root/MailNotification", MailNotification.class);
digester.addBeanPropertySetter("root/MailNotification/to", "to");
digester.addBeanPropertySetter("root/MailNotification/from", "from");

MailNotification notification = (MailNotification) digester.parse(new StringReader(TEST_XML));
System.out.println(notification.getTo());
System.out.println(notification.getFrom());



public class MailNotification {
  private String to;
  private String from;

  public String getTo() {
    return to;
  }

  public void setTo(String to) {
    this.to = to;
  }

  public String getFrom() {
    return from;
  }

  public void setFrom(String from) {
    this.from = from;
  }
}
hammarback
  • I actually ended up using Digester, really simple to use. Thanks for that great hint! As Arian's answer is still closer to my original question, I will leave his answer as the accepted one, though yours was in the end a little bit more helpful to me ;-) [but it's not 100% the answer to my original question in my opinion] – stefan.at.kotlin May 16 '12 at 19:49
1

How about using JAXB? You annotate a Java class and then just marshal or unmarshal, which is quite easy.

kukudas
0

You can take a look at my previous answer:

XML response how to assign values to variables

And I'm sure there are many same/similar answers here on SO.

As for your specific question:

How do I know if it is really inside a MailNotification element and not directly below the root?

You have the start-element and end-element events for that.
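
One way to use those events, sketched here under assumptions: push each element name onto a stack on START_ELEMENT and pop it on END_ELEMENT, so the top of the stack is always the parent of whatever starts next. The class name ContextTracker and the method findMailTo are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ContextTracker {
    /**
     * Returns the text of the first To element whose direct parent is
     * MailNotification, or null if there is none.
     */
    public static String findMailTo(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        // Names of the currently open elements, innermost first.
        Deque<String> open = new ArrayDeque<String>();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String parent = open.peek(); // null while at the root
                    String name = reader.getLocalName();
                    if ("To".equals(name) && "MailNotification".equals(parent)) {
                        return reader.getElementText();
                    }
                    open.push(name);
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    open.pop();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

A To that sits directly below the root has "root" as its parent on the stack, so it is skipped. Checking more than the direct parent would mean iterating over the deque instead of only peeking at it.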

ant
0

You'd parse it with any decent XML parsing library. Then a "To" would be contained within a "MailNotification" object.

There are tons of them; see this question for a comparison. I've used JDOM myself. It is easy to use and to understand, which I value a lot. However, there are more advanced alternatives these days.
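
To illustrate the tree-based idea without an extra dependency, here is a small sketch that uses the DOM and XPath APIs bundled with the JDK instead of JDOM (a deliberate swap, not what this answer used). The class name DomLookup and the method mailTo are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomLookup {
    /** Returns the text of /root/MailNotification/To, or "" if it is absent. */
    public static String mailTo(String xml) throws Exception {
        // Parse the whole document into an in-memory tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The path spells out the context, so a To elsewhere in the tree is ignored.
        return xpath.evaluate("/root/MailNotification/To", doc);
    }
}
```

The trade-off is the one mentioned elsewhere in this thread: the whole document is held in memory, which is fine for a small configuration file.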

eis
0

Asking what tool to use to parse XML seems to be a bit like asking what programming language to use: you will get answers saying "StAX is best" or "JAXB is best" without any justification of the benefits they offer over other approaches. To be honest, it's impossible to answer the question objectively without knowing more about the requirements and constraints of your project, but for the vast majority of projects the task is sufficiently easy with any of the popular technologies that it's not worth wasting time fretting about the decision.

I would probably use JDOM.

Michael Kay