2

What would be the simplest way to extract data from XML in Java? The XML data is always in the form:

<?xml version="1.0" encoding="UTF-8"?>
<groups>GROUPNAME</groups>

and all I want to do is capture the group name in a string. I've tried using a regular expression, but I'm struggling to write the pattern:

String xmlline = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<groups>XWiki.G_SW_DEV</groups>";
String pattern = "";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(xmlline);
if(m.find()){
    ....                     
}

As described in: http://www.tutorialspoint.com/java/java_regular_expressions.htm

Any advice on the pattern code or is there a better way to extract the XML data?

Gerrie van Wyk
Use libraries like XPath or similar, not regular expressions. You wouldn't drive a screw with a hammer if you already have a screwdriver at home ;-) – Aron_dc Dec 07 '15 at 08:14
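
A minimal sketch of the XPath route suggested in the comment, using the javax.xml.xpath API that ships with the JDK. The class name GroupNameXPath and the helper extractGroupName are hypothetical names chosen for illustration, and the expression /groups assumes the document always has the shape shown in the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class GroupNameXPath {

    // Hypothetical helper: evaluates /groups and returns the element's text.
    static String extractGroupName(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate(String, InputSource) returns the string value of the first matching node
            return xpath.evaluate("/groups", new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath expression", e);
        }
    }

    public static void main(String[] args) {
        String xmlline = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<groups>XWiki.G_SW_DEV</groups>";
        System.out.println(extractGroupName(xmlline)); // prints XWiki.G_SW_DEV
    }
}
```

If the root element is missing, this form of evaluate returns an empty string rather than null, so callers may want to check for that.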

3 Answers

1

Classic advice: don't use regex. It seems easy at the beginning, but it isn't.

There are many well-established XML libraries that do this well.

See this: Java:XML Parser

Then you only need this to retrieve the elements:

doc.getElementsByTagName("groups")

getElementsByTagName returns a NodeList, so to read the text of the first match, use something like:

doc.getElementsByTagName("groups").item(0).getTextContent()
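
For completeness, here is a minimal sketch of where doc could come from, using the JDK's built-in DOM parser on the string from the question. The class name GroupNameDom and the helper extractGroupName are hypothetical, and the sketch assumes the XML always contains at least one groups element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class GroupNameDom {

    // Hypothetical helper: parses the XML string and returns the text of the first <groups> element.
    static String extractGroupName(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Wrap the string in an InputSource; parse(String) would treat it as a URI instead
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("groups").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xmlline = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<groups>XWiki.G_SW_DEV</groups>";
        System.out.println(extractGroupName(xmlline)); // prints XWiki.G_SW_DEV
    }
}
```

Note that item(0) is null when no element matches, so input that may lack the element needs a check before calling getTextContent().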
0

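Use JAXB unmarshalling (on Java 11 and later, JAXB is no longer bundled with the JDK and has to be added as a dependency).

Example: to read a customer.xml file like this: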

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<customer id="100">
    <age>29</age>
    <name>Martin</name>
</customer>

Step 1: Map the XML to a Java class

import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class Customer {

    String name;
    int age;
    int id;

    public String getName() {
        return name;
    }

    @XmlElement
    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    @XmlElement
    public void setAge(int age) {
        this.age = age;
    }

    public int getId() {
        return id;
    }

    @XmlAttribute
    public void setId(int id) {
        this.id = id;
    }

}

Step 2: Write a reader class


import java.io.File;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;

public class ReadXml {

    public static void main(String[] args) throws JAXBException {
        File file = new File("D:\\customer.xml");
        JAXBContext jaxbContext = JAXBContext.newInstance(Customer.class);

        Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
        Customer customer = (Customer) jaxbUnmarshaller.unmarshal(file);

        System.out.println("Name :" + customer.getName());
        System.out.println("Age :" + customer.getAge());
        System.out.println("Id :" + customer.getId());

    }

}


output:

Name :Martin
Age :29
Id :100
0

The Java ecosystem offers three different XML parsing approaches. Two of them are stream-based and work with events. The third uses a tree model of the document (DOM, the Document Object Model). The stream models are suitable for large documents, whereas DOM parsers are usually considered easier and nicer to program against.


Stream/Event:

  • SAX (Simple API for XML) --> "push" parser: it pushes events to a handler that you implement (a ContentHandler, typically by extending DefaultHandler), which reacts to the events it cares about.
  • StAX (Streaming API for XML) --> "pull" parser: you fetch the next event from the parser yourself and react to it.
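
As a rough illustration of the pull style, here is a minimal StAX sketch that reads the groups element from the question's XML. The class name GroupNameStax and the helper extractGroupName are hypothetical; only the javax.xml.stream classes come from the JDK.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class GroupNameStax {

    // Hypothetical helper: pulls events until <groups> starts, then returns its text (null if absent).
    static String extractGroupName(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // The caller asks for the next event; nothing is pushed to us
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "groups".equals(reader.getLocalName())) {
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
    }

    public static void main(String[] args) {
        String xmlline = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<groups>XWiki.G_SW_DEV</groups>";
        System.out.println(extractGroupName(xmlline)); // prints XWiki.G_SW_DEV
    }
}
```

Because the loop returns as soon as the element is found, the rest of a large document never has to be read.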

See also: this question
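
For comparison, a minimal sketch of the push style with SAX: the parser calls back into a handler, which collects the text inside groups. The class name GroupNameSax and the helper extractGroupName are hypothetical; the handler extends the JDK's DefaultHandler.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class GroupNameSax {

    // Hypothetical helper: collects the character data found inside <groups> elements.
    static String extractGroupName(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inGroups;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("groups".equals(qName)) {
                    inGroups = true;
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("groups".equals(qName)) {
                    inGroups = false;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one element, so append rather than assign
                if (inGroups) {
                    text.append(ch, start, length);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String xmlline = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<groups>XWiki.G_SW_DEV</groups>";
        System.out.println(extractGroupName(xmlline)); // prints XWiki.G_SW_DEV
    }
}
```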

DOM:

Take a look at the Oracle Tutorial

Christopher