
I have an XML file in the format below, which I receive as a response from a service. It isn't in the usual XML layout where each value is enclosed in a tag named after its field: the column names appear once in a header, and every row uses generic tags (c1, c2, ...). This is just a sample; the actual file will have hundreds of elements. How do I get to the node I need (let's say 'TDE-2') in the most efficient manner and put its values in a map like {map(TenderID, TDE-2), map(ContactID, null)}?

<?xml version="1.0" encoding="UTF-8"?>
<report>
<report_header>
<c1>TenderID</c1>
<c2>ContactID</c2>
<c3>Address</c3>
<c4>Description</c4>
<c5>Date</c5>
</report_header>
<report_row>
<c1>TDE-1</c1>
<c2></c2>
<c3></c3>
<c4>Tender 1</c4>
<c5>09/30/2016</c5>
</report_row>
<report_row>
<c1>TDE-2</c1>
<c2></c2>
<c3></c3>
<c4>Tender 2</c4>
<c5>10/02/2016</c5>
</report_row>
</report>
Blisskarthik
Aman Bharti
  • What exactly is the „difference“ you are talking about in your comments to the answers? It’s straightforward XML, and just needs to be parsed, no? Can you clarify what the exact problem is? – Pekka Sep 20 '16 at 10:24
  • Use SAX or StAX to parse the XML. While reading the header you can store the column descriptions. Later, on each row, you can use the column-description-to-column relation to identify the interesting columns. But don't expect us to write the whole code for you. You should start by yourself and show us your code if you're stuck. – vanje Sep 20 '16 at 10:24
  • @vanje I have written the code as you suggested, but this way I would be traversing the XML file sequentially right to the end if the data I'm interested in lies in the last element. I'm looking for advice on optimizing my code, not for the raw code. – Aman Bharti Sep 20 '16 at 11:12
  • If you want faster access per key you have several possibilities. (But all of them require parsing the whole XML document at least once.) 1. If your memory is large enough, you can create a hash map and then use the map to access the data. 2. Put the data in a database table and create the appropriate indices. Your data structure fits ordinary relational databases perfectly. 3. Use an XML database like eXist or BaseX. But I wouldn't recommend it for your data, because it is essentially a flat table without a hierarchical structure. – vanje Sep 20 '16 at 12:37
  • A fourth alternative: 4. Create your own index manually. Parse the XML file and, in your index hash map, store the file position of each element along with the key column's value. This works only if the key data isn't too large for main memory, but it needs less space than holding all the data in memory. Then you use the map to find the file position for your key and use a random access file to read only that element from the file. But this is a lot of work. I would use a database table, maybe with an embedded database system like H2 or Apache Derby. – vanje Sep 20 '16 at 12:46
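The header-then-rows approach from the comments above could be sketched with StAX roughly as follows. This is only an illustration under stated assumptions: the class name ReportLookup and the method findRow are made up for this sketch, the key is assumed to live in the c1 column, and empty cells are mapped to null to match the map shown in the question. It stops reading as soon as the matching row is found, but a row near the end of the file still means reading almost everything.

```java
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of any library.
public class ReportLookup {

    /**
     * Returns column name -> value for the first report_row whose c1 equals
     * keyValue, or null if no such row exists. Assumes c1 is the key column.
     */
    public static Map<String, String> findRow(Reader xml, String keyValue)
            throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        Map<String, String> headers = new LinkedHashMap<>(); // tag (c1) -> column name (TenderID)
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = reader.getLocalName();
                if ("report_header".equals(name)) {
                    headers = readCells(reader, name);
                } else if ("report_row".equals(name)) {
                    Map<String, String> cells = readCells(reader, name);
                    if (keyValue.equals(cells.get("c1"))) {
                        // Translate generic tags into the header's column names.
                        Map<String, String> row = new LinkedHashMap<>();
                        for (Map.Entry<String, String> h : headers.entrySet()) {
                            row.put(h.getValue(), cells.get(h.getKey()));
                        }
                        return row; // stop early; the rest of the file is not read
                    }
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    /** Reads the child elements of the current element as tag -> text (empty text becomes null). */
    private static Map<String, String> readCells(XMLStreamReader reader, String parent)
            throws XMLStreamException {
        Map<String, String> cells = new LinkedHashMap<>();
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String tag = reader.getLocalName();
                String text = reader.getElementText(); // leaves the reader on the child's end tag
                cells.put(tag, text.isEmpty() ? null : text);
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && parent.equals(reader.getLocalName())) {
                break;
            }
        }
        return cells;
    }
}
```

Calling ReportLookup.findRow(new FileReader("reports.xml"), "TDE-2") on a well-formed version of the sample should give a map with TenderID mapped to TDE-2 and ContactID mapped to null; the file name here is only a placeholder.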

1 Answer

JAXB allows you to deserialise XML into Java objects. If you create Java POJOs that match the XML document model, you can then use JAXB to unmarshal the XML into those POJOs.

For example:

POJOs:

Report.java:

import java.util.List;

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class Report {

    private List<ReportRow> reportRows;

    public List<ReportRow> getReportRows() {
        return reportRows;
    }

    @XmlElement(name = "report_row")
    public void setReportRows(List<ReportRow> reportRows) {
        this.reportRows = reportRows;
    }
}

ReportRow.java:

import javax.xml.bind.annotation.XmlElement;

// One <report_row>; c1..c5 mirror the generic column tags in the XML.
public class ReportRow {

    private String c1;
    private String c2;
    private String c3;
    private String c4;
    private String c5;

    public String getC1() {
        return c1;
    }

    @XmlElement
    public void setC1(String c1) {
        this.c1 = c1;
    }

    public String getC2() {
        return c2;
    }

    @XmlElement
    public void setC2(String c2) {
        this.c2 = c2;
    }

    public String getC3() {
        return c3;
    }

    @XmlElement
    public void setC3(String c3) {
        this.c3 = c3;
    }

    public String getC4() {
        return c4;
    }

    @XmlElement
    public void setC4(String c4) {
        this.c4 = c4;
    }

    public String getC5() {
        return c5;
    }

    @XmlElement
    public void setC5(String c5) {
        this.c5 = c5;
    }
}

Code to read your XML and bind it into java objects:

import java.io.File;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;

import org.junit.Test;

public class JaxbTest {

    @Test
    public void testFoo() throws JAXBException {

        File xmlFile = new File("src/test/resources/reports.xml");
        JAXBContext context = JAXBContext.newInstance(Report.class, ReportRow.class);
        Unmarshaller jaxbUnmarshaller = context.createUnmarshaller();
        Report report = (Report) jaxbUnmarshaller.unmarshal(xmlFile);
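        // Find the row whose first column (TenderID) is the one asked for in the question.
        // Constant-first equals avoids a NullPointerException if a row has no <c1>.
        ReportRow reportYouWant = report.getReportRows().stream()
                .filter(reportRow -> "TDE-2".equals(reportRow.getC1()))
                .findFirst()
                .orElseThrow(() -> new AssertionError("No row with TenderID TDE-2"));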

    }
}

You also need to add the following dependencies to your build script:

compile group: 'javax.xml', name: 'jaxb-impl', version: '2.1'
compile group: 'javax.xml', name: 'jaxb-api', version: '2.1'
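
If more than one ID has to be looked up, the stream filter in the test above rescans the whole list each time. A minimal sketch of the hash map idea from the comments on the question, assuming the rows already sit in memory; RowIndex and indexBy are names invented for this illustration:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of JAXB.
public class RowIndex {

    /**
     * Builds a one-time index from each row's key to the row itself.
     * Rows with a null key are skipped; on duplicate keys the first row wins.
     */
    public static <K, V> Map<K, V> indexBy(List<V> rows, Function<V, K> keyOf) {
        Map<K, V> index = new HashMap<>();
        for (V row : rows) {
            K key = keyOf.apply(row);
            if (key != null) {
                index.putIfAbsent(key, row);
            }
        }
        return index;
    }
}
```

With the POJOs above this might be used as RowIndex.indexBy(report.getReportRows(), ReportRow::getC1).get("TDE-2"). Building the index still walks every row once, so it only pays off when the same document is queried repeatedly.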
robjwilkins
  • I'm getting the error "no suitable method found for unmarshal" when I create an object of class ReportCollection. Can you guide me on this? – Aman Bharti Sep 20 '16 at 11:41
  • I have updated the examples in my answer. I have tested this code and it works. If you're happy with it, can you please accept it as the answer? – robjwilkins Sep 20 '16 at 14:23