I'm using JAXB to parse an XML file in my GWT-based application. The XML looks like this (a simplified example):

<addressbook>

    <company name="abc">
        <contact>
            <name>...</name>
            <address>...</address>
        </contact>

        <contact>
            <name>...</name>
            <address>...</address>
        </contact>

        <contact>
            <name>...</name>
            <address>...</address>
        </contact>
        ... 
        ... 
    </company>

    <company name="def">
        <contact>
            <name>...</name>
            <address>...</address>
        </contact>
        ...
        ...
    </company>

    ...
    ...

</addressbook>

I've defined the classes as shown below:

@XmlRootElement(name="addressbook")
public class Addressbook implements Serializable {

    private ArrayList<Company> companyList = new ArrayList<Company>();

    public Addressbook() {            
    }

    @XmlElement(name = "company")
    public ArrayList<Company> getCompanyList() {
        return companyList;
    }


}

=============================

@XmlRootElement(name="company")
public class Company implements Serializable {

    private String name;

    private ArrayList<Contact> contactList = new ArrayList<Contact>();

    public Company() {      
    }

    @XmlAttribute
    public String getName() {
        return name;
    }

    @XmlElement(name = "contact")
    public ArrayList<Contact> getContactList() {
        return contactList;
    }

    ...
    ...
}

=============================

@XmlRootElement(name="contact")
public class Contact implements Serializable
{
    private String name;
    private String address;

    public Contact() {
    }

    @XmlElement
    public String getName ()
    {
        return name;
    }

    @XmlElement
    public String getAddress ()
    {
        return address;
    }

    ...
    ...
}

This is the code:

try {
    JAXBContext jc = JAXBContext.newInstance(Addressbook.class);
    Unmarshaller um = jc.createUnmarshaller();
    addressbook = (Addressbook) um.unmarshal(new FileReader("ds/addressbook.xml"));        
} catch (JAXBException e) {
    e.printStackTrace();
}

I need to get the list of contacts based on the company name. For example, get all contacts for company "abc". I can parse the entire XML file and then manually filter the records. But if the input file is big, it might be more efficient to parse only what I need. So is it possible to specify a criterion upfront and parse only specific records?

Thanks.

DFB

2 Answers

You could use the @XmlPath extension in EclipseLink JAXB (MOXy) to handle this case (I'm the MOXy tech lead):

import org.eclipse.persistence.oxm.annotations.XmlPath;

@XmlRootElement(name="addressbook")
public class Addressbook implements Serializable {

    private ArrayList<Company> companyList = new ArrayList<Company>();

    public Addressbook() {            
    }

    @XmlPath("company[@name='abc']")
    public ArrayList<Company> getCompanyList() {
        return companyList;
    }


}

UPDATE - Using StreamFilter

The example below demonstrates how a StreamFilter could be leveraged for this use case:

import java.io.FileInputStream;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

public class Demo {

    public static void main(String[] args) throws Exception {
        JAXBContext jc = JAXBContext.newInstance(Addressbook.class);

        XMLInputFactory xif = XMLInputFactory.newFactory();
        FileInputStream xmlStream = new FileInputStream("input.xml");
        XMLStreamReader xsr = xif.createXMLStreamReader(xmlStream);
        xsr = xif.createFilteredReader(xsr, new CompanyFilter());

        Unmarshaller unmarshaller = jc.createUnmarshaller();
        Addressbook addressbook = (Addressbook) unmarshaller.unmarshal(xsr);

        Marshaller marshaller = jc.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
        marshaller.marshal(addressbook, System.out);
    }
}

The implementation of the StreamFilter is as follows:

import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLStreamReader;

public class CompanyFilter implements StreamFilter {

    private boolean accept = true;

    public boolean accept(XMLStreamReader reader) {
        if(reader.isStartElement() && "company".equals(reader.getLocalName())) {
            accept = "abc".equals(reader.getAttributeValue(null, "name"));
        } else if(reader.isEndElement()) {
            boolean returnValue = accept;
            accept = true;
            return returnValue;
        }
        return accept;
    }

}
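
One thing to watch for with the filter above: the flag appears to be reset at the first end element after a rejected company start tag, so when a company contains nested contact elements, later events of that company may still be passed through. A minimal sketch of a variant that counts element depth and rejects the whole subtree is shown below. The class names SubtreeCompanyFilter and SubtreeFilterDemo are hypothetical, and the demo calls accept directly, once per event, on a plain XMLStreamReader so that it runs without JAXB on the classpath. In real use the filter would be passed to createFilteredReader as in Demo, on the assumption that the reader calls accept exactly once per event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical variant of CompanyFilter that rejects an entire company subtree. */
class SubtreeCompanyFilter implements StreamFilter {

    private final String wantedName;
    // Greater than zero while the reader is inside a rejected <company> element.
    private int skipDepth = 0;

    SubtreeCompanyFilter(String wantedName) {
        this.wantedName = wantedName;
    }

    @Override
    public boolean accept(XMLStreamReader reader) {
        if (skipDepth > 0) {
            // Track nesting so we know when the rejected <company> closes.
            if (reader.isStartElement()) {
                skipDepth++;
            } else if (reader.isEndElement()) {
                skipDepth--;
            }
            return false; // Also rejects the closing </company> tag itself.
        }
        if (reader.isStartElement()
                && "company".equals(reader.getLocalName())
                && !wantedName.equals(reader.getAttributeValue(null, "name"))) {
            skipDepth = 1;
            return false;
        }
        return true;
    }
}

class SubtreeFilterDemo {

    /** Returns the local names of all start elements the filter accepts. */
    static List<String> acceptedStartElements(String xml, String company) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StreamFilter filter = new SubtreeCompanyFilter(company);
        List<String> names = new ArrayList<String>();
        while (true) {
            // accept is called exactly once for each event, including START_DOCUMENT.
            if (filter.accept(reader) && reader.isStartElement()) {
                names.add(reader.getLocalName());
            }
            if (!reader.hasNext()) {
                break;
            }
            reader.next();
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<addressbook>"
                + "<company name=\"abc\"><contact><name>A</name><address>X</address></contact></company>"
                + "<company name=\"def\"><contact><name>B</name><address>Y</address></contact></company>"
                + "</addressbook>";
        System.out.println(acceptedStartElements(xml, "abc"));
    }
}
```

Because the depth counter only returns to zero once the rejected company's own end tag has been seen, the events handed to the unmarshaller stay balanced.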
bdoughan
  • Coincidentally, I was reading your blog for ideas when you were posting your response. I think this is what I'm looking for, but I would really prefer to avoid extra libraries for this if it's at all possible. Otherwise, I'll consider using MOXy. On another note, instead of unmarshalling the company objects into a List, can I unmarshal them as a Map? – DFB May 10 '11 at 14:29
  • @DFB - You could unmarshal the company objects to a Map (http://bdoughan.blogspot.com/2010/09/processing-atom-feeds-with-jaxb.html). It may be possible to use a StreamFilter to get the behaviour you want (http://download.oracle.com/javase/6/docs/api/javax/xml/stream/XMLInputFactory.html#createFilteredReader(javax.xml.stream.XMLStreamReader,%20javax.xml.stream.StreamFilter)) using the standard JAXB APIs. – bdoughan May 10 '11 at 15:00
  • thanks for your response. I'll look into StreamFilter. Regarding unmarshalling into a Map, I have to admit that I wasn't able to figure it out from your blogpost. I guess I have a lot more to learn even before I start posting questions :) – DFB May 10 '11 at 15:22
  • @DFB I provided the wrong link to the Map example. The correct link is: http://bdoughan.blogspot.com/2010/07/xmladapter-jaxbs-secret-weapon.html – bdoughan May 10 '11 at 15:30
  • @DFB - I have updated my answer with a StreamFilter example that should work with any JAXB implementation. – bdoughan May 12 '11 at 15:23
  • @BlaiseDoughan Just posted a similar question http://stackoverflow.com/questions/22695541/filtering-out-elements-based-on-sub-elements-with-xmlstreamreader-and-streamfilt - what do you think? – NBW Mar 27 '14 at 17:58
  • @BlaiseDoughan: very cool response. I'm going to try this out when I get some bandwidth and will be interested to see what kind of performance improvements we see – IcedDante Oct 04 '16 at 18:47

You could either

  • Apply an XSLT transformation to the XML file, or
  • Parse the file into a DOM, and use XPath to select the nodes you want

before passing the resulting object(s) to the unmarshal method.
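
The DOM and XPath option could be sketched roughly as follows. This is only an illustration under a few assumptions: the class name XPathCompanyLookup and the method contactNames are hypothetical, the XML is passed in as a string rather than read from a file, and the example reads the contact names straight from the DOM so that it runs without JAXB on the classpath. With JAXB available, each selected company node could instead be handed to Unmarshaller.unmarshal(Node, Class).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper: selects the contacts of one company with XPath. */
class XPathCompanyLookup {

    static List<String> contactNames(String xml, final String company) throws Exception {
        // Note: this still parses the whole document into memory before filtering.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the company name as an XPath variable instead of concatenating it
        // into the expression, which avoids quoting problems in the name.
        xpath.setXPathVariableResolver(variableName -> company);

        NodeList nodes = (NodeList) xpath.evaluate(
                "/addressbook/company[@name=$company]/contact/name",
                doc, XPathConstants.NODESET);

        List<String> names = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<addressbook>"
                + "<company name=\"abc\">"
                + "<contact><name>Ann</name><address>1 Main St</address></contact>"
                + "<contact><name>Bob</name><address>2 Side St</address></contact>"
                + "</company>"
                + "<company name=\"def\">"
                + "<contact><name>Cid</name><address>3 Back St</address></contact>"
                + "</company>"
                + "</addressbook>";
        System.out.println(contactNames(xml, "abc"));
    }
}
```

Since the DOM holds the entire file, this mainly saves the cost of building Java objects for the companies you do not need, not the cost of parsing them.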

It might be simpler, though, to create an in-memory Map keyed by company name:

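import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SearchableAddressBook {

    // Index of companies by name, built once after unmarshalling
    public final Map<String, Company> companyMap = new HashMap<String, Company>();

    public SearchableAddressBook(List<Company> companyList) {
        for (Company company : companyList) {
            companyMap.put(company.getName(), company);
        }
    }

}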

Or create an in-memory DB if you really want to over-engineer it.

artbristol
  • thanks for your response. Could you show (or point me to) sample code using the simpler of these approaches? Sorry, I'm still new to this stuff. – DFB May 10 '11 at 12:56
  • Updated my answer, though the Map approach still parses the entire XML file, so it might not be what you're looking for. Remember to measure the performance on a variety of datasets! – artbristol May 10 '11 at 13:20
  • Are you suggesting that by modifying my Addressbook class, the unmarshalling would get the data into a Map (that would be great, actually)? Or should I create a new class SearchableAddressBook to convert the list into a map "after" unmarshalling? Thanks. – DFB May 10 '11 at 13:59
  • No, I'm afraid you'd have to unmarshal first and then add into the new SearchableAddressBook. I think the other answer might be more suitable :-) – artbristol May 10 '11 at 15:03