
I am trying to read a large XML file (about 500 MB). I first ran xjc on the XSD for my XML, and all classes were generated as expected. When I try to read the file, I get this error: javax.xml.bind.UnmarshalException: unexpected element.

Here is my code:

(...)

JAXBContext context = JAXBContext.newInstance("br.com.mypackage");
Unmarshaller unmarshaller = context.createUnmarshaller();
File f = new File("src/files/MyHuge.CNX");
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
InputStream in = new FileInputStream(f);
XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
Person p = null;
int count = 0;
while (eventReader.hasNext()) {
   XMLEvent event = eventReader.nextEvent();
   if (event.isStartElement()) {
      StartElement startElement = event.asStartElement();
      if ("person".equals(startElement.getName().getLocalPart())) {
         p = (Person) unmarshaller.unmarshal(eventReader);
      }
   }
}

The problem is in the unmarshal operation.

Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"identification"). Expected elements are <{}messageAll>

I based my code on this question: JAXB - unmarshal OutOfMemory: Java Heap Space

Does anyone have a clue how to do this? All I want is to read a huge XML file without unmarshalling the outermost XML object (which causes the Java heap space problem) and without reading it tag by tag and extracting each value, which is slow, monkey-style code (not the monkeys of Rise of the Planet of the Apes). :P

Many thanks.

T Soares
  • Can you share the XML, the classes, and the JAXB mappings used here? Is there a class with the annotation `@XmlRootElement(namespace="", name = "identification")` in the package `br.com.mypackage`? – Arun P Johny Dec 13 '11 at 13:42
  • Arun, the Person class has these annotations: `@XmlAccessorType(XmlAccessType.FIELD)` `@XmlType(name = "", propOrder = {"identification","address","whatever"})` I assumed that xjc would take care of all the annotation details. Could it be a problem in the XSD file? – T Soares Dec 13 '11 at 18:43
  • Can you try to print the contents of the event reader before passing it to the unmarshaller? It looks like you are passing an `identification` element instead of the `person` element at the root. Also, the `Person` class should have `@XmlType(name = "person", propOrder = {"identification","address","whatever"})`. Can you also give the type of the identification object? – Arun P Johny Dec 14 '11 at 03:16
  • I ran a test: I tried to unmarshal the Identification object. It doesn't work and throws the same exception: Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"identification"). Expected elements are <(none)> I then edited the XML file, removing persons until only 5 were left. With this small file, I successfully unmarshalled using the outermost object generated by xjc, and all 5 persons were created as expected. Given this test, I don't think it is an annotation problem. (How can I send you the XSD file?) – T Soares Dec 14 '11 at 19:01

2 Answers


I'm guessing the problem is that you've already consumed the <person> start element from the event stream, so JAXB can't tell what it is looking at; it needs that element to be there so it can build the object. Thus, I suspect you need to peek at the stream to decide whether to consume (and discard) the event or to unmarshal:

while (eventReader.hasNext()) {
   XMLEvent event = eventReader.peek();
   if (event.isStartElement()) {
      StartElement startElement = event.asStartElement();
      if ("person".equals(startElement.getName().getLocalPart())) {
         p = (Person) unmarshaller.unmarshal(eventReader);
         continue; // Assume you've done something with p; go round loop again
      }
   }
   eventReader.nextEvent(); // Discard...
}
Donal Fellows
  • I tried it. In fact, what I posted was a condensed version of my code; I do fetch the next element on each iteration of the while loop. Anyway, I tested with the peek method (as you did), but it doesn't work. I would like to avoid the "switch" style of code that reads each field and its value. Could you send me a link to a good tutorial? Maybe I don't understand the purpose of the unmarshal function and whether it meets my need. – T Soares Dec 13 '11 at 18:57
  • Hello all, problem solved. Here is the link to the solution: http://pastebin.com/JQ6uN9Te `if ("person".equals(start.getName().getLocalPart())) { JAXBElement<Person> jax_benef = unmarshaller.unmarshal(eventReader, Person.class); p = jax_benef.getValue(); }` I don't know why the old method (unmarshalling directly to a Person object instead of a JAXBElement) wasn't working. Do you have any idea why? – T Soares Dec 14 '11 at 19:58
  • @TSoares: I don't know, but I guess it must have something to do with the amount of context available to JAXB to allow it to make a decision about what to do. (On the plus side, you no longer need an explicit cast since you know what you're getting.) – Donal Fellows Dec 15 '11 at 10:45
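The difference between consuming and peeking discussed in this answer can be shown with StAX alone. The following is a minimal sketch, not the poster's code: the class `PeekDemo`, the method `nextStartElementSeen`, and the sample XML are hypothetical names made up for illustration. It finds the first `<person>` start tag and reports which start element a downstream consumer (such as an unmarshaller) would receive next.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, for illustration only.
class PeekDemo {
    // Locates the first <person> start tag, either by consuming events with
    // nextEvent() or by inspecting them with peek(), and returns the local name
    // of the first start element still left in the reader at that point.
    static String nextStartElementSeen(String xml, boolean usePeek) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = usePeek ? reader.peek() : reader.nextEvent();
                if (event.isStartElement()
                        && "person".equals(event.asStartElement().getName().getLocalPart())) {
                    // This is where a consumer would take over the reader.
                    while (reader.hasNext()) {
                        XMLEvent next = reader.nextEvent();
                        if (next.isStartElement()) {
                            return next.asStartElement().getName().getLocalPart();
                        }
                    }
                }
                if (usePeek) {
                    reader.nextEvent(); // Discard the event that was only peeked at
                }
            }
            return null; // No <person> element found
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }
}
```

With `<messageAll><person><identification>1</identification></person></messageAll>` as input, the `nextEvent()` variant leaves `identification` as the next start element, which matches the element named in the exception from the question, while the `peek()` variant leaves `person` in place.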

I solved the problem with the code below:

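import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBElement;
import javax.xml.bind.Unmarshaller;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public List<Person> testeUnmarshal() {
  List<Person> people = new ArrayList<Person>();
  try {
    JAXBContext context = JAXBContext.newInstance(Person.class);
    Unmarshaller unmarshaller = context.createUnmarshaller();
    File f = new File(FILE_PATH);
    XMLInputFactory inputFactory = XMLInputFactory.newInstance();
    XMLEventReader eventReader = inputFactory.createXMLEventReader(new FileInputStream(f));
    while (eventReader.hasNext()) {
      XMLEvent event = eventReader.peek();
      if (event.isStartElement()) {
        StartElement start = event.asStartElement();
        // Compare strings with equals(), not ==
        if ("person".equals(start.getName().getLocalPart())) {
          // Passing Person.class unmarshals by declared type, so Person
          // does not need an @XmlRootElement annotation.
          JAXBElement<Person> jaxb = unmarshaller.unmarshal(eventReader, Person.class);
          people.add(jaxb.getValue());
          // The reader is now past </person>; peek again without skipping an event.
          continue;
        }
      }
      eventReader.nextEvent(); // Discard events that do not start a <person>
    }
  } catch (Exception e) {
    throw new RuntimeException("Failed to read " + FILE_PATH, e);
  }
  return people;
}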

I can control the number of objects held in memory by counting inside the loop (for example, committing to the database every 1000 persons).
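The counting idea could be sketched roughly as follows. This is only an illustration under assumptions: `BatchBuffer` and its methods are hypothetical names, and the database commit is represented by a generic callback rather than any real persistence API. Each unmarshalled object would be passed to `add`, and `flush` would be called once after the loop to hand over the last partial batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: collects items and hands them to a sink in fixed-size batches,
// so that at most batchSize items are held in memory at a time.
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // Stands in for "commit to the database"
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Adds one item and flushes automatically once the batch is full.
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Passes a copy of the pending items to the sink, then releases them.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

Inside the read loop, `people.add(...)` would then become a call to `add` on a buffer such as `new BatchBuffer<Person>(1000, batch -> ...)`, where the body of the lambda is whatever commit logic the application uses.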

T Soares