I want to use XML as a small database to store articles. While parsing this XML with SAXParser, I get an ArrayIndexOutOfBoundsException.

I made a small example to reproduce this. XML file:

<?xml version="1.0" encoding="UTF-8"?>
<a>
    <b>
        <c>
            <![CDATA[
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111111111111111
111111111111111111111111111111111111111111
]]>
        </c>
    </b>
</a>

The exception I get:

java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.gjt.xpp.impl.tokenizer.Tokenizer.next(Tokenizer.java:1274)
    at org.gjt.xpp.impl.pullparser.PullParser.next(PullParser.java:392)
    at org.gjt.xpp.sax2.Driver.parseSubTree(Driver.java:415)
    at org.gjt.xpp.sax2.Driver.parse(Driver.java:310)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:392)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:328)
    at com.sumy.xmlwikimanager.dao.XMLUtil.parserXML(XMLUtil.java:28)
    at com.sumy.xmlwikimanager.dao.XMLUtil.main(XMLUtil.java:68)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

The parser handler:

package com.sumy.xmlwikimanager.dao;

import com.sumy.xmlwikimanager.bean.WikiItem;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Created by Sumy on 2015/11/27 0027.
 */
public class DatabaseParserHandler extends DefaultHandler {

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        System.out.println("startElement: uri[" + uri + "] localName[" + localName + "] qName[" + qName + "]");
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        super.characters(ch, start, length);
        System.out.println(length);
    }
}

XMLUtil.java just has a test method:

package com.sumy.xmlwikimanager.dao;

import org.xml.sax.SAXException;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.File;
import java.io.IOException;

/**
 * Created by Sumy on 2015/11/27 0027.
 */
public class XMLUtil {
    public static void parserXML(File file) {
        try {

            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            parser.parse(file, new DatabaseParserHandler());
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }

    public static void main(String[] args) {

        parserXML(new File("Category.xml"));

    }
}

My test: if the CDATA content is shorter than 1034 characters, the program works fine. When I add more characters, the ArrayIndexOutOfBoundsException is thrown.

Is there anything wrong with my program?

sumy

1 Answer

There is nothing wrong with your implementation. Since the document is well-formed, parsing it should not produce an error.

It seems to be a bug in the parser implementation. There is already a bug report for the parser showing the same stack trace.
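
To confirm which implementation JAXP actually selected, a minimal sketch like the following can print the concrete class names behind the factory and the parser. The class name `WhichSaxParser` and the helper `describe` are hypothetical names used only for illustration. If the printed names start with `org.gjt.xpp`, as in the stack trace from the question, the XPP parser is being picked up from the classpath rather than the parser bundled with the JDK.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic class, not part of the original project.
class WhichSaxParser {

    /** Returns the class names of the factory, the parser, and the underlying XMLReader. */
    static String[] describe() {
        try {
            // Same lookup the question's XMLUtil performs.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            return new String[] {
                factory.getClass().getName(),
                parser.getClass().getName(),
                parser.getXMLReader().getClass().getName()
            };
        } catch (Exception e) {
            // Wrap checked JAXP/SAX exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = describe();
        System.out.println("factory:   " + names[0]);
        System.out.println("parser:    " + names[1]);
        System.out.println("XMLReader: " + names[2]);
    }
}
```

The exact names printed depend on the JDK and on which jars are on the classpath, so no particular output is assumed here.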

wero
  • It seems the parser implementation in use is "XPP XML Pull Parser". Others have resorted to [using a different implementation](https://jira.exoplatform.org/browse/ECMS-1532). – eis Nov 27 '15 at 10:27
  • Is there any other _jar library_ for XML parsing? I have used **Dom4j**, but it also has this problem. – sumy Nov 27 '15 at 12:35
  • Thanks @eis. I use **org.w3c.dom** to parse the XML and use **DOMReader** to transform the **org.w3c.dom.Document** into a **Dom4j Document**, avoiding **SAXReader**. It works almost fine. – sumy Nov 30 '15 at 05:54
  • Thanks @wero. I can't find a good solution for this problem, so I just avoid using "XPP XML Pull Parser" in the code. – sumy Nov 30 '15 at 05:56
  • @sumy You are already using the `javax.xml.parsers.SAXParser` API and not a specific implementation, which is good. The JDK normally provides a different implementation than the XPP pull parser, so you have to find out how to get rid of XPP. Maybe it is as easy as removing `pull-parser-xx.jar` from your project. Please study http://stackoverflow.com/a/1804281/3215527 to learn how the implementation is selected. If nothing else helps, post a new question. – wero Nov 30 '15 at 10:57
  • @wero Thank you very much. I tried the `javax.xml.parsers` implementation bundled with the JDK, and it seems to work fine. But not knowing why the `SAXParser` problem occurs makes me uncomfortable. (╯‵□′)╯︵┻━┻ – sumy Nov 30 '15 at 12:39
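
The approach discussed in the comments, parsing with the JDK's own implementation instead of XPP, can be sketched as follows. This is a rough illustration, not the poster's code: it assumes Java 9 or later, where `SAXParserFactory.newDefaultInstance()` returns the platform's built-in factory regardless of what is on the classpath. The class `JdkDefaultSaxDemo` and the method `countCharacters` are hypothetical names. The sketch builds a document shaped like the one in the question, with a CDATA section well beyond the 1034-character threshold, and sums the lengths reported to `characters()`.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original project.
class JdkDefaultSaxDemo {

    /** Parses a document with a CDATA section of the given length and returns the total characters reported. */
    static int countCharacters(int cdataLength) {
        // Build <a><b><c><![CDATA[111...]]></c></b></a> with no extra whitespace.
        StringBuilder xml = new StringBuilder("<a><b><c><![CDATA[");
        for (int i = 0; i < cdataLength; i++) {
            xml.append('1');
        }
        xml.append("]]></c></b></a>");

        final int[] total = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver text in several chunks, so accumulate.
                total[0] += length;
            }
        };

        try {
            // Assumption: Java 9+; bypasses the classpath lookup that selected XPP.
            SAXParserFactory factory = SAXParserFactory.newDefaultInstance();
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml.toString())), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(countCharacters(5000));
    }
}
```

On older JDKs, where `newDefaultInstance()` is not available, removing the XPP jar from the classpath as suggested above is the more direct route.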