I am currently working on a project that opens a TCP socket and listens for incoming XML from the server. The XML documents are sometimes fairly large, around 1-3 MB. XML keeps arriving on the socket and I need to parse it as it comes. I tried several parsers, including DomParser, XMLPullParser and SaxParser. SAX seemed to be the fastest, so I proceeded with that. But now I sometimes get an OutOfMemory exception.
I read in this post that we should feed data to the parser in chunks:
How to parse huge xml data from webservice in Android application?
Can someone tell me how that is done? My current code looks like this:
    InputSource xmlInputSource = new InputSource(new StringReader(response));
    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser sp = null;
    XMLReader xr = null;
    try {
        sp = spf.newSAXParser();
        xr = sp.getXMLReader();
        ParseHandler xmlHandler = new ParseHandler(context.getSiteListArray().indexOf(website), context);
        xr.setContentHandler(xmlHandler);
        xr.parse(xmlInputSource);
        postSuccessfullParsingNotification();
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
Here, response is the string I receive from the socket.
Should I look into other parsers like VTD-XML? Or is there a way to make SAX work efficiently?
Btw: Whenever a new string arrives in the socket to be parsed I open a new thread for parsing the string.
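With one thread per incoming string, several multi-megabyte strings can be alive at the same time. As a rough sketch of an alternative (not something from the linked post), a single worker thread with a small bounded queue keeps the number of pending blocks limited. The names `BoundedParseQueue` and `maxPending` are hypothetical, and the `Runnable` stands in for whatever parsing job the app submits.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs parse jobs on one worker thread with a bounded backlog. */
final class BoundedParseQueue {
    private final ThreadPoolExecutor executor;

    /** @param maxPending assumed limit on jobs waiting in the queue */
    BoundedParseQueue(int maxPending) {
        executor = new ThreadPoolExecutor(
                1, 1,                       // exactly one worker thread
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(maxPending),
                // When the queue is full, the submitting thread runs the job itself,
                // which slows down the socket reader instead of piling up strings.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Queues one parse job (for example, a SAX parse of one received block). */
    void submit(Runnable parseJob) {
        executor.execute(parseJob);
    }

    /** Stops accepting jobs and waits for queued ones; returns false on timeout. */
    boolean shutdownAndWait(long timeoutSeconds) throws InterruptedException {
        executor.shutdown();
        return executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
    }
}
```

The socket-reading thread would call `submit(...)` once per received block instead of starting a new `Thread`. Whether one worker is enough depends on how fast blocks arrive, which the post does not say.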
This is my handler code:
    public class ParseHandler extends DefaultHandler {
        private Website mWebsite;
        private Visitor mVisitor;
        private VisitorInfo mVisitorInfo;
        private boolean isVisit;
        private boolean isVisitor;
        private AppContext appContext;

        public ParseHandler(int index, AppContext context) {
            appContext = context;
            mWebsite = appContext.getSiteListArray().get(index);
        }

        @Override
        public void startDocument() throws SAXException {
            super.startDocument();
        }

        @Override
        public void startElement(String namespaceURI, String localName, String qName, Attributes atts)
                throws SAXException {
            if (localName.equals("visit")) {
                isVisit = true;
            } else if (localName.equals("visitor") && isVisit) {
                isVisitor = true;
                mVisitor = new Visitor();
                mVisitor.mDisplayName = "Visitor - #" + atts.getValue("id");
                mVisitor.mVisitorId = atts.getValue("id");
                mVisitor.mStatus = atts.getValue("idle");
            } else if (localName.equals("info") && isVisitor) {
                mVisitorInfo = mVisitor.new VisitorInfo();
                mVisitorInfo.mBrowser = atts.getValue("browser");
                mVisitorInfo.mBrowserName = atts.getValue("browser").replace("+", " ");
                mVisitorInfo.mCity = atts.getValue("city").replace("+", " ");
                mVisitorInfo.mCountry = atts.getValue("country");
                mVisitorInfo.mCountryName = atts.getValue("country");
                mVisitorInfo.mDomain = atts.getValue("domain");
                mVisitorInfo.mIp = atts.getValue("ip");
                mVisitorInfo.mLanguage = atts.getValue("language");
                mVisitorInfo.mLatitude = atts.getValue("lat");
                mVisitorInfo.mLongitude = atts.getValue("long");
                mVisitorInfo.mOrg = atts.getValue("org").replace("+", " ");
                mVisitorInfo.mOs = atts.getValue("os");
                mVisitorInfo.mOsName = atts.getValue("os").replace("+", " ");
                mVisitorInfo.mRegion = atts.getValue("region").replace("+", " ");
                mVisitorInfo.mScreen = atts.getValue("screen");
            }
        }

        @Override
        public void characters(char ch[], int start, int length) {
        }

        @Override
        public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
            if (localName.equals("visit")) {
                isVisit = false;
            } else if (localName.equals("visitor")) {
                isVisitor = false;
                if (mVisitor == null) {
                    Log.e("mVisitor", "mVisitor");
                } else if (mVisitor.mVisitorId == null) {
                    Log.e("mVisitor.mVisitorId", "mVisitor.mVisitorId");
                }
                mWebsite.mVisitors.put(mVisitor.mVisitorId, mVisitor);
            } else if (localName.equals("info") && isVisitor) {
                mVisitor.mVisitorInfo = mVisitorInfo;
            }
        }

        @Override
        public void endDocument() throws SAXException {
        }
    }
**EDIT: Afterthoughts**
After further investigation, I found that my parsing wasn't causing the exception. Every time I receive data from the socket, I store it in a String and keep appending to it until I get "\n" in the stream. The "\n" denotes the end of a block of XML. Building that string is what causes the out-of-memory exception. I tried a StringBuilder, but it caused the same problem. I don't know why this is happening.
Next, I tried passing the InputStream directly to the parser, but the "\n" at the end causes a parse exception. Is there anything we can set so that the parser ignores the "\n"?
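For illustration, here is a minimal sketch of one way to parse straight from the stream without building a String. Whitespace after the root element is legal XML, so a plausible cause of the exception (an assumption, not confirmed above) is that the parser reads past the "\n" into the next document. The wrapper below reports end-of-stream at each "\n" so the parser sees one block at a time. `NewlineFramedInputStream`, `nextBlock()` and `FramedSaxDemo` are hypothetical names. The sketch assumes that blocks contain no raw newline and use an ASCII-compatible encoding such as UTF-8.

```java
import java.io.BufferedInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical wrapper: exposes each newline-terminated block as its own stream. */
class NewlineFramedInputStream extends FilterInputStream {
    private static final int NONE = -2;
    private boolean blockEnded = true; // call nextBlock() before each parse
    private int pending = NONE;        // byte peeked by nextBlock()

    NewlineFramedInputStream(InputStream in) { super(in); }

    /** Skips any unread rest of the current block; false if the source is exhausted. */
    boolean nextBlock() throws IOException {
        while (read() != -1) { /* drain up to the delimiter */ }
        int b = in.read();
        if (b == -1) return false;
        pending = b;
        blockEnded = false;
        return true;
    }

    @Override
    public int read() throws IOException {
        if (blockEnded) return -1;
        int b = (pending != NONE) ? pending : in.read();
        pending = NONE;
        if (b == -1 || b == '\n') { // delimiter or real EOF ends the block
            blockEnded = true;
            return -1;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        int count = 0;
        while (count < len) {
            int b = read();
            if (b == -1) break;
            buf[off + count++] = (byte) b;
        }
        return count == 0 ? -1 : count;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = 0;
        while (skipped < n && read() != -1) skipped++;
        return skipped;
    }

    @Override public int available() { return 0; }
    @Override public boolean markSupported() { return false; }
    @Override public void close() { /* keep the socket open if the parser closes its input */ }
}

final class FramedSaxDemo {
    /** Parses every block in the source and returns the number of visitor elements per block. */
    static List<Integer> countVisitorsPerBlock(InputStream source) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Buffered because the wrapper pulls bytes one at a time.
        NewlineFramedInputStream framed =
                new NewlineFramedInputStream(new BufferedInputStream(source));
        List<Integer> counts = new ArrayList<>();
        while (framed.nextBlock()) {
            final int[] count = {0};
            factory.newSAXParser().parse(framed, new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    // qName is used because this factory is not namespace-aware by default.
                    if (qName.equals("visitor")) count[0]++;
                }
            });
            counts.add(count[0]);
        }
        return counts;
    }
}
```

In the app, `source` would be the socket's InputStream and the anonymous handler would be replaced by `ParseHandler`. Because the parser consumes bytes as they arrive, no block is ever held in memory as a whole String.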