My code needs to download a large (500 MB) XML file, read it through a GZIPInputStream, and run some operations on every object as it is parsed. Those operations are slow, and there are many objects to process. I'm using Commons HttpClient 3.1 and StAX.
public void download(String url) throws HttpException, IOException,
        XMLStreamException, FactoryConfigurationError {
    GetMethod getMethod = new GetMethod(url);
    try {
        httpClient.executeMethod(getMethod);
        Header contentEncoding = getMethod.getResponseHeader("Content-Encoding");
        if (contentEncoding != null) {
            String acceptEncodingValue = contentEncoding.getValue();
            if (acceptEncodingValue.indexOf("gzip") != -1) {
                processStream(new GZIPInputStream(getMethod.getResponseBodyAsStream()));
                return;
            }
        }
        processStream(getMethod.getResponseBodyAsStream());
        return;
    } finally {
        getMethod.releaseConnection();
    }
}

protected void processStream(InputStream inputStream) throws XMLStreamException, FactoryConfigurationError {
    XMLStreamReader xmlStreamReader = XMLInputFactory.newFactory().createXMLStreamReader(inputStream);
    //parses xml with Stax
    //executes some long operations for each object
}
When I run the code, it works until, after two or three hours, I get a SocketException: Connection reset.
It looks like the server has closed the connection. Is that correct? Is there a way to avoid this error without any change on the server side? If not, how can I recover from it without re-running my application from the beginning?
com.ctc.wstx.exc.WstxIOException: Connection reset
at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
.................
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:168)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:182)
at java.io.FilterInputStream.read(FilterInputStream.java:116)
at org.apache.commons.httpclient.AutoCloseInputStream.read(AutoCloseInputStream.java:108)
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:221)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:92)
at java.io.FilterInputStream.read(FilterInputStream.java:90)
at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365)
at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110)
at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:992)
at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:1034)
at com.ctc.wstx.sr.StreamScanner.getNextChar(StreamScanner.java:794)
at com.ctc.wstx.sr.BasicStreamReader.parseNormalizedAttrValue(BasicStreamReader.java:1900)
at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:3037)
at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2936)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
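
One direction I'm considering, since the trace shows the reset happening while the parser is still pulling bytes from the socket: copy the response body to a local file first, release the connection, and only then run the slow per-object processing against the file. Below is a minimal sketch of that idea, not tested against the real server. The class name SpoolThenParse and both method names are made up for illustration, it assumes Java 7+ for java.nio.file, and the element count is only a stand-in for the real per-object work.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: names are illustrative, not part of any library.
class SpoolThenParse {

    // Copies the raw (possibly still gzip-compressed) body to a temp file.
    // This runs at network speed, so the connection is held only briefly.
    static Path spoolToTempFile(InputStream body) throws IOException {
        Path tmp = Files.createTempFile("download-", ".xml.gz");
        // createTempFile already created the file, so overwrite it.
        Files.copy(body, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    // Parses the local copy with StAX. Counting start elements stands in
    // for the long-running per-object operations.
    static int countStartElements(Path file, boolean gzipped)
            throws IOException, XMLStreamException {
        try (InputStream raw = Files.newInputStream(file);
             InputStream in = gzipped ? new GZIPInputStream(raw) : raw) {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            int count = 0;
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        count++; // slow per-object work would go here
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        }
    }
}
```

In download(), this would mean calling spoolToTempFile(getMethod.getResponseBodyAsStream()) inside the try block, letting the finally block release the connection, and then parsing the file, passing the result of the Content-Encoding check as the gzipped flag. If the server still resets the connection during the copy, only the download would need to be retried, not the processing.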