I'm doing multiple parallel HTTP range requests and want to calculate the MD5 sum of each response using DigestInputStream. I also want to write the data from the HTTP stream to a file without creating intermediate files. Therefore I'm using FileChannel to access regions of a file. This is basically a download manager application.
Saving the HTTP stream to the file is working, but if I try to use the DigestInputStream to calculate the MD5 sum on the fly, it seems the DigestInputStream is never read. I'm probably missing some important part of how FileChannel uses the InputStream, and I hope this can be easily fixed.
I'd also be glad for suggestions for optimizations to achieve the goal outlined above.
Here's the class implementing the download tasks:
    private class MultiHttpClientConnThread extends Thread {

        private final Logger logger = Logger.getLogger(getClass());
        private final CloseableHttpClient client;
        private final HttpGet get;
        private final File destinationFile;
        // Not final: assigned in run() once the part has been downloaded.
        private volatile String md5sum;

        public MultiHttpClientConnThread(final CloseableHttpClient client, final HttpGet get, final File destinationFile) {
            this.client = client;
            this.get = get;
            this.destinationFile = destinationFile;
        }

        @Override
        public final void run() {
            logger.debug("Thread Running: " + getName());
            // try-with-resources closes the response, file and channel even if the transfer fails.
            try (CloseableHttpResponse response = client.execute(get);
                 RandomAccessFile randomAccessFile = new RandomAccessFile(destinationFile, "rw");
                 FileChannel fileChannel = randomAccessFile.getChannel()) {
                // Content-Range looks like "bytes 0-499/1234"; index 1 is the first byte of this part.
                String contentRange = response.getFirstHeader("Content-Range").getValue();
                long startByte = Long.parseLong(contentRange.split("[ -]")[1]);
                long length = response.getEntity().getContentLength();

                MessageDigest messageDigest = MessageDigest.getInstance("MD5");
                // Every byte the channel pulls through this stream also updates the digest.
                DigestInputStream digestInputStream = new DigestInputStream(response.getEntity().getContent(), messageDigest);
                ReadableByteChannel readableByteChannel = Channels.newChannel(digestInputStream);

                long transferred = fileChannel.transferFrom(readableByteChannel, startByte, length);
                if (transferred != length) {
                    logger.warn("Expected " + length + " bytes but transferred " + transferred);
                }
                md5sum = Hex.encodeHexString(messageDigest.digest());
                logger.info("Part MD5 sum: " + md5sum);
                logger.debug("Thread Finished: " + getName());
            } catch (final IOException | NoSuchAlgorithmException ex) {
                // ClientProtocolException is a subclass of IOException, so it is handled here too.
                logger.error("Download of part failed in " + getName(), ex);
            }
        }
    }
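To rule out the FileChannel itself, the same pattern can be checked without any HTTP involved. The following is only a minimal sketch: the class name DigestTransferDemo and the helper copyAndDigest are made up for illustration, and a ByteArrayInputStream stands in for the HTTP entity stream. It pre-sizes the file before transferring because, depending on the JDK version, transferFrom may transfer nothing when the position lies beyond the current file size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestTransferDemo {

    /** Writes up to length bytes from in into file at offset; returns the MD5 (hex) of the bytes read. */
    static String copyAndDigest(InputStream in, Path file, long offset, long length)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (DigestInputStream digestIn = new DigestInputStream(in, md5);
             ReadableByteChannel source = Channels.newChannel(digestIn);
             RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel target = raf.getChannel()) {
            // Assumption: make sure the target region exists before writing into it.
            if (raf.length() < offset + length) {
                raf.setLength(offset + length);
            }
            long done = 0;
            while (done < length) {
                long n = target.transferFrom(source, offset + done, length - done);
                if (n <= 0) {
                    break; // source exhausted before length bytes arrived
                }
                done += n;
            }
        }
        return toHex(md5.digest());
    }

    /** Lowercase hex encoding, so the sketch needs no external codec library. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "some part of a download".getBytes("UTF-8");
        Path tmp = Files.createTempFile("part", ".bin");
        String viaChannel = copyAndDigest(new ByteArrayInputStream(data), tmp, 16, data.length);
        String direct = toHex(MessageDigest.getInstance("MD5").digest(data));
        System.out.println(viaChannel.equals(direct) ? "digests match" : "digests differ");
        Files.delete(tmp);
    }
}
```

If the two digests match here, the stream wrapped by Channels.newChannel is being read through the DigestInputStream, and the input data is the next thing to suspect.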
Update
This is a bit embarrassing, as the code seems to be working fine. The problem was with the uploaded file I used for testing. As asked in "How to create a repeatable incompressible fast InputStream in Java?", I needed a repeatable random input stream, and the implementation I used unfortunately seems to repeat itself. Therefore all threads received the same data and produced the same MD5 sums, and those MD5 sums looked very similar (but not identical) to that of an empty file.
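For test data that is repeatable across runs but does not repeat within a file, a stream driven by a seeded java.util.Random is one option. This is only a sketch under my own assumptions: SeededRandomInputStream is a hypothetical name, not the implementation referred to above, and it makes no claim about speed or about how incompressible the output is.

```java
import java.io.InputStream;
import java.util.Objects;
import java.util.Random;

/** Produces a fixed number of pseudo-random bytes that depend only on the seed. */
class SeededRandomInputStream extends InputStream {

    private final Random random;
    private long remaining;

    SeededRandomInputStream(long seed, long length) {
        this.random = new Random(seed);
        this.remaining = length;
    }

    @Override
    public int read() {
        if (remaining <= 0) {
            return -1; // end of stream
        }
        remaining--;
        return random.nextInt(256);
    }

    @Override
    public int read(byte[] buf, int off, int len) {
        Objects.checkFromIndexSize(off, len, buf.length);
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1; // end of stream
        }
        int n = (int) Math.min(len, remaining);
        // One nextInt call per byte, so the byte sequence is the same
        // no matter how callers chunk their reads.
        for (int i = 0; i < n; i++) {
            buf[off + i] = (byte) random.nextInt(256);
        }
        remaining -= n;
        return n;
    }
}
```

Two streams created with the same seed and length yield identical bytes, so an upload can be regenerated later to verify the per-part MD5 sums.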