I was trying to compress a file using the code below. The compression works fine when the file is small (say 1 GB), but when the file is around 5 GB the program does not fail; instead it keeps running for two days without producing any result. Based on the INFO messages I get, it looks like a cluster issue, although I am not sure.
Following are the error I am getting and the code I am using.
Error:
Code I am using:
public void compressData(final String inputFilePath, final String outputPath)
        throws DataFabricAppendException {
    CompressionOutputStream compressionOutputStream = null;
    FSDataOutputStream fsDataOutputStream = null;
    FSDataInputStream fsDataInputStream = null;
    CompressionCodec compressionCodec = null;
    CompressionCodecFactory compressionCodecFactory = null;
    try {
        compressionCodecFactory = new CompressionCodecFactory(conf);
        final Path compressionFilePath = new Path(outputPath);
        fsDataOutputStream = fs.create(compressionFilePath);
        compressionCodec = compressionCodecFactory
                .getCodecByClassName(BZip2Codec.class.getName());
        compressionOutputStream = compressionCodec
                .createOutputStream(fsDataOutputStream);
        fsDataInputStream = new FSDataInputStream(fs.open(new Path(inputFilePath)));
        // Copy the uncompressed input into the compressing output stream.
        IOUtils.copyBytes(fsDataInputStream, compressionOutputStream, conf, false);
        compressionOutputStream.finish();
    } catch (IOException ex) {
        throw new DataFabricAppendException(
                "Error while compressing non-partitioned file : " + inputFilePath, ex);
    } catch (Exception ex) {
        throw new DataFabricAppendException(
                "Error while compressing non-partitioned file : " + inputFilePath, ex);
    } finally {
        // Close all streams; the compressed output stream is closed first.
        try {
            if (compressionOutputStream != null) {
                compressionOutputStream.close();
            }
            if (fsDataInputStream != null) {
                fsDataInputStream.close();
            }
            if (fsDataOutputStream != null) {
                fsDataOutputStream.close();
            }
        } catch (IOException e1) {
            LOG.warn("Could not close necessary objects");
        }
    }
}
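To narrow down where it hangs, I am thinking of replacing the IOUtils.copyBytes call with a manual copy loop that logs progress. This is only a rough sketch using the same streams and LOG as in the code above; the 64 KB buffer size and the 1 GB logging interval are arbitrary choices of mine:

// Sketch: manual copy loop with progress logging instead of IOUtils.copyBytes.
// Buffer size and logging interval are arbitrary assumptions.
final byte[] buffer = new byte[64 * 1024];
long totalBytes = 0;
long nextLog = 1L << 30; // log roughly every 1 GB
int bytesRead;
while ((bytesRead = fsDataInputStream.read(buffer)) != -1) {
    compressionOutputStream.write(buffer, 0, bytesRead);
    totalBytes += bytesRead;
    if (totalBytes >= nextLog) {
        LOG.info("Copied " + totalBytes + " bytes so far");
        nextLog += 1L << 30;
    }
}
compressionOutputStream.finish();

If the "Copied ... bytes" messages keep appearing, the copy itself is making progress and the problem is more likely on the cluster side; if they stop, the hang is in the read or write path.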