I am running into a strange problem: I cannot decompress, in Java, LZO data that was compressed with the Python lzo module, even though both seem to use the same native LZO codec implementation. In more detail, I am using the Python module from here:
https://github.com/jd-boyd/python-lzo
Compressing the single byte "a" with it yields:
import lzo
lzo.compress("a")
> '\xf0\x00\x00\x00\x01\x12a\x11\x00\x00'
Compressing the same byte "a" in Java using
https://github.com/twitter/hadoop-lzo
yields:
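import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import com.hadoop.compression.lzo.LzoCodec;

byte[] b = new byte[1];
b[0] = 'a';
ByteArrayInputStream inputByteStream = new ByteArrayInputStream(b);
ByteArrayOutputStream outputByteStream = new ByteArrayOutputStream();
LzoCodec lzoCodec = new LzoCodec();
Configuration conf = new Configuration();
lzoCodec.setConf(conf);
OutputStream outputStream = lzoCodec.createOutputStream(outputByteStream);
// Copy the input into the compressing stream one byte at a time
int data = inputByteStream.read();
while (data != -1) {
    outputStream.write(data);
    data = inputByteStream.read();
}
// close() finishes the block and flushes it to outputByteStream
outputStream.close();
// Hex-dump the compressed bytes
StringBuilder sb = new StringBuilder();
for (byte outByte : outputByteStream.toByteArray()) {
    sb.append(String.format("%02X ", outByte));
}
System.err.println(sb.toString());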
> 00 00 00 01 00 00 00 05 12 61 11 00 00
The trailing part looks similar, i.e. the bytes [ 11 00 00 ], but the header clearly differs. I made sure that both Python and Java are using LZO version 2.03, and the default compression strategy in both is LZO1X_1. Any help will be appreciated.
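To make the comparison concrete, here is a small Python 3 sketch that splits the two outputs shown above. The layouts are my guess from staring at the bytes, not something I have confirmed in either library: I am assuming the Python output is one marker byte plus a 4-byte big-endian uncompressed length, and the Java output is a 4-byte uncompressed length plus a 4-byte compressed length. The helper names `split_python_lzo` and `split_hadoop_block` are made up for this example.

```python
import struct

# Byte strings copied from the two outputs shown in this question.
python_out = b"\xf0\x00\x00\x00\x01\x12a\x11\x00\x00"
java_out = bytes.fromhex("00 00 00 01 00 00 00 05 12 61 11 00 00")


def split_python_lzo(buf):
    """ASSUMED layout: 1 marker byte, 4-byte big-endian raw length, payload."""
    marker, raw_len = struct.unpack(">BI", buf[:5])
    return marker, raw_len, buf[5:]


def split_hadoop_block(buf):
    """ASSUMED layout: 4-byte raw length, 4-byte compressed length, payload."""
    raw_len, comp_len = struct.unpack(">II", buf[:8])
    return raw_len, comp_len, buf[8:8 + comp_len]


marker, py_raw_len, py_payload = split_python_lzo(python_out)
java_raw_len, java_comp_len, java_payload = split_hadoop_block(java_out)

# Show each assumed header field next to the remaining payload bytes.
print(hex(marker), py_raw_len, py_payload.hex())
print(java_raw_len, java_comp_len, java_payload.hex())

# Compare only the payloads, ignoring the assumed headers.
print(py_payload == java_payload)
```

If that reading of the headers is right, the 5-byte payload `12 61 11 00 00` is the same on both sides and only the framing around it differs. I have only checked this against this one input.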