Wow, old question, but I stumbled upon this today.
Some libraries like zip4j can probably handle this, but since Java 11 you can get the job done with no external dependencies:
If you are interested only in compressing data, you can just do:
void compress(ByteBuffer src, ByteBuffer dst) {
    // "true" = raw deflate, no zlib header/trailer
    var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
        def.setInput(src);
        def.finish();
        def.deflate(dst, Deflater.SYNC_FLUSH);
        // finished() is the reliable check: the deflater may have consumed
        // all of src while still holding pending output that didn't fit in dst
        if (!def.finished()) {
            throw new RuntimeException("dst too small");
        }
    } finally {
        def.end();
    }
}
Both src and dst will have their positions advanced, so you may need to flip dst (and rewind src) after compress returns.
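For example, a self-contained sketch (the class name FlipDemo is mine, with the deflate call inlined so it runs standalone) showing the flip before consuming the output:

```java
import java.nio.ByteBuffer;
import java.util.zip.Deflater;

public class FlipDemo {

    // Returns the number of compressed bytes; demonstrates flipping dst after deflating.
    static int compressedSize(byte[] data) {
        var src = ByteBuffer.wrap(data);
        var dst = ByteBuffer.allocate(data.length + 64);
        var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            def.setInput(src);
            def.finish();
            def.deflate(dst, Deflater.SYNC_FLUSH);
        } finally {
            def.end();
        }
        // dst.position() now marks the end of the compressed data;
        // flip so position = 0 and limit = bytes written, ready to consume.
        dst.flip();
        return dst.remaining();
    }

    public static void main(String[] args) {
        System.out.println("compressed bytes: " + compressedSize("hello hello hello".getBytes()));
    }
}
```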
In order to recover compressed data:
void decompress(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
    // "true" = expect raw deflate data (no zlib header/trailer)
    var inf = new Inflater(true);
    try {
        inf.setInput(src);
        inf.inflate(dst);
        // as with compress, finished() is the reliable check: inflate may
        // stop with pending output when dst fills up
        if (!inf.finished()) {
            throw new RuntimeException("dst too small");
        }
    } finally {
        inf.end();
    }
}
Note that both methods expect (de-)compression to happen in a single pass; however, we could use slightly modified versions in order to stream it:
void compress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) {
    var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
        def.setInput(src);
        def.finish();
        int cmp;
        do {
            cmp = def.deflate(dst, Deflater.SYNC_FLUSH);
            if (cmp > 0) {
                sink.accept(dst.flip());
                dst.clear();
            }
        } while (cmp > 0);
    } finally {
        def.end();
    }
}
void decompress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) throws DataFormatException {
    var inf = new Inflater(true);
    try {
        inf.setInput(src);
        int dec;
        do {
            dec = inf.inflate(dst);
            if (dec > 0) {
                sink.accept(dst.flip());
                dst.clear();
            }
        } while (dec > 0);
    } finally {
        inf.end();
    }
}
Example:
void compressLargeFile() throws IOException {
    try (var in = FileChannel.open(Paths.get("large"));
         var out = FileChannel.open(Paths.get("large.deflate"),
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
        var temp = ByteBuffer.allocateDirect(1024 * 1024);
        var start = 0L;
        var rem = in.size();
        while (rem > 0) {
            var mapped = Math.min(16 * 1024 * 1024, rem);
            var src = in.map(MapMode.READ_ONLY, start, mapped);
            // note: each mapped chunk becomes its own finished deflate stream
            compress(src, temp, bb -> {
                try {
                    out.write(bb);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            start += mapped;
            rem -= mapped;
        }
    }
}
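To see the streaming decompress in action without touching the filesystem, here is a self-contained sketch (the class name StreamDemo and the helper roundTrip are mine): it compresses in memory, then inflates through a deliberately tiny 16-byte dst, forcing several sink callbacks:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.function.Consumer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class StreamDemo {

    // streaming decompress: flushes dst to the sink every time data arrives
    static void decompress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) throws DataFormatException {
        var inf = new Inflater(true);
        try {
            inf.setInput(src);
            int dec;
            do {
                dec = inf.inflate(dst);
                if (dec > 0) {
                    sink.accept(dst.flip());
                    dst.clear();
                }
            } while (dec > 0);
        } finally {
            inf.end();
        }
    }

    static byte[] roundTrip(byte[] data) throws DataFormatException {
        // compress in one shot (raw deflate)
        var compressed = ByteBuffer.allocate(data.length + 64);
        var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            def.setInput(ByteBuffer.wrap(data));
            def.finish();
            def.deflate(compressed, Deflater.SYNC_FLUSH);
        } finally {
            def.end();
        }
        compressed.flip();

        // decompress through a tiny dst, collecting chunks from the sink
        var out = new ByteArrayOutputStream();
        decompress(compressed, ByteBuffer.allocate(16), bb -> {
            var chunk = new byte[bb.remaining()];
            bb.get(chunk);
            out.write(chunk, 0, chunk.length);
        });
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        var data = "the quick brown fox jumps over the lazy dog, twice over".getBytes();
        System.out.println(java.util.Arrays.equals(data, roundTrip(data)));
    }
}
```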
If you want fully gzip-compliant data (note: what follows is the gzip format, not the zip container format):
void zip(ByteBuffer src, ByteBuffer dst) {
    var u = src.remaining(); // uncompressed size, needed for the trailer
    var crc = new CRC32();
    crc.update(src.duplicate()); // duplicate so src is not consumed yet
    writeHeader(dst);
    compress(src, dst);
    writeTrailer(crc, u, dst);
}
Where:
void writeHeader(ByteBuffer dst) {
    // gzip magic (0x1f, 0x8b), CM = deflate, then FLG, MTIME (4 bytes), XFL, OS, all zero
    var header = new byte[] { 0x1f, (byte) 0x8b, Deflater.DEFLATED, 0, 0, 0, 0, 0, 0, 0 };
    dst.put(header);
}
And:
void writeTrailer(CRC32 crc, int uncompressed, ByteBuffer dst) {
    // gzip trailer: CRC-32 and uncompressed size, both little-endian
    if (dst.order() == ByteOrder.LITTLE_ENDIAN) {
        dst.putInt((int) crc.getValue());
        dst.putInt(uncompressed);
    } else {
        dst.putInt(Integer.reverseBytes((int) crc.getValue()));
        dst.putInt(Integer.reverseBytes(uncompressed));
    }
}
So, gzip imposes 10 + 8 bytes of overhead.
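You can sanity-check that layout against the JDK's own GZIPOutputStream: the first bytes are the header (magic 0x1f 0x8b, then CM = 8 for deflate) and the last 4 bytes of the trailer hold the uncompressed length, little-endian. A small sketch (the class name GzipLayout is mine):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class GzipLayout {

    static byte[] gzip(byte[] data) throws IOException {
        var baos = new ByteArrayOutputStream();
        try (var g = new GZIPOutputStream(baos)) {
            g.write(data);
        }
        return baos.toByteArray();
    }

    // reads the little-endian uncompressed length from the last 4 trailer bytes
    static long trailerLength(byte[] gz) {
        int n = gz.length;
        return (gz[n - 4] & 0xFFL)
                | (gz[n - 3] & 0xFFL) << 8
                | (gz[n - 2] & 0xFFL) << 16
                | (gz[n - 1] & 0xFFL) << 24;
    }

    public static void main(String[] args) throws IOException {
        var data = "some payload".getBytes();
        var gz = gzip(data);
        System.out.println((gz[0] & 0xFF) == 0x1f && (gz[1] & 0xFF) == 0x8b); // gzip magic
        System.out.println(gz[2] == 8);                                       // CM = deflate
        System.out.println(trailerLength(gz) == data.length);                 // ISIZE field
    }
}
```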
In order to unzip a direct buffer into another, you can wrap the src buffer into an InputStream:
class ByteBufferInputStream extends InputStream {

    final ByteBuffer bb;

    public ByteBufferInputStream(ByteBuffer bb) {
        this.bb = bb;
    }

    @Override
    public int available() throws IOException {
        return bb.remaining();
    }

    @Override
    public int read() throws IOException {
        return bb.hasRemaining() ? bb.get() & 0xFF : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        var rem = bb.remaining();
        if (rem == 0) {
            return -1;
        }
        len = Math.min(rem, len);
        bb.get(b, off, len);
        return len;
    }

    @Override
    public long skip(long n) throws IOException {
        if (n <= 0) {
            return 0; // per the InputStream contract, skip nothing for n <= 0
        }
        var rem = bb.remaining();
        if (n > rem) {
            bb.position(bb.limit());
            n = rem;
        } else {
            bb.position((int) (bb.position() + n));
        }
        return n;
    }
}
and use:
void unzip(ByteBuffer src, ByteBuffer dst) throws IOException {
    try (var is = new ByteBufferInputStream(src); var gis = new GZIPInputStream(is)) {
        var tmp = new byte[1024];
        int r;
        while ((r = gis.read(tmp)) > 0) {
            dst.put(tmp, 0, r);
        }
    }
}
Of course, this is not ideal since we are copying data through a temporary array; nevertheless, it serves as a round-trip check, proving that the nio-based gzip encoding writes valid data that standard io-based consumers can read.
So, if we are willing to skip the CRC consistency check, we can simply drop the header and trailer:
void unzipNoCheck(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
    // skip the 10-byte gzip header and 8-byte trailer; what remains is raw deflate
    src.position(src.position() + 10).limit(src.limit() - 8);
    decompress(src, dst);
}
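As a cross-check of those offsets (the class name GzipStrip is mine): gzip a payload with the JDK's GZIPOutputStream, chop the 10-byte header and 8-byte trailer, and a "nowrap" Inflater decodes the raw deflate data in the middle:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

public class GzipStrip {

    static byte[] stripAndInflate(byte[] data) throws IOException, DataFormatException {
        var baos = new ByteArrayOutputStream();
        try (var g = new GZIPOutputStream(baos)) {
            g.write(data);
        }
        var gz = baos.toByteArray();

        // drop the 10-byte gzip header and 8-byte trailer: raw deflate remains
        var src = ByteBuffer.wrap(gz, 10, gz.length - 18);

        var dst = ByteBuffer.allocate(data.length + 1);
        var inf = new Inflater(true); // nowrap: expects raw deflate
        try {
            inf.setInput(src);
            inf.inflate(dst);
        } finally {
            inf.end();
        }
        dst.flip();
        var out = new byte[dst.remaining()];
        dst.get(out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        var data = "check check check".getBytes();
        System.out.println(Arrays.equals(data, stripAndInflate(data)));
    }
}
```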