18

We are working on a program in which we need to flush a GZIPOutputStream, that is, force it to compress and send the data written so far. The problem is that the flush method of GZIPOutputStream doesn't do this; instead, the stream waits for more data so that it can compress more efficiently.

When you call finish, the data is compressed and sent over the output stream, but the GZIPOutputStream (not the underlying stream) is then closed, so we can't write any more data until we create a new GZIPOutputStream, which costs time and performance.

I hope someone can help with this.

Best regards.

Audrius Meškauskas
Hemeroc

6 Answers

12

I haven't tried this yet, and this advice won't be useful until we have Java 7 in hand, but according to the documentation, GZIPOutputStream's flush() method (inherited from DeflaterOutputStream) relies upon the flush mode specified at construction time via the syncFlush argument (related to Deflater#SYNC_FLUSH) to decide whether to flush the pending data to be compressed. GZIPOutputStream also accepts this syncFlush argument at construction time.

It sounds like you want to use either Deflater#SYNC_FLUSH or maybe even Deflater#FULL_FLUSH, but, before digging down that far, first try working with the two-argument or the three-argument GZIPOutputStream constructor and pass true for the syncFlush argument. That will activate the flushing behavior you desire.
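
To make the effect of the syncFlush argument concrete, here is a minimal sketch, assuming Java 7 or later. It counts how many bytes have reached the underlying stream after write() and flush() but before close(). The class and method names (SyncFlushDemo, bytesVisibleAfterFlush) are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class SyncFlushDemo {

    /** Returns how many bytes have reached the sink after write() + flush(), before close(). */
    public static int bytesVisibleAfterFlush(boolean syncFlush) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The two-argument constructor (Java 7+) takes the syncFlush flag.
        GZIPOutputStream gzip = new GZIPOutputStream(sink, syncFlush);
        gzip.write("hello, flush".getBytes(StandardCharsets.UTF_8));
        gzip.flush();
        // Measure before close(), because close() always emits the remaining data.
        int visible = sink.size();
        gzip.close();
        return visible;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("syncFlush=false: " + bytesVisibleAfterFlush(false) + " bytes");
        System.out.println("syncFlush=true:  " + bytesVisibleAfterFlush(true) + " bytes");
    }
}
```

With syncFlush set to false, only the GZIP header should have been written at the point of measurement; with true, the compressed payload should be there as well.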

seh
  • Hi, your answer is great if you are working with Java 7, which is not released at the moment. I'm working with Java 6 (as most users do). – Hemeroc Sep 03 '10 at 23:30
  • Oh, I'm sorry about that. You're right: these signatures are not yet available in Java 6. That serves me right for reading "the latest" documentation. We'll have to wait for these to arrive. – seh Sep 03 '10 at 23:59
11

The other answer didn't work for me. The stream still refused to flush because the native code that GZIPOutputStream uses holds onto the data.

Thankfully, I discovered that someone has implemented a FlushableGZIPOutputStream as part of the Apache Tomcat project. Here is the magic part:

@Override
public synchronized void flush() throws IOException {
    if (hasLastByte) {
        // - do not allow the gzip header to be flushed on its own
        // - do not do anything if there is no data to send

        // trick the deflater to flush
        /**
         * Now this is tricky: We force the Deflater to flush its data by
         * switching compression level. As yet, a perplexingly simple workaround
         * for
         * http://developer.java.sun.com/developer/bugParade/bugs/4255743.html
         */
        if (!def.finished()) {
            def.setLevel(Deflater.NO_COMPRESSION);
            flushLastByte();
            flagReenableCompression = true;
        }
    }
    out.flush();
}

You can find the entire class in this jar (if you use Maven):

<dependency>
    <groupId>org.apache.tomcat</groupId>
    <artifactId>tomcat-coyote</artifactId>
    <version>7.0.8</version>
</dependency>

Or just go and grab the source code FlushableGZIPOutputStream.java

It's released under the Apache-2.0 license.

Michael
Ben L.
3

This code is working great for me in my application.

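import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.GZIPOutputStream;

public class StreamingGZIPOutputStream extends GZIPOutputStream {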

    public StreamingGZIPOutputStream(OutputStream out) throws IOException {
        super(out);
    }

    @Override
    protected void deflate() throws IOException {
        // SYNC_FLUSH is the key here, because it causes writing to the output
        // stream in a streaming manner instead of waiting until the entire
        // contents of the response are known.  for a large 1 MB json example
        // this took the size from around 48k to around 50k, so the benefits
        // of sending data to the client sooner seem to far outweigh the
        // added data sent due to less efficient compression
        int len = def.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH);
        if (len > 0) {
            out.write(buf, 0, len);
        }
    }

}
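
The size trade-off mentioned in the comment above can be checked with a rough sketch like the following. Note that it does not use the subclass; it assumes Java 7 or later and approximates the same effect by calling flush() after every chunk on a GZIPOutputStream constructed with syncFlush set to true. The class name, method name, and sample JSON lines are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class FlushOverheadDemo {

    /** Compresses a number of small JSON-like lines, optionally sync-flushing after each one. */
    public static int compressedSize(int chunks, boolean flushEachChunk) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink, true)) {
            for (int i = 0; i < chunks; i++) {
                String line = "{\"id\":" + i + ",\"name\":\"item-" + i + "\"}\n";
                gzip.write(line.getBytes(StandardCharsets.UTF_8));
                if (flushEachChunk) {
                    // Each sync flush ends the current deflate block and emits it.
                    gzip.flush();
                }
            }
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("no flushes:       " + compressedSize(1000, false) + " bytes");
        System.out.println("flush every line: " + compressedSize(1000, true) + " bytes");
    }
}
```

How much larger the flushed output gets depends heavily on chunk size: flushing after every tiny write costs far more, relatively, than flushing after every few kilobytes.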
Matt Sgarlata
  • I had the exact same issue, and this solved my problem nicely! (the streaming to a client is better to start sooner) – Tony Nov 01 '16 at 14:22
  • How does this differ from setting `syncFlush` to true in the `GZIPOutputStream` constructor? – heez Sep 04 '18 at 17:36
  • That constructor requires Java 7. This might be good for older versions. – ojchase May 03 '22 at 21:42
1

The same problem exists on Android. The accepted answer doesn't work there because def.setLevel(Deflater.NO_COMPRESSION); throws an exception. Judging by the flush method, it changes the compression level of the Deflater, so I suppose the level change needs to happen before any data is written, but I'm not sure.

There are two other options:

  • if the API level of your app is higher than 19, you can try the constructor with the syncFlush parameter
  • the other solution is to use jzlib.
eleven
1

Bug ID 4813885 covers this issue. The comment by "DamonHD", submitted on 9 Sep 2006 (about halfway through the bug report), contains an example of a FlushableGZIPOutputStream, which he built on top of Jazzlib's net.sf.jazzlib.DeflaterOutputStream.

For reference, here's a (reformatted) extract:

/**
 * Substitute for GZIPOutputStream that maximises compression and has a usable
 * flush(). This is also more careful about its output writes for efficiency,
 * and indeed buffers them to minimise the number of write()s downstream which
 * is especially useful where each write() has a cost such as an OS call, a disc
 * write, or a network packet.
 */
public class FlushableGZIPOutputStream extends net.sf.jazzlib.DeflaterOutputStream {
    private final CRC32 crc = new CRC32();
    private final static int GZIP_MAGIC = 0x8b1f;
    private final OutputStream os;

    /** Set when input has arrived and not yet been compressed and flushed downstream. */
    private boolean somethingWritten;

    public FlushableGZIPOutputStream(final OutputStream os) throws IOException {
        this(os, 8192);
    }

    public FlushableGZIPOutputStream(final OutputStream os, final int bufsize) throws IOException {
        super(new FilterOutputStream(new BufferedOutputStream(os, bufsize)) {
            /** Suppress inappropriate/inefficient flush()es by DeflaterOutputStream. */
            @Override
            public void flush() {
            }
        }, new net.sf.jazzlib.Deflater(net.sf.jazzlib.Deflater.BEST_COMPRESSION, true));
        this.os = os;
        writeHeader();
        crc.reset();
    }

    public synchronized void write(byte[] buf, int off, int len) throws IOException {
        somethingWritten = true;
        super.write(buf, off, len);
        crc.update(buf, off, len);
    }

    /**
     * Flush any accumulated input downstream in compressed form. We overcome
     * some bugs/misfeatures here so that:
     * <ul>
     * <li>We won't allow the GZIP header to be flushed on its own without real compressed
     * data in the same write downstream. 
     * <li>We ensure that any accumulated uncompressed data really is forced through the 
     * compressor.
     * <li>We prevent spurious empty compressed blocks being produced from successive 
     * flush()es with no intervening new data.
     * </ul>
     */
    @Override
    public synchronized void flush() throws IOException {
        if (!somethingWritten) { return; }

        // We call this to get def.flush() called,
        // but suppress the (usually premature) out.flush() called internally.
        super.flush();

        // Since super.flush() seems to fail to reliably force output, 
        // possibly due to over-cautious def.needsInput() guard following def.flush(),
        // we try to force the issue here by bypassing the guard.
        int len;
        while((len = def.deflate(buf, 0, buf.length)) > 0) {
            out.write(buf, 0, len);
        }

        // Really flush the stream below us...
        os.flush();

        // Further flush()es are ignored until more input data is written.
        somethingWritten = false;
    }

    public synchronized void close() throws IOException {
        if (!def.finished()) {
            def.finish();
            do {
                int len = def.deflate(buf, 0, buf.length);
                if (len <= 0) { 
                    break;
                }
                out.write(buf, 0, len);
            } while (!def.finished());
        }

        // Write trailer
        out.write(generateTrailer());

        out.close();
    }

    // ...
}

You may find it useful.

BalusC
0

As @seh said, this works great:

ByteArrayOutputStream stream = new ByteArrayOutputStream();

// the second parameter (syncFlush) needs to be true
GZIPOutputStream gzip = new GZIPOutputStream(stream, true);
gzip.write( .. );
gzip.flush();

...
gzip.close();
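
As a rough check that a flushed stream is usable on the receiving side, the sketch below (assuming Java 7 or later) decompresses the bytes that are available after flush() but before close(). The names FlushRoundTripDemo and readBackBeforeClose are invented for this example, and the in-memory streams stand in for a real network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class FlushRoundTripDemo {

    /** Writes and flushes a message, then decodes it from the bytes emitted so far. */
    public static String readBackBeforeClose(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(sink, true);
        gzip.write(payload);
        gzip.flush();

        // Snapshot of what a receiver would have at this point (no GZIP trailer yet).
        byte[] partial = sink.toByteArray();

        byte[] decoded = new byte[payload.length];
        int total = 0;
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(partial))) {
            // Read only as many bytes as were written; reading past that point
            // would hit the end of the incomplete stream.
            while (total < decoded.length) {
                int n = in.read(decoded, total, decoded.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
        }
        gzip.close();
        return new String(decoded, 0, total, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readBackBeforeClose("hello"));
    }
}
```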
kissLife