34

I am trying to write a function that accepts an InputStream with zipped file data and returns another InputStream with the unzipped data.

The zipped file will only contain a single file, so there is no need to create directories, etc.

I tried looking at ZipInputStream and others but I am confused by so many different types of streams in Java.

Baishampayan Ghose

4 Answers

49

Concepts

GZIPInputStream is for streams (or files) compressed with gzip (".gz" extension). A gzip stream holds a single compressed payload; it carries no archive directory or per-file entry headers.

This class implements a stream filter for reading compressed data in the GZIP file format
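As a rough illustration of the gzip case, here is a minimal in-memory round trip. The class and method names (GzipSketch, gzip, gunzip) are made up for this sketch; only the java.util.zip classes come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; the names are illustrative only.
public class GzipSketch {

    // Compress bytes into gzip format, entirely in memory.
    public static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    // Wrap a gzip-compressed stream; the returned stream yields the uncompressed bytes.
    public static InputStream gunzip(InputStream compressed) throws IOException {
        return new GZIPInputStream(compressed);
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = gzip("hello gzip".getBytes(StandardCharsets.UTF_8));
        try (InputStream in = gunzip(new ByteArrayInputStream(packed))) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
            // prints "hello gzip"
            System.out.println(new String(plain.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

Note that there is no notion of an entry here: the whole stream is the one file.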

If you have a real zip file, you have to use ZipFile to open the file, ask for the list of files (one in your example) and ask for the decompressed input stream.

Your method, if you have the file, would be something like:

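import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// A sketch, to be placed inside a class of your own
private InputStream extractOnlyFile(String path) throws IOException {
   ZipFile zf = new ZipFile(path);
   Enumeration<? extends ZipEntry> e = zf.entries();
   ZipEntry entry = e.nextElement(); // your only file
   // zf must stay open while the caller reads: closing zf also closes this stream
   return zf.getInputStream(entry);
}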
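Because closing a ZipFile also closes every stream obtained from it, one alternative is to read the entry fully while the ZipFile is still open. The following is only a sketch under that assumption; ZipFileSketch and readOnlyEntry are hypothetical names, and the temporary file exists purely for the demonstration.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; the names are illustrative only.
public class ZipFileSketch {

    // Reads the first entry of the zip file at 'path' fully into memory.
    // Assumes the archive contains at least one entry.
    public static byte[] readOnlyEntry(String path) throws IOException {
        try (ZipFile zf = new ZipFile(path)) {
            ZipEntry entry = zf.entries().nextElement();
            try (InputStream in = zf.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                return out.toByteArray();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a one-entry zip file on disk so the example is self-contained.
        File tmp = File.createTempFile("sketch", ".zip");
        tmp.deleteOnExit();
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(tmp))) {
            zos.putNextEntry(new ZipEntry("only.txt"));
            zos.write("hello zip".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        // prints "hello zip"
        System.out.println(new String(readOnlyEntry(tmp.getPath()), StandardCharsets.UTF_8));
    }
}
```

The trade-off is memory: the whole entry is buffered, which is fine for small files but not for large ones.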

Reading an InputStream with the content of a .zip file

Ok, if you have an InputStream you can use (as @cletus says) ZipInputStream. It reads a stream including header data.

ZipInputStream is for a stream with [header information + zipped data]

Important: if you have the file on your PC, you can use the ZipFile class for random access to its entries

This is a sample of reading a zip-file through an InputStream:

import java.io.FileInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;


public class Main {
    public static void main(String[] args) throws Exception
    {
        // this is where you start, with an InputStream containing the bytes from the zip file
        try (ZipInputStream zis = new ZipInputStream(new FileInputStream("c:/inas400.zip")))
        {
            ZipEntry entry;
            byte[] buffer = new byte[8192];
            // while there are entries I process them
            while ((entry = zis.getNextEntry()) != null)
            {
                // getSize() may be -1 when the size is not stored before the entry data
                System.out.println("entry: " + entry.getName() + ", " + entry.getSize());
                // consume all the data from this entry; read returns -1 at the end of the entry
                while (zis.read(buffer) != -1) { }
                // I could close the entry, but getNextEntry does it automatically
                // zis.closeEntry()
            }
        }
    }
}
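For the single-entry case in the question, the method can be sketched roughly as follows: position the ZipInputStream on the first entry and return it, since from then on it behaves as the uncompressed stream of that entry. UnzipSketch and unzipSingleEntry are hypothetical names, and throwing an IOException when no entry is found is a design choice of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; the names are illustrative only.
public class UnzipSketch {

    // Returns a stream of the uncompressed data of the first entry.
    // The caller closes the returned stream, which also closes 'zipped'.
    public static InputStream unzipSingleEntry(InputStream zipped) throws IOException {
        ZipInputStream zis = new ZipInputStream(zipped);
        if (zis.getNextEntry() == null) {
            zis.close();
            throw new IOException("no entry found in zip stream");
        }
        return zis; // read() now returns -1 at the end of the first entry
    }

    public static void main(String[] args) throws IOException {
        // Build a one-entry zip archive in memory so the example is self-contained.
        ByteArrayOutputStream zipBytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(zipBytes)) {
            zos.putNextEntry(new ZipEntry("only.txt"));
            zos.write("hello zip".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        try (InputStream in = unzipSingleEntry(new ByteArrayInputStream(zipBytes.toByteArray()))) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
            // prints "hello zip"
            System.out.println(new String(plain.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

Nothing is buffered up front here; the data is decompressed as the caller reads, so this also works for a stream coming straight from a download.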
splash
helios
  • I corrected the code, the ZipInputStream had to wrap the original input stream :). Thanx! – helios Jan 14 '10 at 09:31
  • Helios: zipinput.getNextEntry() will return a ZipEntry object. How do I convert it into a stream? – Baishampayan Ghose Jan 14 '10 at 09:48
  • zipinputstream represents an inputstream of the unzipped data of the file. That's why I'm returning "zipinput". But it has to read the headers and position at the beginning of the current zipped data to start. That's why I first call "getnextentry". To make the zipinputstream read that header and prepare to unzip its entry (and of course, to know the zipped filename :). – helios Jan 14 '10 at 10:00
  • 1
    Helios: Thanks for your input so far. I have a question, when you just do a `zis.read()` where does the data go? My zip file will contain only one file in it and I just want to return a stream of the uncompressed file data. – Baishampayan Ghose Jan 14 '10 at 10:11
  • Oh, ok. zis.read() (like any InputStream.read) returns one byte and moves forward. The other read functions work the same way, reading more bytes at once. In your case you only have to: 1) get the first entry (it is... don't use a while loop) 2) return the very "zis" object, because it IS the uncompressed input stream you need. The code that works for you is the second block (the first EDIT) – helios Jan 14 '10 at 10:28
  • Hi. Your solution is too complex and confuses people. See this answer and edit your post please: http://stackoverflow.com/questions/3233555/is-it-possible-to-get-a-zipentrys-inputstream-from-a-zipinputstream/3233600#3233600 – woky Sep 29 '11 at 06:25
  • I've cleaned up the answer a little bit. I separated the "concept" part (the OP asks what the different classes are for) from the "read from a zip-file inputstream" part (the OP needs to read a stream containing not just gzipped data but the whole content of a zip-file). – helios Sep 29 '11 at 10:50
6

If you can change the input data, I would suggest using GZIPInputStream.

GZIPInputStream is different from ZipInputStream because a gzip stream holds only one piece of data, so the whole input stream represents the whole file. With ZipInputStream, the stream also contains the structure of the file(s) inside it, of which there can be many.

nanda
  • 1
    The file is not in my control. It's a file that I download from a server. I used to save it to disk and then unzip it, but now I am thinking about unzipping it in memory. – Baishampayan Ghose Jan 14 '10 at 10:21
  • What matters isn't really whether the bytes originate from a network socket or from a file. The distinction to be made is between a zip archive and a blob of compressed data. If you both wrote and read the data, perhaps you wouldn't really care about the archive with its metadata, and then GZIPInputStream would be the one to go for. You are clearly receiving an archive (or else saving it to a file and unzipping it would probably fail, at least if you unzip by running a "standard" unzip program). You can indeed unzip it in memory, using ZipInputStream. – The Dag Jan 23 '14 at 14:41
6

Here it is in Scala syntax:

def unzipByteArray(input: Array[Byte]): String = {
    val zipInputStream = new ZipInputStream(new ByteArrayInputStream(input))
    val entry = zipInputStream.getNextEntry
    IOUtils.toString(zipInputStream, StandardCharsets.UTF_8)
}
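For readers who want the same idea in plain Java, a rough equivalent of the Scala snippet above might look like the following. It uses only the JDK instead of IOUtils and closes the stream itself; UnzipToStringSketch is a hypothetical name, and decoding as UTF-8 is carried over from the Scala version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; the names are illustrative only.
public class UnzipToStringSketch {

    // Decompresses the first entry of an in-memory zip archive into a UTF-8 string.
    public static String unzipByteArray(byte[] input) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(input))) {
            if (zis.getNextEntry() == null) {
                throw new IOException("no entry found in zip data");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = zis.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a one-entry zip archive in memory so the example is self-contained.
        ByteArrayOutputStream zipBytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(zipBytes)) {
            zos.putNextEntry(new ZipEntry("only.txt"));
            zos.write("hello zip".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        // prints "hello zip"
        System.out.println(unzipByteArray(zipBytes.toByteArray()));
    }
}
```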
pvorb
Roman Kazanovskyi
  • This scala code is helpful to unzip java.io.InputStream but it isn't using the defined 'entry' to get the next file nor does it have a close method at the end? – puligun Apr 07 '21 at 00:42
  • @puligun Yes. You are right and it is just an answer to the question. Surely, we have to close the stream later. In other words, it is a method that you can use inside your needed case. – Roman Kazanovskyi Apr 08 '21 at 12:18
2

Unless I'm missing something, you should absolutely try and get ZipInputStream to work and there's no reason it shouldn't (I've certainly used it on several occasions).

If you can't get it to work, post the code and we'll help you with whatever problems you're having.

Whatever you do though, don't try and reinvent its functionality.

LadyCailin
cletus
  • 9
    To be fair, `java.util.zip` is a pretty unpleasant API – skaffman Jan 14 '10 at 09:43
  • @skaffman yeah, you'd think they'd have a `ZipFile.unzip(destDir)` method, huh? Or a way to find entries by name/pattern easily. This API is pretty gross. – Josh M. Jan 13 '22 at 13:10