
I have a problem with GZIP in Java. I currently work with gzipped files, one file per gzip archive. If I decompress them manually and then parse them, everything works. But I want to automate this in Java with GZIPInputStream, and that doesn't work. I need to end up with a DataInputStream. My code is:

    byte[] bytesArray = Files.readAllBytes(baseFile.toPath());

    try {
        reader = new DataInputStream(new GZIPInputStream(new ByteArrayInputStream(bytesArray)));
        System.out.println("gzip");
    } catch (ZipException notZip) {
        reader = new DataInputStream(new ByteArrayInputStream(bytesArray));
        System.out.println("no gzip");
    }

I also tried new GZIPInputStream(new FileInputStream(baseFile)); the result is the same. From the output I can see that the GZIP stream is created without an exception, but later I get invalid data from the DataInputStream. Please help :)

kapodes

2 Answers


I ran the following code without problems:

public static void main(String[] args) throws IOException {
    byte[] originalBytesArray = Files.readAllBytes(new File("OrdLog.BR-1.17.2016-09-12.bin").toPath());
    byte[] bytesArray = Files.readAllBytes(new File("OrdLog.BR-1.17.2016-09-12.bin.gz").toPath());
    DataInputStream reader = null;
    try {
        reader = new DataInputStream(new GZIPInputStream(new ByteArrayInputStream(bytesArray)));
        System.out.println("gzip");
    } catch (ZipException notZip) {
        reader = new DataInputStream(new ByteArrayInputStream(bytesArray));
        System.out.println("no gzip");
    }
    byte[] uncompressedBytesArray = new byte[originalBytesArray.length];
    reader.readFully(uncompressedBytesArray);
    reader.close();
    boolean filesDiffer = false;
    for (int i = 0; i < uncompressedBytesArray.length; i++) {
        if (originalBytesArray[i] != uncompressedBytesArray[i]) {
            filesDiffer = true;
        }
    }
    System.out.println("Files differ: " + filesDiffer);
}

It reads the gzip file and the uncompressed file and compares their contents. It prints Files differ: false. If it doesn't for your files, then the files are not the same.

Guenther
  • My problem is that I use the .readByte() method, and it seems to read different data than when I use the uncompressed source. Can you test this method and compare it with the original file? – kapodes Sep 15 '16 at 08:03
  • I ran your test and got: gzip, Files differ: true. 7zip uncompresses the file without problems and says that it is a gzip archive. And I don't get an exception. – kapodes Sep 15 '16 at 08:26
  • I was going to ask for the file :-) Thx for providing it. I made a mistake when reading the compressed file. I changed it to use readFully to make the code easier. It doesn't show any difference. – Guenther Sep 15 '16 at 08:43
  • Can you test .readByte() specifically while I try your method? :) – kapodes Sep 15 '16 at 10:58
  • It's better for me to use it as a stream, not as an array, for later parsing. – kapodes Sep 15 '16 at 11:04
  • I checked with reader.readByte() != originalBytesArray[i] and it went well. Very strange, but thanks for your help! – kapodes Sep 15 '16 at 11:08
  • Your method has problems when you don't know the original file size. It turns out that gzip doesn't provide it in a simple way, and gzipstream.available() returns only 1 if there is data and 0 if not, so you can't prepare an array for .readFully() :( – kapodes Sep 15 '16 at 11:47
  • I only used readFully to simplify the code. See http://stackoverflow.com/questions/1264709/convert-inputstream-to-byte-array-in-java for how to read the input stream into a byte array. – Guenther Sep 15 '16 at 11:55
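
The linked question covers reading a stream of unknown length. A minimal sketch of that idea applied to gzip is below, assuming the whole decompressed content fits in memory. The class `GzipReadAll` and its helpers `readAll`, `gunzip`, and `gzip` are hypothetical names used only for illustration. It loops on `read()` until it returns -1 instead of relying on `available()` or on a size known up front.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipReadAll {

    // Hypothetical helper: drains any InputStream into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // read() returns -1 only at the real end of the stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Hypothetical helper: decompresses an in-memory gzip archive.
    static byte[] gunzip(byte[] gzipped) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            return readAll(in);
        }
    }

    // Hypothetical helper: compresses bytes so the demo needs no file on disk.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[100_000];
        Arrays.fill(original, (byte) 7);
        byte[] restored = gunzip(gzip(original));
        System.out.println("Round trip matches: " + Arrays.equals(original, restored));
    }
}
```

The resulting array can be wrapped as `new DataInputStream(new ByteArrayInputStream(bytes))` if a DataInputStream is needed afterwards.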

My final solution:

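    // Opening the GZIPInputStream first means a non-gzip file throws ZipException
    // before its last 4 bytes are misread as a size.
    try (DataInputStream gzipIn = new DataInputStream(new GZIPInputStream(new FileInputStream(baseFile)))) {
        byte[] uncompressedBytes = new byte[getUncompressedFileSize()];
        gzipIn.readFully(uncompressedBytes);
        reader = new DataInputStream(new ByteArrayInputStream(uncompressedBytes));
    } catch (ZipException notZip) {
        // Not a gzip archive: use the file contents as they are.
        byte[] bytesArray = Files.readAllBytes(baseFile.toPath());
        reader = new DataInputStream(new ByteArrayInputStream(bytesArray));
    }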

private int getUncompressedFileSize() throws IOException {
    // The last 4 bytes of a gzip file hold the uncompressed size modulo 2^32,
    // stored little-endian, so the result is only valid for originals under 2 GB.
    try (RandomAccessFile raf = new RandomAccessFile(baseFile, "r")) {
        raf.seek(raf.length() - 4);
        int b4 = raf.read();
        int b3 = raf.read();
        int b2 = raf.read();
        int b1 = raf.read();
        return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
    }
}
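
If the size trailer turns out to be unreliable (for example for originals over 2 GB, or for several gzip members concatenated in one file), one possible alternative is to skip the size entirely: decide by the two gzip magic bytes `0x1f 0x8b` and then read as a stream. This is only a sketch under that assumption; `MaybeGzip` and `open` are hypothetical names, not part of the JDK.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

class MaybeGzip {

    // Hypothetical helper: wraps `raw` in a GZIPInputStream only when it
    // starts with the gzip magic bytes 0x1f 0x8b, otherwise reads it as-is.
    static DataInputStream open(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);              // remember the start so the peek can be undone
        int first = in.read();
        int second = in.read();
        in.reset();              // rewind; the header bytes are read again below
        boolean gzipped = first == 0x1f && second == 0x8b;
        return new DataInputStream(gzipped ? new GZIPInputStream(in) : in);
    }

    public static void main(String[] args) throws IOException {
        // Plain (non-gzip) input is passed through unchanged.
        byte[] plain = {10, 20, 30};
        try (DataInputStream reader = open(new ByteArrayInputStream(plain))) {
            System.out.println("First byte: " + reader.readByte());
        }
    }
}
```

With a file, the call would look like `reader = MaybeGzip.open(new FileInputStream(baseFile));`, after which `reader.readByte()` can be used directly without allocating an array first.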
kapodes