8

How can I check whether a file is gzip-compressed in Java? I tried reading the first 2 bytes and comparing them with the gzip magic number, but for large files I get an OutOfMemoryError.

Does anyone know another way to do this?

This is the code I am using:

def isGzipCompressionFile(File file)
{
   return ((file.bytes[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (file.bytes[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8)))
}
Nathan
Snehal Kulkarni

5 Answers

8

Use this class that I found on Google:

package example;
 
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.util.zip.GZIPInputStream;
 
public class GZipUtil {
 
 /**
  * Checks if an input stream is gzipped.
  * 
  * @param in
  * @return
  */
 public static boolean isGZipped(InputStream in) {
  if (!in.markSupported()) {
   in = new BufferedInputStream(in);
  }
  in.mark(2);
  int magic = 0;
  try {
   magic = in.read() & 0xff | ((in.read() << 8) & 0xff00);
   in.reset();
  } catch (IOException e) {
   e.printStackTrace(System.err);
   return false;
  }
  return magic == GZIPInputStream.GZIP_MAGIC;
 }
 
 /**
  * Checks if a file is gzipped.
  * 
  * @param f
  * @return
  */
 public static boolean isGZipped(File f) {
  int magic = 0;
  // try-with-resources closes the file even if a read fails
  try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
   magic = raf.read() & 0xff | ((raf.read() << 8) & 0xff00);
  } catch (IOException e) {
   e.printStackTrace(System.err);
  }
  return magic == GZIPInputStream.GZIP_MAGIC;
 }
 
 public static void main(String[] args) throws IOException {
  File gzf = new File("/tmp/1.gz");
 
  // Check if a file is gzipped.
  System.out.println(isGZipped(gzf));
 
  // Check if an input stream is gzipped; close the stream afterwards.
  try (InputStream in = new FileInputStream(gzf)) {
   System.out.println(isGZipped(in));
  }
 }
}
Luke Rixson
  • 1
    If second `read()` throws `IOException`, you'd have consumed a byte. If `in` parameter doesn't support `mark`, wrapping with `BufferedInputStream` won't make a difference: You'd still consume 2 bytes from the stream given to you. In both cases, caller is unaware that `isGZipped` stole bytes. Sorry, I have to down-vote. – Andreas Jan 03 '17 at 23:35
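One possible way to address the concern raised in the comment above is to have the caller wrap the stream in a `PushbackInputStream`, so that any bytes read while checking the header can be handed back. This is only a minimal sketch: the class name `GzipSniffer` and the method `isGzipped` are hypothetical and not part of the original answer, and it assumes the caller keeps reading from the same wrapped stream afterwards.

```java
import java.io.IOException;
import java.io.PushbackInputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not from the original answer.
class GzipSniffer {

    /**
     * Peeks at the first two bytes and pushes them back, so the caller
     * can keep reading from the same stream from the start.
     * Assumption: the stream was created with a pushback buffer of at least 2 bytes,
     * e.g. new PushbackInputStream(source, 2).
     */
    static boolean isGzipped(PushbackInputStream in) throws IOException {
        byte[] header = new byte[2];
        int n = 0;
        try {
            // read() may return fewer bytes than requested, so loop until 2 bytes or EOF
            while (n < 2) {
                int r = in.read(header, n, 2 - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
        } finally {
            // Hand back whatever was consumed, even if a read failed part-way
            if (n > 0) {
                in.unread(header, 0, n);
            }
        }
        if (n < 2) {
            return false; // too short to carry a gzip header
        }
        // The magic number is stored low byte first
        int magic = (header[0] & 0xff) | ((header[1] & 0xff) << 8);
        return magic == GZIPInputStream.GZIP_MAGIC;
    }
}
```

A caller would create the wrapper once, for example `PushbackInputStream in = new PushbackInputStream(source, 2);`, call `GzipSniffer.isGzipped(in)`, and then pass `in` (not `source`) to whatever reads the data next.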
6

Try Files.probeContentType(Path) [JDK 7]

Path source = Paths.get("D:/myfiles/a.zip");
System.out.println(Files.probeContentType(source));

Output:

application/x-zip-compressed
Bacteria
  • 1
    It checks only extension. If we change extension, the output also changes. So it's safe to check magic number to determine file type for sure. – mano_ksp Aug 17 '18 at 15:28
4

Use a GZIPInputStream: http://docs.oracle.com/javase/7/docs/api/java/util/zip/GZIPInputStream.html. Its constructor throws a ZipException if the data is in another format, so you can catch that exception in your code to detect a file that is not gzip.

gouessej
Peter Paul Kiefer
  • 3
    Using exceptions to control program flow is generally frowned upon last I checked. – Matt Lachman May 17 '17 at 17:49
  • 4
    @MattLachman Try telling that to python developers, see what kind of response you get. https://stackoverflow.com/questions/1265665/how-can-i-check-if-a-string-represents-an-int-without-using-try-except – jbowman May 22 '19 at 18:15
1

You should read only 2 bytes from the file if that's all you're checking. It sounds like you're pulling the entire file into memory (`file.bytes` loads the whole file into a byte array before you index into it).

https://docs.oracle.com/javase/tutorial/essential/io/datastreams.html
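As a rough illustration of reading just the two header bytes, here is a minimal sketch. The class name `GzipHeaderCheck` and the method `isGzipFile` are made up for this example; it only looks at the magic number and does not validate the rest of the file.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration only.
class GzipHeaderCheck {

    /** Returns true if the file starts with the two-byte gzip magic number. */
    static boolean isGzipFile(File file) throws IOException {
        // Only two bytes are read, so memory use does not depend on file size
        try (InputStream in = new FileInputStream(file)) {
            int b0 = in.read();
            int b1 = in.read();
            if (b0 < 0 || b1 < 0) {
                return false; // file is shorter than 2 bytes
            }
            // The magic number is stored low byte first
            return ((b1 << 8) | b0) == GZIPInputStream.GZIP_MAGIC;
        }
    }
}
```

Because the stream is opened in a try-with-resources block, the file handle is released whether or not the check succeeds.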

Nicholas Hirras
1

This is what I am using:

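private static void decompressGzipFile(String gzipFilePath, String newFilePath) {
        // try-with-resources closes both streams, even if the GZIPInputStream constructor throws
        try (FileInputStream fis = new FileInputStream(gzipFilePath);
             GZIPInputStream gis = new GZIPInputStream(fis)) {
            // If the GZIPInputStream constructor did not throw, the file has a valid GZIP header
            // Your logic

        } catch (IOException e) {
            // Not in GZIP format (ZipException), or the file could not be read
        }

    }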
Ajay Sainy