
Is it possible to validate a tar file in Java so that I can be sure other tar utilities (e.g. the UNIX tar command) can extract it? I would like to do this in code rather than executing tar -tf from my application and reporting based on the return code.

I have been using Apache Commons Compress to read through a tar's entries (not extract them). I have manually modified some of the tar files in a text editor to produce "corrupt" archives. Commons Compress doesn't always fail to read the entries, whereas the UNIX tar command and 7zip have failed consistently no matter what I modify. Obviously something more needs to be done to actually ensure the tar file is not corrupt, but what is it?

Note: I haven't actually tried extracting the tar via Commons Compress as a means of validation. I don't think this is really an option; the tar files I have to work with are large (3 to 4 gigs) and I would rather not see this process take any longer than it already does.

Adam McCormick

1 Answer


Since you don't want to execute the tar command, you could use the org.apache.tools.tar.TarInputStream implementation of Apache ANT.

Personally, I have not tried it and I'm not an expert in the TAR file format, but I would try iterating over the TAR file using the getNextEntry method. If you get through all the entries without an error or exception, you have (somewhat) proven that the TAR file is "valid".
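
For what it's worth, here is roughly what such a pass can check at the header level, sketched with only the JDK. The class name TarHeaderCheck and its methods are made up for illustration and are not part of Ant or any other library. It assumes the classic 512-byte header layout (size field at offset 124, checksum field at offset 148) and does not handle GNU base-256 sizes or sparse files. It only detects damage in headers or a truncated archive; tar stores no checksum for file contents, so corrupted data bytes go unnoticed.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: walks the tar headers and verifies each header checksum. */
final class TarHeaderCheck {
    static final int BLOCK = 512;

    /** Unsigned sum of all header bytes, with the 8-byte checksum field counted as spaces. */
    static long computeChecksum(byte[] header) {
        long sum = 0;
        for (int i = 0; i < BLOCK; i++) {
            sum += (i >= 148 && i < 156) ? ' ' : (header[i] & 0xFF);
        }
        return sum;
    }

    /** Parses a NUL- or space-terminated octal field; throws on non-octal content. */
    static long parseOctal(byte[] buf, int off, int len) throws IOException {
        String s = new String(buf, off, len, StandardCharsets.US_ASCII);
        int nul = s.indexOf('\0');
        if (nul >= 0) s = s.substring(0, nul);
        s = s.trim();
        if (s.isEmpty()) return 0;
        try {
            return Long.parseLong(s, 8);
        } catch (NumberFormatException e) {
            throw new IOException("Bad octal field at header offset " + off, e);
        }
    }

    /** Skips exactly n bytes or throws EOFException if the stream ends first. */
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() < 0) throw new EOFException("Truncated entry data");
                skipped = 1;
            }
            n -= skipped;
        }
    }

    /** Returns the number of header blocks seen; throws IOException on the first problem. */
    static int countHeaders(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] header = new byte[BLOCK];
        int headers = 0;
        while (true) {
            try {
                data.readFully(header);
            } catch (EOFException e) {
                throw new IOException("Truncated archive: no end-of-archive block", e);
            }
            boolean allZero = true;
            for (byte b : header) {
                if (b != 0) { allZero = false; break; }
            }
            // A zero block marks the end (strictly, tar writes two of them).
            if (allZero) return headers;

            if (parseOctal(header, 148, 8) != computeChecksum(header)) {
                throw new IOException("Header checksum mismatch in header " + (headers + 1));
            }
            long size = parseOctal(header, 124, 12);
            // Entry data is padded to a multiple of 512 bytes; skip it without reading.
            skipFully(data, (size + BLOCK - 1) / BLOCK * BLOCK);
            headers++;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java TarHeaderCheck <file.tar>");
            return;
        }
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            System.out.println("Headers OK: " + countHeaders(in));
        }
    }
}
```

Because the data blocks are skipped rather than read, a pass like this mostly costs seeks, which matters for multi-gigabyte archives. Note that it counts header blocks, so extension headers (long names, PAX) are included in the total.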

UPDATE: Actually the above Wikipedia page lists those Java implementations:

JTar includes an example of untarring a TAR file which, I guess, could be modified to just "test" the archive.

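   // Requires the JTar library on the classpath.
   import java.io.BufferedInputStream;
   import java.io.FileInputStream;
   import org.kamranzafar.jtar.TarEntry;
   import org.kamranzafar.jtar.TarInputStream;

   String tarFile = "c:/test/test.tar";

   // Create a TarInputStream; try-with-resources closes it even if reading fails
   try (TarInputStream tis = new TarInputStream(new BufferedInputStream(new FileInputStream(tarFile)))) {
      TarEntry entry;
      while ((entry = tis.getNextEntry()) != null) {

         System.out.println(entry.getName());

         // The following might not be required, but arguably does
         // additionally check the content of the archive.
         byte[] data = new byte[2048];

         while (tis.read(data) != -1) {
            // Discard the bytes; we only care that reading succeeds
         }
      }
   }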
Christian.K