I'm using Apache Commons Compress to unpack .tgz files, and the library throws an error. I have tried Compress versions 1.9 and 1.8.1 and get the same error with both.

This happens only with certain files, but the kicker is that when I download a failing file manually and validate it, there are no issues.

$ gunzip -c foo.tgz | tar t > /dev/null
$ 

However, I am seeing this stack trace from the Commons library:

Severe:   java.io.IOException: Error detected parsing the header
at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:257)    
...
Caused by: java.lang.IllegalArgumentException: Invalid byte 0 at offset 5 in '05412{NUL}11' len=8
at org.apache.commons.compress.archivers.tar.TarUtils.parseOctal(TarUtils.java:138)
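
To see which header field the parser is objecting to, the raw bytes can be dumped directly. The following is a minimal sketch using only the JDK, not Commons Compress: the class name `TarHeaderDump` is hypothetical, the field offsets are taken from the POSIX ustar header layout, and `firstSuspectOffset` is only an approximation of a strict octal check, not the library's actual `parseOctal` logic. It inspects only the first 512-byte header, so it will not help if the bad entry is further into the archive.

```java
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical diagnostic helper: prints the 8-byte numeric fields of the
// first tar header in a .tgz so embedded NUL bytes become visible.
final class TarHeaderDump {
    private static final int BLOCK_SIZE = 512;
    private static final int FIELD_LEN = 8;
    // 8-byte numeric fields of a ustar header and their byte offsets.
    private static final String[] NAMES = {"mode", "uid", "gid", "chksum", "devmajor", "devminor"};
    private static final int[] OFFSETS = {100, 108, 116, 148, 329, 337};

    // Decompresses the stream and returns the first 512-byte tar block.
    static byte[] readFirstHeader(InputStream gzipped) throws IOException {
        try (InputStream in = new GZIPInputStream(gzipped)) {
            byte[] block = new byte[BLOCK_SIZE];
            int filled = 0;
            while (filled < BLOCK_SIZE) {
                int n = in.read(block, filled, BLOCK_SIZE - filled);
                if (n < 0) {
                    throw new EOFException("archive is shorter than one tar block");
                }
                filled += n;
            }
            return block;
        }
    }

    // Renders a field with NULs shown as {NUL} and other non-printable bytes as \xNN.
    static String render(byte[] header, int offset, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = offset; i < offset + len; i++) {
            int b = header[i] & 0xff;
            if (b == 0) {
                sb.append("{NUL}");
            } else if (b >= 0x20 && b < 0x7f) {
                sb.append((char) b);
            } else {
                sb.append(String.format("\\x%02x", b));
            }
        }
        return sb.toString();
    }

    // Approximate strict check: after skipping leading spaces and trailing
    // NULs/spaces, returns the field-relative index of the first byte that is
    // not an octal digit, or -1 if the field looks clean.
    static int firstSuspectOffset(byte[] header, int offset, int len) {
        int start = offset;
        int end = offset + len;
        while (start < end && header[start] == ' ') {
            start++;
        }
        while (end > start && (header[end - 1] == 0 || header[end - 1] == ' ')) {
            end--;
        }
        for (int i = start; i < end; i++) {
            if (header[i] < '0' || header[i] > '7') {
                return i - offset;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TarHeaderDump.java <file.tgz>");
            return;
        }
        byte[] header;
        try (InputStream file = new FileInputStream(args[0])) {
            header = readFirstHeader(file);
        }
        for (int i = 0; i < NAMES.length; i++) {
            System.out.printf("%-8s '%s' suspect=%d%n", NAMES[i],
                    render(header, OFFSETS[i], FIELD_LEN),
                    firstSuspectOffset(header, OFFSETS[i], FIELD_LEN));
        }
    }
}
```

A field that prints with `{NUL}` in the middle and a non-negative `suspect` value is the kind of input the exception message above describes, and is worth including in a bug report.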

The issue is coming from the line

entry = tarIn.getNextTarEntry();

Here's the code:

try {
    TarArchiveInputStream tarIn = new TarArchiveInputStream(
            new GZIPInputStream(
                    new BufferedInputStream(
                            new FileInputStream(
                                    tempDirPath + fileName))));

    TarArchiveEntry entry = tarIn.getNextTarEntry();

    while (entry != null) {
        File path = new File(tempDirPath, entry.getName());
        if (entry.isDirectory()) {
            path.mkdirs();
        } else {
            path.createNewFile();
            byte[] read = new byte[1024];
            BufferedOutputStream bout = new BufferedOutputStream(new FileOutputStream(path));
            int len;
            while ((len = tarIn.read(read)) != -1) {
                bout.write(read, 0, len);
            }
            bout.close();
            read = null;
        }
        entry = tarIn.getNextTarEntry();
    }
    tarIn.close();
} catch (Throwable t) {
    t.printStackTrace();
}
Piyush Sagar
user2183985
  • Looks as if the tar archive contains a variant Commons Compress doesn't recognize. Could you open a bug in https://issues.apache.org/jira/browse/COMPRESS with the full stack trace and ideally an archive that exposes the problem? Do you know which tar implementation has been used to create the archive? – Stefan Bodewig Feb 13 '15 at 05:07
  • Thanks for your help. I created COMPRESS-301. Also, the tar file was created with the Python tarfile library. – user2183985 Feb 13 '15 at 17:54

1 Answer

Try wrapping your input stream in Commons Compress's GzipCompressorInputStream instead of java.util.zip.GZIPInputStream.

TarArchiveInputStream tarIn = new TarArchiveInputStream(
        new GzipCompressorInputStream(new FileInputStream(fileName)));

Piyush Sagar