
OK, so I want to read the contents of a tar.gz file (or another compressed tar format, which works the same way). What I am doing is more or less this:

TarArchiveInputStream tarInput = new TarArchiveInputStream(new GzipCompressorInputStream(new FileInputStream("c://temp//test.tar.gz")));
TarArchiveEntry currentEntry = tarInput.getNextTarEntry();
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
while (currentEntry != null) {
    File f = currentEntry.getFile();
    br = new BufferedReader(new FileReader(f));
    System.out.println("For File = " + currentEntry.getName());
    String line;
    while ((line = br.readLine()) != null) {
        System.out.println("line="+line);
    }
}
if (br!=null) {
    br.close();
}

But I get null when I call the getFile method of TarArchiveEntry.
I am using Apache Commons Compress 1.8.1.

Belphegor
zpontikas

1 Answer


You can't use getFile on a TarArchiveEntry that was read from an archive. That getter only returns something for the opposite operation, when you create the entry from a file on disk in order to write it into a tar archive.

Instead, read directly from the TarArchiveInputStream. It returns the content of the current entry, decompressing it on the fly, and reports end-of-stream when that entry ends.

For example (untested code, YMMV):

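import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

// try-with-resources closes tarInput and the gzip and file streams it wraps
try (TarArchiveInputStream tarInput = new TarArchiveInputStream(
        new GzipCompressorInputStream(new FileInputStream("c://temp//test.tar.gz")))) {
    TarArchiveEntry currentEntry = tarInput.getNextTarEntry();
    while (currentEntry != null) {
        // Read directly from tarInput; it reports end-of-stream at the end of the current entry.
        // Do not close br here, because that would close tarInput too.
        BufferedReader br = new BufferedReader(new InputStreamReader(tarInput));
        System.out.println("For File = " + currentEntry.getName());
        String line;
        while ((line = br.readLine()) != null) {
            System.out.println("line=" + line);
        }
        currentEntry = tarInput.getNextTarEntry(); // You forgot to iterate to the next file
    }
}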
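
To see why there is no File behind an entry, it may help to look at the tar layout itself. The following is only a minimal sketch using the JDK alone, not Commons Compress: it assumes plain entries whose names fit in the 100-byte name field and whose sizes are stored as octal text, and it ignores checksums, long-name extensions and every other header field. The names MiniTarReader, readEntries, sampleArchive, writeEntry and field are hypothetical, made up for this illustration. The sample archive is built in memory and fills in only the name and size fields, so it is not meant to be opened with real tar tools.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class MiniTarReader {
    static final int BLOCK = 512; // tar data is organised in 512-byte blocks

    /** Reads every entry of a gzip-compressed tar stream into a name -> text map. */
    public static Map<String, String> readEntries(InputStream gzippedTar) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        DataInputStream in = new DataInputStream(new GZIPInputStream(gzippedTar));
        byte[] header = new byte[BLOCK];
        while (true) {
            try {
                in.readFully(header); // each entry starts with one header block
            } catch (EOFException e) {
                break; // stream ended without an end-of-archive marker
            }
            if (header[0] == 0) {
                break; // a block starting with NUL is treated as end of archive
            }
            String name = field(header, 0, 100);
            int size = Integer.parseInt(field(header, 124, 12).trim(), 8); // octal text
            byte[] content = new byte[size];
            in.readFully(content); // the entry's bytes follow the header directly
            in.readFully(new byte[(BLOCK - size % BLOCK) % BLOCK]); // skip block padding
            result.put(name, new String(content, StandardCharsets.UTF_8));
        }
        return result;
    }

    /** Decodes a NUL-terminated ASCII header field. */
    static String field(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) {
            end++;
        }
        return new String(block, offset, end - offset, StandardCharsets.US_ASCII);
    }

    /** Builds a tiny two-entry tar.gz in memory (name and size fields only). */
    public static byte[] sampleArchive() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            writeEntry(gz, "a.txt", "hello\n");
            writeEntry(gz, "b.txt", "line1\nline2\n");
            gz.write(new byte[2 * BLOCK]); // two zero blocks mark the end of the archive
        }
        return bytes.toByteArray();
    }

    static void writeEntry(OutputStream out, String name, String text) throws IOException {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        byte[] header = new byte[BLOCK];
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(nameBytes, 0, header, 0, nameBytes.length);
        byte[] size = String.format("%011o", data.length).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(size, 0, header, 124, size.length);
        out.write(header);
        out.write(data);
        out.write(new byte[(BLOCK - data.length % BLOCK) % BLOCK]); // pad to a full block
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> entries = readEntries(new ByteArrayInputStream(sampleArchive()));
        for (Map.Entry<String, String> e : entries.entrySet()) {
            System.out.println("For File = " + e.getKey());
            for (String line : e.getValue().split("\n")) {
                System.out.println("line=" + line);
            }
        }
    }
}
```

Each entry is just a header block followed by its bytes inside one continuous stream, so nothing exists on disk for a File to point at. For real archives, TarArchiveInputStream remains the right tool; this sketch only shows the idea behind it.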
Simone Gianni
This is so counter-intuitive... Even the javadoc states "Get this entry's file." – kraxor Oct 07 '16 at 16:09
@kraxor This happens because a `File` object can only refer to files that exist on disk, so it cannot represent an entry stored inside a compressed archive. – Ferrybig Aug 20 '17 at 20:02