15

I have retrieved a zip entry from a zip file like so.

InputStream input = params[0];
ZipInputStream zis = new ZipInputStream(input);

ZipEntry entry;
try {
    while ((entry = zis.getNextEntry())!= null) {

    }
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

This works fine and retrieves each ZipEntry without a problem.

My Question

How can I read the contents of these ZipEntries into a String? They are XML and CSV files.

Nick is tired
StuStirling

5 Answers

17

You have to read from the ZipInputStream itself; after getNextEntry() it delivers the bytes of the current entry:

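StringBuilder s = new StringBuilder();
byte[] buffer = new byte[1024];
int read;
ZipEntry entry;
while ((entry = zis.getNextEntry()) != null) {
    // read() returns -1 once the current entry is exhausted
    while ((read = zis.read(buffer, 0, buffer.length)) >= 0) {
        // each chunk is decoded with the platform default charset
        s.append(new String(buffer, 0, read));
    }
}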

When the inner while loop exits, save the StringBuilder content and reset it before reading the next entry.
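
The "save and reset" step could look roughly like the sketch below. It is only an illustration: the class name ZipEntryReader and the method readEntries are hypothetical, every entry is assumed to be UTF-8 text, and it swaps the StringBuilder for a per-entry ByteArrayOutputStream so that each entry is decoded once, after all of its bytes have been read.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of the original answer.
final class ZipEntryReader {

    // Returns entry name -> entry text, in archive order.
    // Assumption: every entry holds UTF-8 encoded text.
    static Map<String, String> readEntries(InputStream input) throws IOException {
        Map<String, String> contents = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(input)) {
            byte[] buffer = new byte[1024];
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directory entries carry no data
                }
                // A fresh buffer per entry plays the role of "save and reset"
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                int read;
                while ((read = zis.read(buffer, 0, buffer.length)) != -1) {
                    bytes.write(buffer, 0, read);
                }
                contents.put(entry.getName(),
                        new String(bytes.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return contents;
    }
}
```

Calling ZipEntryReader.readEntries(params[0]) would then give one String per entry, keyed by the entry name.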

Blackbelt
5

With a defined encoding (e.g. UTF-8) and without creating intermediate Strings:

import java.util.zip.ZipInputStream;
import java.util.zip.ZipEntry;
import java.io.ByteArrayOutputStream;
import static java.nio.charset.StandardCharsets.UTF_8;

try (
  // The charset given here decodes entry names and comments, not entry contents
  ZipInputStream zis = new ZipInputStream(input, UTF_8);
  ByteArrayOutputStream baos = new ByteArrayOutputStream()
) {
  byte[] buffer = new byte[1024];
  int read;
  ZipEntry entry;
  while ((entry = zis.getNextEntry()) != null) {
    baos.reset(); // discard the bytes of the previous entry
    while ((read = zis.read(buffer, 0, buffer.length)) > 0) {
      baos.write(buffer, 0, read);
    }
    // Decode once, after all bytes of this entry have been collected
    String content = baos.toString(UTF_8.name());
  }
}
multitask landscape
4

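Here is an approach that does not break Unicode characters. It reads through an InputStreamReader into a char[] (not a byte[]), so the reader handles multi-byte sequences that span two reads: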

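final ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(content));
zis.getNextEntry(); // position the stream on the first entry; reads return -1 before this call
// UTF-8 is assumed here; without a charset argument the platform default is used
final InputStreamReader isr = new InputStreamReader(zis, StandardCharsets.UTF_8);
final StringBuilder sb = new StringBuilder();
final char[] buffer = new char[1024];

int read;
while ((read = isr.read(buffer, 0, buffer.length)) != -1) {
    sb.append(buffer, 0, read); // append only the chars actually read
}

System.out.println(sb.toString());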
Igor Bljahhin
  • Could you please describe in which way the version of @Blackbelt will break Unicode? His version reads until there is nothing left (i.e. all parts of Unicode I'd expect to be there), while your version always adds the whole buffer, creating a final String which is much longer than entry.getSize() – Zefiro May 08 '17 at 19:16
  • Ok, I now see that you use char[] instead of byte[]. I'd suggest making this more visible in the answer, and also only adding as much of the buffer as you've read. Currently your resulting string gets rounded up to the buffer size. – Zefiro May 08 '17 at 19:24
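
To make the difference discussed in these comments concrete, here is a small sketch. The class ChunkDecodingDemo and both method names are hypothetical, and the FilterInputStream wrapper exists only to force short reads so that a multi-byte character really does straddle two reads. It compares decoding each byte chunk on its own with decoding through a Reader.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from any of the answers.
final class ChunkDecodingDemo {

    // Decodes every byte chunk independently, the way a byte[] loop with
    // new String(buffer, 0, read) does. A character split across two chunks
    // is decoded as two malformed fragments.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            sb.append(new String(data, off, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Decodes through a Reader. The underlying stream hands out at most
    // chunkSize bytes per read, yet the reader carries incomplete
    // sequences over to the next read.
    static String decodeWithReader(byte[] data, final int chunkSize) throws IOException {
        InputStream shortReads = new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, chunkSize));
            }
        };
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[1024];
        try (Reader reader = new InputStreamReader(shortReads, StandardCharsets.UTF_8)) {
            int read;
            while ((read = reader.read(buffer, 0, buffer.length)) != -1) {
                sb.append(buffer, 0, read); // append only the chars actually read
            }
        }
        return sb.toString();
    }
}
```

For the text "a\u00e9" (three bytes in UTF-8) and a chunk size of 2, decodePerChunk produces replacement characters in place of the accented letter, while decodeWithReader returns the original text.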
1

I would use IOUtils from Apache Commons IO:

ZipEntry entry;
InputStream input = params[0];
ZipInputStream zis = new ZipInputStream(input);

try {
  while ((entry = zis.getNextEntry())!= null) {
    String entryAsString = IOUtils.toString(zis, StandardCharsets.UTF_8);
  }
} catch (IOException e) {
  // TODO Auto-generated catch block
  e.printStackTrace();
}
IOUtils.closeQuietly(zis);
Kreender
0

A Kotlin version; the same calls work in Java as well (readAllBytes() requires Java 9 or later):

val zipInputStream = ZipInputStream(inStream)
var zipEntry = zipInputStream.nextEntry
while(zipEntry != null) {
    println("Name of file : " + zipEntry.name)
    val fileContent = String(zipInputStream.readAllBytes(), StandardCharsets.UTF_8)
    println("File content : $fileContent")
    zipEntry = zipInputStream.nextEntry
}
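
For reference, a Java rendering of the same idea might look like the sketch below. The class name ZipReadAllBytes and the method readAll are hypothetical, the entries are assumed to be UTF-8 text, and InputStream.readAllBytes() needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper mirroring the Kotlin snippet above.
final class ZipReadAllBytes {

    // Returns the text of every non-directory entry, in archive order.
    // Assumption: entries are UTF-8 encoded.
    static List<String> readAll(InputStream inStream) throws IOException {
        List<String> contents = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(inStream)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // readAllBytes() reads until the end of the current entry
                contents.add(new String(zis.readAllBytes(), StandardCharsets.UTF_8));
            }
        }
        return contents;
    }
}
```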
Manish