I'd like to break up a single zip file that holds multiple files and create a separate zip file for each of its entries. That is: MainZipFile.zip ---> Zip1.zip, Zip2.zip, Zip3.zip, where the zip files on the right-hand side (RHS) correspond to the ZipEntry objects inside MainZipFile.zip.

I'm using Java, and I'd like to do it without decompressing and re-compressing, which is the obvious solution. I'm not sure whether this is even practically possible.

I was thinking of creating empty RHS zip files, then using ZipInputStream to read the contents of a ZipEntry in MainZipFile.zip and deposit the contents into the corresponding RHS zip file, but I can't figure it out.

Edit 1:

final int BUFFER_SIZE_IN_BYTES = 2 * 1024; // 2 KB
void tryReading() throws IOException {
  String filepath = getTempFilepath();
  ZipFile mainZipFile = new ZipFile(filepath);
  InputStream inForEntry = getInputStreamForEntryWithName(checkpoint.getFilePath(), mainZipFile);
  ZipInputStream zin = new ZipInputStream(inForEntry);
  byte[] data = new byte[BUFFER_SIZE_IN_BYTES];
  int countOfBytesRead = zin.read(data); // returns -1
  zin.getNextEntry();
  countOfBytesRead = zin.read(data); // returns -1
}

private InputStream getInputStreamForEntryWithName(String name, ZipFile mainZipFile) throws IOException {
    ZipEntry entry = mainZipFile.getEntry(name);
    return mainZipFile.getInputStream(entry);
}
joshuar
  • Using ZipInputStream seems like the right approach. Which part of it is giving you trouble? – VGR Dec 09 '15 at 22:02
  • @VGR , when I try to read from the ZipInputStream `zin`, I get -1 immediately. When I read from the InputStream `inForEntry` that I used it works, but this stream is uncompressed. – joshuar Dec 09 '15 at 22:23
  • This seems to be a good example: http://stackoverflow.com/questions/243992/how-to-split-a-huge-zip-file-into-multiple-volumes – vk239 Dec 09 '15 at 22:57

3 Answers

From the Javadoc for ZipInputStream.read(byte[], int, int):

Reads from the current ZIP entry into an array of bytes. If len is not zero, the method blocks until some input is available; otherwise, no bytes are read and 0 is returned.

Returns: the actual number of bytes read, or -1 if the end of the entry is reached

I believe you're reading through the whole file and reaching the end. Wrap your read in a loop and read through a byte at a time and see if you can get it to output anything.

Zak
  • Even if I put it in a loop, I'd still have to read the first 2KB of the file. I'm not sure why it's returning -1. The zip file is ~44MB. I read the first 2KB just to make sure that it's reading anything at all, and it doesn't seem to be. – joshuar Dec 09 '15 at 23:01

The InputStream you get from the ZipEntry gives you the result of decompressing the zip entry. Wrapping a ZipInputStream around it doesn't make any sense. The input the InputStream provides to the ZipInputStream isn't zipped data, it is plaintext. It can't possibly work.

In any case the data provided by ZipInputStream is more plaintext, not zipped data, so it can't possibly meet your objective.
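To illustrate the point, here is a minimal sketch of what happens when a ZipInputStream is wrapped around bytes that are already decompressed. The class and method names (PlaintextWrapDemo, zipOf, firstRead) are made up for this example, and the data is built in memory rather than read from the asker's file. With plaintext input, getNextEntry() should find no entry and read() should return -1, which matches what the question reports.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of the question's code.
final class PlaintextWrapDemo {

    /** Builds a one-entry zip archive in memory to serve as real zip data. */
    static byte[] zipOf(String entryName, String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry(entryName));
            zout.write(text.getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Wraps the given bytes in a ZipInputStream and returns the result of the first read. */
    static int firstRead(byte[] data) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(data))) {
            zin.getNextEntry();               // null when the bytes are not zip data
            return zin.read(new byte[2048]);  // -1 when there is no current entry
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] plaintext = "already decompressed entry contents".getBytes(StandardCharsets.UTF_8);
        System.out.println("plaintext input: " + firstRead(plaintext));
        System.out.println("real zip input:  " + firstRead(zipOf("a.txt", "hello")));
    }
}
```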

You will have to read the InputStream and create a new zip file from it in the normal way you create zip files.
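Reading each entry's InputStream and writing it into a new zip file could be sketched roughly like this. It is only an illustration: the class name ZipSplitter, the split method, and the outputDir parameter are assumptions, and the Zip1.zip, Zip2.zip naming simply follows the question. Note that it decompresses each entry and recompresses it, which is exactly what the question hoped to avoid.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the question's code.
final class ZipSplitter {

    private static final int BUFFER_SIZE_IN_BYTES = 2 * 1024; // 2 KB, as in the question

    /** Writes every file entry of mainZip into its own zip file under outputDir. */
    static List<Path> split(Path mainZip, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> created = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(mainZip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // directory entries carry no data
                }
                Path target = outputDir.resolve("Zip" + (created.size() + 1) + ".zip");
                try (InputStream in = zipFile.getInputStream(entry); // decompressed bytes
                     ZipOutputStream zout = new ZipOutputStream(Files.newOutputStream(target))) {
                    zout.putNextEntry(new ZipEntry(entry.getName()));
                    byte[] buffer = new byte[BUFFER_SIZE_IN_BYTES];
                    int count;
                    while ((count = in.read(buffer)) != -1) {
                        zout.write(buffer, 0, count); // recompressed on the way out
                    }
                    zout.closeEntry();
                }
                created.add(target);
            }
        }
        return created;
    }
}
```

A call such as ZipSplitter.split(Paths.get("MainZipFile.zip"), Paths.get("out")) would then produce one zip per entry in the out directory.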

user207421
  • I don't want to decompress it though. I want to take out the ZipEntry uncompressed and then store it as a zip file. If I simply write the InputStream from ZipEntry, I'd be extracting (decompressing) the ZipEntry. I'm hoping I can take it out without decompressing, but I don't know if that'll be possible. – joshuar Dec 09 '15 at 23:28
  • You can't do that at all, let alone the way you're trying to do it, which doesn't make sense. You're wrapping a `ZipInputStream` around a stream that delivers plaintext, and even if that works, which it doesn't, the `ZipInputStream` would deliver you plaintext, not zipped data. You have to read the `InputStream` and create a new zip file from it in the normal way you create zip files. – user207421 Dec 09 '15 at 23:30
  • I'll accept the answer that it can't be done easily. And I'm seeing now that my way won't work, but how do you know that it can't be done at all? I'm asking because I don't see why it would not be possible. I don't know a lot about zip file structures. – joshuar Dec 09 '15 at 23:36
  • It can't be done at all with Java API. Maybe you could write something from scratch that looks inside the ZIP file, maybe not. – user207421 Dec 09 '15 at 23:39

A better alternative seems to be using a zip file system to "file copy" a selection of files from one zip to another.

This zip file system is already provided in Java SE through the jar: protocol. So if you have a URI file:/xxx/yyy.zip, turn it into the URI jar:file:/xxx/yyy.zip.

URI sourceZipURI = URI.create("jar:" + sourceZip); // "jar:file:/.../... .zip"
URI targetZipURI = URI.create("jar:" + targetZip); // 

Map<String, Object> senv = new HashMap<>(); 
FileSystem sourceZipFS = FileSystems.newFileSystem(sourceZipURI, senv, null);

Map<String, Object> tenv = new HashMap<>(); 
tenv.put("create", "true");
FileSystem targetZipFS = FileSystems.newFileSystem(targetZipURI, tenv, null);

// Example of a file copy:
Path sourcePath = sourceZipFS.getPath("/file1.txt");
Path targetPath = targetZipFS.getPath("/file1.txt"); // Or "/"
Files.copy(sourcePath, targetPath);

// Example of a zip traversal:
Files.walkFileTree(sourceZipFS.getPath("/"),
    new SimpleFileVisitor<Path>() {

        @Override
        public FileVisitResult visitFile(Path file,
                    BasicFileAttributes attrs)
                      throws IOException {
             ...
             return FileVisitResult.CONTINUE;
        }
    });

sourceZipFS.close();
targetZipFS.close();
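Putting the copy and the traversal together, splitting a zip into one zip per entry might look something like the sketch below. The names ZipFsSplitter, split and outputDir are invented for this example, the Zip1.zip style naming follows the question, and it assumes the paths contain no characters that need special URI handling. The order in which entries are visited is not guaranteed to match the archive order, and copying between two separate zip file systems most likely still decompresses and recompresses the data.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class built on the zip file system shown above.
final class ZipFsSplitter {

    /** Copies every file inside mainZip into its own newly created zip under outputDir. */
    static List<Path> split(Path mainZip, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> created = new ArrayList<>();
        URI sourceUri = URI.create("jar:" + mainZip.toUri());
        try (FileSystem sourceFs = FileSystems.newFileSystem(sourceUri, new HashMap<String, Object>())) {
            Files.walkFileTree(sourceFs.getPath("/"), new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Path targetZip = outputDir.resolve("Zip" + (created.size() + 1) + ".zip");
                    Map<String, Object> env = new HashMap<>();
                    env.put("create", "true"); // create the target zip if it does not exist
                    URI targetUri = URI.create("jar:" + targetZip.toUri());
                    try (FileSystem targetFs = FileSystems.newFileSystem(targetUri, env)) {
                        // Keep the same path inside the new zip as in the source zip.
                        Path target = targetFs.getPath(file.toString());
                        if (target.getParent() != null) {
                            Files.createDirectories(target.getParent());
                        }
                        Files.copy(file, target);
                    }
                    created.add(targetZip);
                    return FileVisitResult.CONTINUE;
                }
            });
        }
        return created;
    }
}
```

Because each target file system is opened and closed per entry, a stop/resume scheme could work file by file, for instance by skipping entries whose target zip already exists.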

Using ZipInputStream/ZipOutputStream is possible too. Search for example code for that approach, as correct code would take some changes and additions.

Joop Eggen
  • Oh, this looks promising! Do you know if I can buffer the copy process? I'd like to be able to stop and resume. – joshuar Dec 09 '15 at 23:57
  • @joshuar Note that this still decompresses and recompresses. There's no apparent way around that. – user207421 Dec 10 '15 at 00:03
  • The method visitFile could just compose a `List` and then you could do your own stop/resume file by file. Or make your own copy, by using your own BufferedInputStream, see Files. – Joop Eggen Dec 10 '15 at 00:04
  • @EJP, do you have a source that it does decompress and recompress? I don't think it's obvious from the code. – joshuar Dec 10 '15 at 00:13