I've written some code that retrieves a zip file and unzips it into a directory. The zip file contains two folders, and each file is unzipped into the matching folder of that directory, depending on which folder it is in.

However, the code takes an awfully long time to run (approx. 10 minutes). The folders contain nearly 1000 files each, and the total size of the zip file is 5000 KB. I think it's slow because I'm creating the FileOutputStream and InputStream each time I enter the loop. However, I need to do this, as I don't know the output directory of a file until I read it from the zip file (i.e. find out which folder it is in).

Any suggestions?

/**
 * Retrieves and unzips a file from its URL
 */
public void retrieveFiles(String URL) {

    //Retrieve file from URL
    File zip = new File(getFile(URL));
    zip.mkdirs();

    try {
        //Create .zip file from file directory
        ZipFile zipFile = new ZipFile(zip);
        Enumeration<? extends ZipEntry> enumeration = zipFile.entries();

        //While zip file contains elements, get the next zipped file
        while (enumeration.hasMoreElements()) {
            ZipEntry zipEntry = (ZipEntry) enumeration.nextElement();

            //Ignore folders and other zip files
            if(!zipEntry.isDirectory() && !zipEntry.getName().endsWith(".zip")){

                //Find directory and filename for new unzipped file
                String directory = getURL(zipEntry.getName());
                String fileName = getFileName(zipEntry.getName());
                String fullDirectory = createDirectory(directory, fileName);

                //Unzip file and store in directory
                System.out.println("Unzipping file: " + fileName);
                FileOutputStream fout = new FileOutputStream(fullDirectory);
                InputStream in = zipFile.getInputStream(zipEntry);
                for (int c = in.read(); c != -1; c = in.read()) {
                    fout.write(c);
                }
                zipFile.getInputStream(zipEntry).close();
                in.close();
                fout.close();
            }
        }
        zipFile.close();
        System.out.println("Unzipping complete!");

        zip.delete();

    } catch (IOException e) {
        System.out.println("Unzip failed");
        e.printStackTrace();
    }
}

2 Answers

You're copying the file one byte at a time:

for (int c = in.read(); c != -1; c = in.read()) {
    fout.write(c);
}

You could try Apache's org.apache.commons.io.IOUtils.copy(), which copies in buffered chunks rather than one byte at a time. You can find it in commons-io.jar.
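If you'd rather not add a dependency, the same idea can be sketched with plain JDK streams. This is only a minimal illustration: the class name `StreamCopy`, the method name `copy` and the 8 KB buffer size are my own assumptions, not anything from your code or from commons-io.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper (name and buffer size are illustrative assumptions).
final class StreamCopy {

    /** Copies everything from in to out in chunks and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // arbitrary size; tune as needed
        long total = 0;
        int n;
        // read(byte[]) fills as much of the buffer as it can and returns -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        return total;
    }
}
```

In the loop from the question, a call such as `StreamCopy.copy(in, fout);` would take the place of the `for` loop that reads and writes a single byte per iteration. You would still open one output stream per file, but each read and write would move a whole buffer instead of one byte.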

pillingworth

Try loading the files into memory first, X MB at a time (whatever size you find suitable), and only then create the output stream and write to the file.
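A rough sketch of what I mean, assuming Java 7 or later for `java.nio.file.Files`. The class and method names (`InMemoryExtract`, `readAll`, `writeEntry`) and the 64 KB chunk size are made up for illustration, and this holds a whole entry in memory, so it only suits entries that fit comfortably in the heap.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (names and chunk size are illustrative assumptions).
final class InMemoryExtract {

    /** Reads the whole stream into memory, one chunk at a time. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream memory = new ByteArrayOutputStream();
        byte[] chunk = new byte[64 * 1024]; // arbitrary chunk size
        int n;
        while ((n = in.read(chunk)) != -1) {
            memory.write(chunk, 0, n);
        }
        return memory.toByteArray();
    }

    /** Loads the entry into memory first, then writes it to target in one call. */
    static void writeEntry(InputStream in, Path target) throws IOException {
        byte[] contents = readAll(in);
        Files.write(target, contents); // creates or overwrites the target file
    }
}
```

With something like this, the file on disk is only touched once per entry, after its contents are already in memory.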

WaelJ
  • Not sure if I know what you mean, but the zip file is downloaded into a temp location and then that file is used when unzipping the files. It doesn't take long to download the zip (getFile(URL) method) –  Sep 01 '11 at 14:02
  • I was thinking that maybe you could unzip the file one large chunk at a time and then write that chunk to file. That way you don't have to create a FileOutputStream for every file. – WaelJ Sep 01 '11 at 14:14