I am working on a feature for an LMS that downloads a set of selected files and folders as a zip built on the fly. I use ZipOutputStream to avoid OutOfMemory issues.

The feature works well, but we ran a stress test, and when several users download zips at the same time (say 10 users zipping about 100 MB each), all 4 CPUs stay at 100% load until the zips are created. Our system admins consider this unacceptable.

Is there some mechanism to make ZipOutputStream use fewer system resources, even if the zip takes longer to finish?

My current code:

protected void compressResource(ZipOutputStream zipOut, String collectionId, String rootFolderName, String resourceId) throws Exception
{
    if (ContentHostingService.isCollection(resourceId))
    {
        try
        {
            ContentCollection collection = ContentHostingService.getCollection(resourceId);
            List<String> children = collection.getMembers();
            if(children != null)
            {
                for(int i = children.size() - 1; i >= 0; i--)
                {
                    String child = children.get(i);
                    compressResource(zipOut,collectionId,rootFolderName,child);
                }
            }
        }
        catch (PermissionException e)
        {
            //Ignore
        }
    }
    else
    {
        try
        {
            ContentResource resource = ContentHostingService.getResource(resourceId);
            String displayName = isolateName(resource.getId());
            displayName = escapeInvalidCharsEntry(displayName);

            InputStream content = resource.streamContent();
            byte data[] = new byte[1024 * 10];
            BufferedInputStream bContent = null;

            try
            {
                bContent = new BufferedInputStream(content, data.length);

                String entryName = (resource.getContainingCollection().getId() + displayName);
                entryName=entryName.replace(collectionId,rootFolderName+"/");
                entryName = escapeInvalidCharsEntry(entryName);

                ZipEntry resourceEntry = new ZipEntry(entryName);
                zipOut.putNextEntry(resourceEntry); // A duplicate entry throws ZipException here.
                int bCount = -1;
                while ((bCount = bContent.read(data, 0, data.length)) != -1)
                {
                    zipOut.write(data, 0, bCount);
                }

                try
                {
                    zipOut.closeEntry();
                }
                catch (IOException ioException)
                {
                    logger.error("IOException when closing zip file entry",ioException);
                }
            }
            catch (IllegalArgumentException iException)
            {
                logger.error("IllegalArgumentException while creating zip file",iException);
            }
            catch (java.util.zip.ZipException e)
            {
                //Duplicate entry: ignore and continue.
                try
                {
                    zipOut.closeEntry();
                }
                catch (IOException ioException)
                {
                    logger.error("IOException when closing zip file entry",ioException);
                }
            }
            finally
            {
                if (bContent != null)
                {
                    try
                    {
                        bContent.close();
                    }
                    catch (IOException ioException)
                    {
                        logger.error("IOException when closing zip file",ioException);
                    }
                }
            }
        }
        catch (PermissionException e)
        {
            //Ignore
        }
    }
}

Thanks in advance.

  • You can use a semaphore to restrict the number of concurrent users. – shmosel Feb 09 '17 at 08:53
  • Don't allow so many concurrent zip processes to run at the same time. Use an executor to perform the zip task, and you can adjust the number of threads used for it. – Kayaman Feb 09 '17 at 08:53
  • Considering that you're controlling both read and write processes, `ZipOutputStream` has no relation to your issue, you can place `OutputStream` instead and the task won't change. Basically your question is similar to [this one](http://stackoverflow.com/questions/667508/whats-a-good-rate-limiting-algorithm). – user3707125 Feb 09 '17 at 09:06
  • @user3707125 Except that `OutputStream` is I/O bound and `ZipOutputStream` can be compute-bound. – user207421 Feb 09 '17 at 22:42
  • @OP How does using a `ZipOutputStream` prevent `OutOfMemoryErrors`? – user207421 Feb 09 '17 at 22:43
  • Thanks, @shmosel, the semaphore is an easy hack for this issue. – daniel.merino Feb 14 '17 at 09:28
  • @EJP I don't know the exact details but I have read that other zip methods hold the zip in memory and can cause OutOfMemory issues. StackOverflow is full of questions about these problems. – daniel.merino Feb 14 '17 at 09:28
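
Since the comments point out that `ZipOutputStream` can be compute-bound, one option to consider alongside limiting concurrency is lowering the compression level. The following is a minimal sketch, not the code from the question: the class `ZipLevelSketch` and its `zip` helper are hypothetical names, and it buffers into a `ByteArrayOutputStream` only to stay self-contained, whereas the real feature would keep streaming to the response. `ZipOutputStream.setLevel` and the `Deflater` constants are standard `java.util.zip` API. How much CPU this saves depends on the content, so it should be measured.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: zips a single entry at a caller-chosen compression level.
final class ZipLevelSketch {

    static byte[] zip(String entryName, byte[] content, int level) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zipOut = new ZipOutputStream(buffer)) {
            // Deflater.BEST_SPEED (1) trades compression ratio for less CPU work;
            // Deflater.NO_COMPRESSION (0) skips compression almost entirely.
            zipOut.setLevel(level);
            zipOut.putNextEntry(new ZipEntry(entryName));
            zipOut.write(content);
            zipOut.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] content = new byte[100_000]; // highly compressible sample data
        System.out.println("BEST_SPEED size:     " + zip("a.bin", content, Deflater.BEST_SPEED).length);
        System.out.println("NO_COMPRESSION size: " + zip("a.bin", content, Deflater.NO_COMPRESSION).length);
    }
}
```

In the question's code, the equivalent change would be a single `zipOut.setLevel(...)` call made once, before the first `putNextEntry`. Files that are already compressed (PDF, JPEG, video) gain little from higher levels anyway.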

1 Answer

I solved it with a simple hack suggested by @shmosel.

private static Semaphore mySemaphore= new Semaphore(ServerConfigurationService.getInt("content.zip.download.maxconcurrentdownloads",5),true);

(...)

ZipOutputStream zipOut = null;
    try
    {
        mySemaphore.acquire();
        ContentCollection collection = ContentHostingService.getCollection(collectionId);

(...)

zipOut.flush();
zipOut.close();
mySemaphore.release();

(...)

This is working on my test server. But if anybody has any objection or extra advice, I will be happy to hear it.
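
One thing worth checking in the fragments above: as shown, `mySemaphore.release()` runs only after `zipOut.close()` succeeds. If the elided code does not already release in a `finally` block, an exception during zipping (for example a client aborting the download) would leak a permit, and after enough failures no download could start. Below is a minimal sketch of the safer shape. `ThrottledZip`, `writeZip`, and the hard-coded limit of 5 are hypothetical stand-ins; the real code reads the limit from `ServerConfigurationService` and walks the content collection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.Semaphore;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical wrapper showing acquire / try / finally-release around the zip work.
final class ThrottledZip {

    // Assumed limit; the answer reads "content.zip.download.maxconcurrentdownloads" instead.
    private static final int MAX_CONCURRENT_ZIPS = 5;
    private static final Semaphore ZIP_PERMITS = new Semaphore(MAX_CONCURRENT_ZIPS, true);

    static void writeZip(OutputStream target, String entryName, byte[] content)
            throws IOException, InterruptedException {
        // Acquire outside the try block, so the finally clause never
        // releases a permit that was not actually obtained.
        ZIP_PERMITS.acquire();
        try (ZipOutputStream zipOut = new ZipOutputStream(target)) {
            zipOut.putNextEntry(new ZipEntry(entryName));
            zipOut.write(content);
            zipOut.closeEntry();
        } finally {
            // Runs after zipOut is closed, on both the success and failure paths.
            ZIP_PERMITS.release();
        }
    }

    static int availablePermits() {
        return ZIP_PERMITS.availablePermits();
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeZip(out, "hello.txt", "hello".getBytes("UTF-8"));
        System.out.println("zip bytes: " + out.size() + ", permits free: " + availablePermits());
    }
}
```

If blocking request threads indefinitely is a concern, `Semaphore.tryAcquire(long, TimeUnit)` would let the request fail with a "server busy" message after a timeout instead of waiting.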