
I'm getting a java.lang.OutOfMemoryError when I try to download large files (>200MB) from the web application that I'm working on.

The download flow is as follows:

Main method:

public byte[] getFileBytes(@RequestBody ZeusRequestVO<String> request) {
    return documentService.downloadFileByChunks(request).toByteArray();
}

Download logic:

public ByteArrayOutputStream downloadFileByChunks(String blobName) {
    long file_size = 0;
    long chunkSize = 10 * 1024 * 1024;
    CloudBlockBlob blob = connectAndgetCloudBlockBlob(blobName);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try {
        if (blob.exists()) {
            blob.downloadAttributes();
            file_size = blob.getProperties().getLength();
            for (long i = 0; i < file_size; i += chunkSize) {
                blob.downloadRange(i, file_size, baos);
            }
        }
    } catch (StorageException e) {
        throw new GenericException(e, BusinessErrorEnum.AZURE_BLOB_STORAGE_EXCEPTION);
    }
    return baos;
}

I already added -Xms and -Xmx settings to my app, and that works as long as files don't exceed 200MB; in fact, initially the web app couldn't download files larger than 30MB until the -Xms and -Xmx configuration was added.
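As a quick sanity check on whether the -Xmx setting is actually taking effect, the JVM can report its own heap limits at runtime. This is a minimal, standalone sketch, not part of the app above:

```java
public class HeapCheck {
    public static void main(String[] args) {
        // maxMemory() reflects the effective -Xmx;
        // totalMemory() is the heap currently allocated by the JVM
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        long totalMb = Runtime.getRuntime().totalMemory() / (1024 * 1024);
        System.out.println("max heap: " + maxMb + " MB, current heap: " + totalMb + " MB");
    }
}
```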

I saw a solution here, but I'm not able to update or add libraries beyond the existing ones (company policy).

Any advice?

marc_s

2 Answers


One way to avoid this error is to download the file in smaller chunks and write it directly to disk instead of keeping it all in memory. You can do this by replacing the ByteArrayOutputStream with a FileOutputStream that writes to a temporary file on disk. Here’s an example:

public void downloadFileByChunks(String blobName, String filePath) {
    long chunkSize = 10 * 1024 * 1024;
    CloudBlockBlob blob = connectAndgetCloudBlockBlob(blobName);
    try (FileOutputStream fos = new FileOutputStream(filePath)) {
        if (blob.exists()) {
            blob.downloadAttributes();
            long fileSize = blob.getProperties().getLength();
            for (long i = 0; i < fileSize; i += chunkSize) {
                blob.downloadRange(i, Math.min(chunkSize, fileSize - i), fos);
            }
        }
    } catch (StorageException | IOException e) {
        throw new GenericException(e, BusinessErrorEnum.AZURE_BLOB_STORAGE_EXCEPTION);
    }
}

This method takes in an additional parameter filePath, which specifies where on disk to save the downloaded file. The method downloads the file in chunks and writes each chunk directly to disk using a FileOutputStream. This way, you can avoid keeping the entire file in memory and reduce the risk of running into an OutOfMemoryError.
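The memory behavior this relies on can be seen in a minimal, self-contained sketch using plain streams (no Azure dependency; the class name and sizes below are made up for illustration). Only one fixed-size buffer lives in memory at a time, no matter how large the source is:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkedCopyDemo {

    // Copies the input to the output in fixed-size chunks, so at most one
    // chunk is held in memory at a time. Returns the number of chunks copied.
    public static int copyInChunks(InputStream in, OutputStream out, int chunkSize)
            throws IOException {
        byte[] buffer = new byte[chunkSize];
        int chunks = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[25 * 1024];                 // stand-in for the blob
        InputStream source = new ByteArrayInputStream(data);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int chunks = copyInChunks(source, sink, 10 * 1024); // 10 KB chunks
        System.out.println(chunks + " chunks, " + sink.size() + " bytes");
        // prints: 3 chunks, 25600 bytes
    }
}
```

The same principle is what makes the FileOutputStream version above safe: the chunk size, not the file size, bounds heap usage.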

Ramprasad
  • This solution is fine when the downloaded files are stored in some physical location, but I need to implement a solution that downloads the files through the browser. How can I implement a similar solution with the steps that I use to download files? – Gonzalo León Mar 30 '23 at 17:43

Finally solved it; my solution was a mix of Ramprasad's answer and this answer from another question.

Basically, the solution downloads the file to a physical location, in this case the cluster where the app lives.

I changed the main method:

public byte[] getFileBytes(@RequestBody ZeusRequestVO<String> request) {
    return documentService.downloadFileByChunks(request).toByteArray();
}

To this:

public void getFileBytes(@RequestBody ZeusRequestVO<String> request, HttpServletResponse response) {
    FileInputStream fis = null;
    String fileName = null;
    try {
        // Downloads the blob to a local file and returns its path
        fileName = documentService.getFileBytes(request);
        fis = new FileInputStream(fileName);
        // Streams the file to the browser without loading it all into memory
        IOUtils.copy(fis, response.getOutputStream());
        response.flushBuffer();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (ValidationUtil.isNotNull(fis)) {
                fis.close();
            }
            // Guard against fileName being null if the download failed early
            if (ValidationUtil.isNotNull(fileName)) {
                File file = new File(fileName);
                if (file.exists()) {
                    file.delete();
                }
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}

The logic is: download the file to a physical location, stream it to the browser from there (so the whole file is never held in memory), and finally delete the created file.
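The stream-then-delete step can also be sketched with java.nio.file.Files, which copies with a small internal buffer. This is a self-contained illustration, not the app's actual API: the servlet response is stood in for by a plain OutputStream, and the class and method names are made up:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamAndDeleteDemo {

    // Streams the file at filePath to the given output stream, then deletes it.
    // Files.copy reads in small internal buffers, so memory stays bounded;
    // the delete runs in finally so the temp file never lingers.
    public static void streamAndDelete(String filePath, OutputStream out) throws IOException {
        Path path = Paths.get(filePath);
        try {
            Files.copy(path, out);
            out.flush();
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("blob-", ".bin");
        Files.write(tmp, new byte[1024]);                     // fake downloaded file
        java.io.ByteArrayOutputStream sink = new java.io.ByteArrayOutputStream();
        streamAndDelete(tmp.toString(), sink);
        System.out.println(sink.size() + " bytes sent, file exists: " + Files.exists(tmp));
        // prints: 1024 bytes sent, file exists: false
    }
}
```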