
IOUtils.copy crashes when the file is larger than 200 MB, with this exception:

[ERR] Resource exhaustion event: the JVM was unable to allocate memory from the heap.

We are using Java 11 and have allocated 2.5 GB of memory to our app container.

My code:

        InputStream inputStreamm = null;

        // try-with-resources closes both the S3Object and its content stream,
        // and avoids a NullPointerException on close if getObject() throws
        try (S3Object s3Object = client.getObject(bucket, fileName);
             S3ObjectInputStream inputStream = s3Object.getObjectContent()) {
            // Buffers the whole object in memory
            ByteArrayOutputStream tempOutputStream = new ByteArrayOutputStream();
            IOUtils.copy(inputStream, tempOutputStream);
            // toByteArray() makes a second full copy of the data
            inputStreamm = new ByteArrayInputStream(tempOutputStream.toByteArray());
        } catch (Exception e) {
            LOG.error(e.getMessage());
        }

How can I solve this?

James Z
Babu
  • To avoid a large memory footprint try copying to a temporary file not `ByteArrayOutputStream` – DuncG Mar 25 '23 at 10:18

1 Answer


A heap dump would be the best way to confirm where the memory is going.

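You can (and probably should) work with the initial InputStream. The code does not make much sense to me: you open an InputStream, copy it into a ByteArrayOutputStream, and then call toByteArray(), which copies the buffer once more before it is wrapped in a ByteArrayInputStream. The entire content of the stream ends up in memory twice. For a 200 MB object that is at least 2 x 200 MB used by just those 2 arrays, and more at peak, because ByteArrayOutputStream grows by allocating a larger array and copying the old one into it. 400 MB of memory is huge, even more so in your case, where it is about 16% of the 2.5 GB available to the container.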
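To illustrate what working with the initial stream can look like, here is a minimal sketch that consumes an InputStream through a fixed-size buffer, so memory use stays constant regardless of the object size. The class name StreamingChecksum and the choice of a CRC32 as the "processing" step are assumptions for illustration only; the real processing depends on what you do with the data.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

// Hypothetical helper, not part of the original code
final class StreamingChecksum {

    // Reads the stream through one reusable 8 KB buffer,
    // so the full content is never held in memory at once
    static long crc32(InputStream in) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            crc.update(buffer, 0, read);
        }
        return crc.getValue();
    }
}
```

In the question's code this would be called with the stream returned by s3Object.getObjectContent(), inside the try-with-resources block.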

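Since tempOutputStream is only a temporary container for passing the data to a ByteArrayInputStream, you can create your own class extending ByteArrayOutputStream that hands out the underlying array without copying it. Note that the internal array is usually larger than the data written, so the stream built on it must be limited to the number of bytes actually written. It will probably help a little.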
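The subclass idea can be sketched as follows. The class name NoCopyByteArrayOutputStream and the method toInputStream() are hypothetical; the sketch relies only on the protected buf and count fields that ByteArrayOutputStream exposes to subclasses. Instead of overriding toByteArray() to return the raw array (which would include unused trailing capacity), it wraps just the first count bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

// Hypothetical subclass: exposes the internal buffer without copying it
final class NoCopyByteArrayOutputStream extends ByteArrayOutputStream {

    NoCopyByteArrayOutputStream(int initialCapacity) {
        super(initialCapacity);
    }

    // Wraps only the first `count` bytes of `buf`; no new array is allocated.
    // The returned stream shares the buffer, so do not write to this
    // output stream afterwards.
    synchronized InputStream toInputStream() {
        return new ByteArrayInputStream(buf, 0, count);
    }
}
```

This removes the second copy but still keeps one full copy of the object in memory. If the object size is known up front, passing it as the initial capacity should also avoid the repeated grow-and-copy steps, as long as the size fits in an int.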

My advice would be to work with the initial input stream - S3ObjectInputStream inputStream = s3Object.getObjectContent() - unless you are absolutely certain you need to keep everything in memory (are you reading the stream multiple times?), in which case there is little you can do aside from increasing the memory of the container. Alternatively, if your scenario allows it, you can write the content to a file on the file system and open a new input stream from that file whenever it's needed.
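The file-based option could look roughly like this. It is a minimal sketch using only java.nio.file; the class name TempFileSpool, the method name spoolToTempFile and the "s3-object-" file prefix are assumptions, and it presumes the container has enough writable disk space for the object.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original code
final class TempFileSpool {

    // Streams the content to a temporary file; only a small internal
    // buffer is held in memory while copying
    static Path spoolToTempFile(InputStream in) throws IOException {
        Path tempFile = Files.createTempFile("s3-object-", ".tmp");
        // createTempFile already created the file, so it must be replaced
        Files.copy(in, tempFile, StandardCopyOption.REPLACE_EXISTING);
        return tempFile;
    }
}
```

Each time the data is needed again, a fresh stream can be opened with Files.newInputStream(tempFile), and the file can be removed with Files.deleteIfExists(tempFile) once it is no longer used.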

Chaosfire