-1

I need to create a large image (w <= 20k, h <= 100k). I divide it into fragments (w = 20k, h = 5k) and store them in the database in a loop, but I still get an out of memory error with a 1.5 gigabyte heap. Why is there a memory leak? According to my plan, only the current fragment, which weighs about 300 megabytes, should be kept in the heap.

@Transactional
public long createNewChart (int width, int height) {
    Chartographer chartographer = new Chartographer(width, height);
    chartRepo.save(chartographer);

    int number = 0;
    while (height >= 5000) {
        BufferedImage fragment = new BufferedImage(width, 5000, BufferedImage.TYPE_INT_RGB);
        fragmentRepo.save(new Fragment(imageOperator.imageAsBytes(fragment), number, chartographer));
        height -= 5000;
        number++;
    }
    if (height > 0) {
        BufferedImage fragment = new BufferedImage(width, height % 5000, BufferedImage.TYPE_INT_RGB);
        fragmentRepo.save(new Fragment(imageOperator.imageAsBytes(fragment), number, chartographer));
    }
    return chartographer.getCharId();
}

public byte[] imageAsBytes (BufferedImage image) {
    try(ByteArrayOutputStream stream = new ByteArrayOutputStream()) {
        ImageIO.write(image, "bmp", stream);
        return stream.toByteArray();
    } catch (IOException ex) {
        throw new RuntimeException(ex);
    }
}


Tidus213
  • Are you using a cached ORM framework like JPA? If yes, check whether the Fragment object is cached. – HUTUTU Feb 11 '22 at 10:30
  • Changing from `BufferedImage.TYPE_INT_RGB` to `TYPE_3BYTE_BGR` should shave off about a quarter of each `BufferedImage`. Also, if you can write directly to a stream into the database (or even disk) instead of the `ByteArrayOutputStream`/`byte[]` solution, that would probably save you a lot of memory. – Harald K Feb 11 '22 at 17:58

2 Answers

3

IMHO you are underestimating your heap space requirement.

  • For 20_000 by 5_000 pixels the BufferedImage needs at least 400 MB (100_000_000 pixels at 4 bytes per pixel)
  • ByteArrayOutputStream starts with a buffer size of 32 bytes and grows the buffer whenever there is not enough room for the data being written, at least doubling its size each time
    • in my test the final buffer size in the ByteArrayOutputStream was 536_870_912 bytes
    • because that size was probably reached by doubling the previous size, it temporarily needed 1.5 times that memory
  • the final byte array returned by toByteArray() needs another 300_000_054 bytes
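The growth behaviour can be checked with a quick sketch (assuming ByteArrayOutputStream's default policy of starting at 32 bytes and at least doubling on overflow; the payload size is the 24-bit BMP for a 20_000 x 5_000 fragment):

```java
public class BufferGrowth {
    public static void main(String[] args) {
        // BMP payload for 20_000 x 5_000 pixels: 3 bytes per pixel + 54-byte header
        long payload = 20_000L * 5_000L * 3 + 54;     // 300_000_054
        long capacity = 32;                           // ByteArrayOutputStream's initial buffer
        while (capacity < payload) {
            capacity *= 2;                            // buffer at least doubles on each overflow
        }
        System.out.println(payload + " " + capacity); // prints 300000054 536870912
    }
}
```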

The critical point for memory consumption is the line `return stream.toByteArray();`:

  • the BufferedImage cannot be garbage collected (400MB)
  • the ByteArrayOutputStream contains a buffer of about 540MB
  • the memory for the final byte array needs to be allocated (another 300MB)
  • giving a total of about 1240MB needed at that specific point for the image processing alone (not counting the memory that the rest of your application consumes)
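Adding up those three allocations confirms the rough total:

```java
public class PeakMemory {
    public static void main(String[] args) {
        long image  = 20_000L * 5_000L * 4;  // BufferedImage, 4 bytes per pixel
        long buffer = 536_870_912L;          // fully grown ByteArrayOutputStream buffer
        long copy   = 300_000_054L;          // byte[] returned by toByteArray()
        long total  = image + buffer + copy;
        System.out.println(total / 1_000_000 + " MB"); // prints 1236 MB
    }
}
```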

You can somewhat reduce the needed memory by presizing the ByteArrayOutputStream (by about 240MB):

public byte[] imageAsBytes (BufferedImage image) {
    int imageSize = image.getWidth()*image.getHeight()*3 + 54; // 54 bytes for the BMP header
    try (ByteArrayOutputStream stream = new ByteArrayOutputStream(imageSize)) {
        ImageIO.write(image, "bmp", stream);
        return stream.toByteArray();
    } catch (IOException ex) {
        throw new RuntimeException(ex);
    }
}
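As a side note, the `imageSize` formula above assumes each pixel row is already 4-byte aligned, which happens to be true for width 20_000 (20_000 * 3 = 60_000 is a multiple of 4). A general sketch of the exact 24-bit BMP size would include the row padding (the 54 bytes assume a plain BITMAPINFOHEADER with no palette):

```java
public class BmpSize {
    // Exact size of a 24-bit BMP: each pixel row is padded up to a multiple
    // of 4 bytes, plus the 54-byte file/info header.
    static long bmpSize(int width, int height) {
        long rowSize = ((width * 3L + 3) / 4) * 4;
        return rowSize * height + 54;
    }

    public static void main(String[] args) {
        System.out.println(bmpSize(20_000, 5_000)); // prints 300000054 (rows already aligned)
        System.out.println(bmpSize(3, 2));          // prints 78 (each 9-byte row padded to 12)
    }
}
```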

Another problem:

The method createNewChart() is marked as @Transactional. I don't know your database, the technology that you use for database access and transaction management (JPA?), or how the pictures are actually stored in the database (BLOB? base64 encoded?).

Most probably the whole database stack (and in this term I include the persistence framework like JPA too) keeps all the pictures in memory until the whole transaction is committed.

To verify this assumption you could remove the @Transactional attribute from createNewChart() so that the chart and the fragments are stored into the database in different transactions.
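To sketch what that could look like (assuming Spring, with `chartRepo`, `fragmentRepo` and `imageOperator` as in the question; each `save()` then runs in its own short transaction, and the loop also merges the two original branches):

```java
// Sketch, assuming Spring Data JPA: without @Transactional on this method,
// each repository save() commits on its own, so the persistence context
// no longer has to hold every fragment until one big final commit.
public long createNewChart(int width, int height) {
    Chartographer chartographer = new Chartographer(width, height);
    chartRepo.save(chartographer);

    for (int number = 0; height > 0; height -= 5000, number++) {
        int stripeHeight = Math.min(height, 5000);
        BufferedImage fragment = new BufferedImage(width, stripeHeight, BufferedImage.TYPE_INT_RGB);
        fragmentRepo.save(new Fragment(imageOperator.imageAsBytes(fragment), number, chartographer));
    }
    return chartographer.getCharId();
}
```

Note that without the surrounding transaction, a failure halfway through leaves a partially saved chart, so you may want to compensate for that (for example by deleting the chart and its fragments on error).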


So it seems that something in your application holds references to memory when you expect them to be freed. Diagnosing this just from the source code alone is (IMHO) impossible - it could be something that you have written, it could be something in a library that you use.

To detect what keeps those references, the first step is to create a heap dump when the OutOfMemoryError is thrown. You can do this by adding -XX:+HeapDumpOnOutOfMemoryError to the JVM options like this:

java -XX:+HeapDumpOnOutOfMemoryError -Xmx3G ..remaining start options..

This will produce a java_pid<somenumber>.hprof file when the OutOfMemoryError occurs.

The next step is then analyzing this file to find out what is referencing the memory. What works for me is the Eclipse Memory Analyzer (it is maintained by the Eclipse Foundation, but you can use it for any heap dump). Loading the previously generated heap dump into this tool gives a good overview of what is still referencing the images.

The next step would then be to find out why those references are still there.

Thomas Kläger
  • Ok, thanks. I couldn't figure out why it already fails for 20000x10000. But even so, one and a half gigabytes should be enough for one iteration of the loop, and it works fine for a 20000x5000 image. That is, I still don't understand why the garbage collector doesn't clean up the memory at the start of the next iteration, when the references are overwritten. – Tidus213 Feb 11 '22 at 11:17
  • 1
    To allocate an array of 536MB the JVM needs not only at least 536MB of free memory - the memory must also be available in one single chunk. It can be that during the first iteration a chunk of this size is available (because the JVM could just request additional memory from the OS) but in a later iteration no single chunk of this size is available. – Thomas Kläger Feb 11 '22 at 11:39
  • In this case, can you advise what is the best way to solve this problem? – Tidus213 Feb 11 '22 at 11:47
  • 1
    I see 3 possibilities: (1) trying to presize the `ByteArrayOutputStream` (on my machine this was enough to make it work with a 1.5GB heap), (2) reduce the height of the stripes (for example to 4000 pixels), (3) give the JVM more heap space (maybe 2 GB?) – Thomas Kläger Feb 11 '22 at 11:58
  • I tried giving it 3 gigabytes; still an error, now after creating 3 fragments. There is still a problem with the garbage collector, I think – Tidus213 Feb 11 '22 at 12:16
  • Or it could be the persistence layer that keeps everything in memory (because `createNewChart()` is marked as `@Transactional`), see my updated answer. – Thomas Kläger Feb 11 '22 at 12:59
  • No =( I am using postgres (added the db structure to the post). Instead of saving to the database, I tried saving the fragments as files. Everything worked well: 20 pictures of 20000x5000 were saved, each weighing 286 MB. So the problem comes from working with the DB. But I don't know why... – Tidus213 Feb 11 '22 at 14:06
  • @Tidus213 I've added hints how you can further analyse the problem because IMHO just looking at the source code is no longer enough - you need data from the running process to find out why the images are still referenced. – Thomas Kläger Feb 12 '22 at 09:15
0

If I am not mistaken, according to your code you are creating a new byte[] array (width by height % 5000) for every save. So the memory is allocated either way, and the garbage collector did not get a chance to remove the older byte[] arrays before the OutOfMemoryError.

You could try assigning the result of stream.toByteArray() to a local byte[] variable before calling fragmentRepo.save(); that way Java uses the same memory reference and overwrites it.

JCompetence
  • Why didn't the garbage collector have a chance to remove the older byte[]? – Tidus213 Feb 11 '22 at 10:00
  • https://stackoverflow.com/questions/1582209/java-garbage-collector-when-does-it-collect/47117385 – JCompetence Feb 11 '22 at 10:05
  • I think it might have something to do with the ORM framework cache. A local byte[] is only a reference. stream.toByteArray() always returns different objects for different images. – HUTUTU Feb 11 '22 at 10:36