I'm running several thousand experiments, some of which can throw an OutOfMemoryError. The problem is that the whole program stops when one such experiment throws this error. How can I make the program continue with the next experiment when such an error is thrown?

I'm thinking about catching the error and forcing the garbage collector to run (as discussed in "Is it possible to catch out of memory exception in java?"). Is this a good idea?

J. Schmidt
  • No, it is not a good idea. You may need to check whether your system has the resources to support thousands of experiments. – sanjeevRm Jul 16 '21 at 08:35
  • @sanjeevRm I'm running it on a dedicated computation server, with a 70 GB max heap size already, and the code itself is close to optimal in terms of memory usage. – J. Schmidt Jul 16 '21 at 09:49

1 Answer

*"Is this a good idea?"*

It is a bad idea.

In my answer to a different question, I explain some of the reasons why catching OutOfMemoryError may not allow the application to recover properly. (It depends on the nature of the application and the real reason it ran out of memory.)


*"I'm thinking about catching the error, and forcing the garbage collector"*

There is no point in "forcing" a garbage collection in your scenario. If one of your experiments fails with an OOME, you can be assured that the GC has just run ... and been unable to find enough free memory to continue. Now, between the OOME being thrown and you catching it, you would hope that some of the objects that were reachable via the experiment's stack frames are now unreachable. The JVM will deal with that ... by running the GC itself.


I think that a better way to solve your problem is to make your application restartable. Have it keep a record (in a file!) of the experiments that complete and those that fail. When an OOME occurs, record this in the file. Then you add a "restart with the next experiment" feature to your application, and write a light-weight wrapper script to run the Java application repeatedly until it completes.

By restarting in a new JVM, you avoid having to deal with the damage that OOMEs can cause; e.g. when you have multiple threads. And you also have a "bandaid" for OOMEs that are caused by memory leaks. Finally, you may well find that the experiments run faster in a clean / empty heap.
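
The restart scheme described above could also be written in Java rather than as a shell script. The following is only a minimal sketch under stated assumptions, not a definitive implementation: `ExperimentLauncher`, `ExperimentMain`, the `experiments.log` file name, and the one-line-per-experiment log format are hypothetical names chosen for illustration. It assumes that experiments run in a fixed order and that the child program appends one line to the log for each experiment it completes. `-XX:+ExitOnOutOfMemoryError` is a HotSpot option; other JVMs may not accept it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Collections;

public class ExperimentLauncher {

    /** Index of the first experiment without a log line (assumes one line per experiment, in order). */
    public static int nextExperiment(Path log) throws IOException {
        return Files.exists(log) ? Files.readAllLines(log).size() : 0;
    }

    /** Appends one "index status" line to the log, creating the file if needed. */
    public static void record(Path log, int index, String status) throws IOException {
        Files.write(log, Collections.singletonList(index + " " + status),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Starts a fresh child JVM that runs experiments from 'start' onward; returns its exit code. */
    public static int runChildFrom(int start, int total, Path log)
            throws IOException, InterruptedException {
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        Process child = new ProcessBuilder(
                java,
                "-XX:+ExitOnOutOfMemoryError",   // HotSpot-specific: exit the JVM on the first OOME
                "-cp", System.getProperty("java.class.path"),
                "ExperimentMain",                // hypothetical main class that runs the experiments
                String.valueOf(start), String.valueOf(total), log.toString())
                .inheritIO()
                .start();
        return child.waitFor();
    }

    public static void main(String[] args) throws Exception {
        int total = Integer.parseInt(args[0]);
        Path log = Paths.get("experiments.log");
        int next;
        while ((next = nextExperiment(log)) < total) {
            int exitCode = runChildFrom(next, total, log);
            int reached = nextExperiment(log);
            if (exitCode != 0 && reached < total) {
                // The child died (e.g. on an OOME) while running experiment 'reached':
                // record the failure here so the next child starts with the one after it.
                record(log, reached, "FAILED exit=" + exitCode);
            } else if (reached == next) {
                break; // clean exit but no progress; stop rather than loop forever
            }
        }
    }
}
```

Recording the failure in the launcher rather than in the child means the failing JVM does not have to do any work after the OOME. Each restart gets a clean heap, at the cost of JVM start-up time after every failure.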

Stephen C
  • As stated, I wish to get out of the current experiment if it throws an OutOfMemoryError and continue with the next one. Your answer seems to imply that I wish to recover memory somewhere so as to continue the current experiment, which is not the case. I think your suggestion to add a wrapper script is doable, but if possible I'd like to stay within the Java environment. – J. Schmidt Jul 16 '21 at 14:28
  • *"Your answer seems to imply that I wish to recover memory somewhere so as to continue the current experiment"*. That's not what I mean. I am talking about catching OOMEs in general. It applies to all ways that you could try to recover ... including what you are proposing to do. – Stephen C Jul 16 '21 at 14:33