
I'm developing C++ code on Linux that can run out of memory, go into swap, slow down significantly, and sometimes crash. I'd like to prevent this by allowing the user to specify a limit on the proportion of the total system memory that the process can use. If the program exceeds this limit, the code could output some intermediate results and terminate cleanly.

I can determine how much memory is being used by reading the resident set size from /proc/self/stat. I can then sum this across all the parallel processes to give me the total memory usage for the program.
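
For reference, a minimal sketch of reading the resident set size; I use /proc/self/statm here (whose second field is the RSS in pages) rather than parsing the longer /proc/self/stat line, but either works:

    // Minimal sketch: this process's resident set size (RSS) in bytes.
    // /proc/self/statm's second field is the RSS in pages.
    #include <fstream>
    #include <unistd.h>

    long resident_set_bytes()
    {
        std::ifstream statm("/proc/self/statm");
        long size_pages = 0, rss_pages = 0;
        statm >> size_pages >> rss_pages;   // total program size, resident set
        return rss_pages * sysconf(_SC_PAGESIZE);
    }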

The total system memory can be obtained via a call to sysconf(_SC_PHYS_PAGES), multiplied by the page size (see How to get available memory C++/g++?). However, if I'm running on a parallel cluster, then presumably this figure will only give me the total memory for the current cluster node. I may, for example, be running 48 processes across 4 cluster nodes (each with 12 cores).
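
For example (a small sketch; _SC_PHYS_PAGES gives a page count, so it needs multiplying by the page size):

    // Minimal sketch: total physical memory of the current node, in bytes.
    #include <unistd.h>

    long long total_physical_memory_bytes()
    {
        long pages     = sysconf(_SC_PHYS_PAGES);
        long page_size = sysconf(_SC_PAGESIZE);
        return static_cast<long long>(pages) * page_size;
    }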

So, my real question is: how do I find out which processor a given process is running on? I could then sum up the memory used by processes running on the same cluster node, compare this with the available memory on that node, and terminate the program if the usage exceeds a specified percentage on any of the nodes the program is running on. I would use sched_getcpu() for this, but unfortunately I'm compiling and running on a system with glibc 2.5, and sched_getcpu() was only introduced in glibc 2.6. And since the cluster is running an old Linux kernel (2.6.18), I can't use syscall() to call getcpu() either! Is there any other way to get the processor number, or some other identifier for the processor, so that I can sum the memory used on each processor separately?

Or is there a better way to solve the problem? I'm open to suggestions.

  • Did you try to use system limits (http://linux.die.net/man/2/setrlimit)? You would have to set limits on every node separately, but that shouldn't be a problem. – gawi Jul 14 '13 at 11:50
  • @gawi I'm not familiar with getrlimit/setrlimit, so no I hadn't tried it. Thanks for pointing that out. I guess that would allow me to detect whether any given process is getting close to the limit set by the kernel, but is there not still the possibility that no one process is approaching the limit, but that collectively the memory used by all the processes is too high? – GentleEarwig Jul 15 '13 at 16:24
  • Yes, this method allows you to set soft limits on resources for the current process and all child processes. However, if it's not usable for you, what do you use to distribute your computations? MPI? And why do you have to check resource usage manually? As markhahn mentioned, that's not the responsibility of the user but of the scheduling system. – gawi Jul 16 '13 at 09:14
  • @gawi Yes, I'm using MPI for parallel communication. I don't really want to mess with the resource allocation - as both you and markhahn mention, this is best left to the scheduling system. I just want to know if I am approaching the memory limits, so that I can do something about it in my code. At the very least, output an intermediate result, for example. Perhaps I didn't phrase my question properly. – GentleEarwig Jul 16 '13 at 18:50
  • I still don't understand why... If you have a memory limit set on the node, then you don't need to do additional checks - when you reach the limit you will get NULL when requesting additional memory and can handle it yourself as you like. Anyway, if this helps, try MPI_Get_processor_name (http://www.mcs.anl.gov/research/projects/mpi/www/www3/MPI_Get_processor_name.html) to determine the node on which your piece of code is working (see the sketch after these comments). – gawi Jul 16 '13 at 21:44
  • @gawi Thanks - this looks like what I was after. I have added a catch for bad_alloc in my code, but this will only happen after it has run out of swap. I was trying to detect that the program was going into swap and be able to react to this in my program. – GentleEarwig Jul 24 '13 at 10:20
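
Following gawi's suggestion above, here is a rough sketch of grouping per-process memory use by node with MPI_Get_processor_name(); the function names and the gather-to-rank-0 structure are my own illustration, not code from the discussion:

    // Sketch of gawi's suggestion: label each MPI rank with the node it runs on
    // via MPI_Get_processor_name(), then sum the per-rank RSS values by node name.
    // resident_set_bytes() is the /proc/self/statm reader sketched earlier.
    #include <mpi.h>
    #include <map>
    #include <string>
    #include <vector>

    long resident_set_bytes();  // from the earlier sketch

    std::map<std::string, long> per_node_rss(MPI_Comm comm)
    {
        int rank = 0, size = 0;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        char name[MPI_MAX_PROCESSOR_NAME] = {0};
        int name_len = 0;
        MPI_Get_processor_name(name, &name_len);

        long rss = resident_set_bytes();

        // Gather every rank's node name and RSS on rank 0.
        std::vector<char> names(size * MPI_MAX_PROCESSOR_NAME);
        std::vector<long> rss_all(size);
        MPI_Gather(name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR,
                   &names[0], MPI_MAX_PROCESSOR_NAME, MPI_CHAR, 0, comm);
        MPI_Gather(&rss, 1, MPI_LONG, &rss_all[0], 1, MPI_LONG, 0, comm);

        // Rank 0 sums memory use per node; other ranks return an empty map.
        std::map<std::string, long> totals;
        if (rank == 0)
            for (int i = 0; i < size; ++i)
                totals[&names[i * MPI_MAX_PROCESSOR_NAME]] += rss_all[i];
        return totals;
    }

Rank 0 can then compare each node's total against that node's physical memory and broadcast a "write results and stop" flag to the other ranks.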

1 Answer


A competently-run cluster will put your jobs under some form of resource limit (RLIMIT_AS or cgroups). You can do this yourself just by calling setrlimit(RLIMIT_AS, ...). I think you're overcomplicating things by worrying about sysconf: on a shared cluster, there's no reason to think your code should be using even a fixed fraction of the physical memory. Instead, choose a sensible memory requirement (if your cluster doesn't already impose one; most schedulers do memory scheduling reasonably well). Even if you insist on auto-sizing it yourself, you don't need to know which cores are in use: just figure out how many copies of your process are on the node and divide appropriately. (You will, of course, need to figure out which node (host) each process is running on.)
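
For completeness, a minimal sketch of the setrlimit(RLIMIT_AS, ...) call mentioned above; sizing the cap as a fraction of physical memory is only an illustration of the questioner's stated goal, not a recommendation:

    // Sketch of the setrlimit(RLIMIT_AS, ...) approach: cap this process's
    // address space so allocations beyond the cap fail instead of swapping.
    #include <sys/resource.h>
    #include <unistd.h>

    bool cap_address_space(double fraction_of_physical)
    {
        long long bytes = static_cast<long long>(sysconf(_SC_PHYS_PAGES))
                          * sysconf(_SC_PAGESIZE);
        rlimit lim;
        lim.rlim_cur = static_cast<rlim_t>(bytes * fraction_of_physical);  // soft limit
        lim.rlim_max = lim.rlim_cur;                                       // hard limit
        return setrlimit(RLIMIT_AS, &lim) == 0;
    }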

It's worth pointing out that the kernel enforces RLIMIT_AS, not RLIMIT_RSS, and that when you hit this limit, new allocations will fail.
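
In C++ that failure typically surfaces as std::bad_alloc from operator new, so a hypothetical refinement step (names are illustrative) could catch it and write out intermediate results:

    // Hypothetical refinement step: with RLIMIT_AS in place, an over-limit
    // allocation throws std::bad_alloc instead of pushing the node into swap.
    #include <cstddef>
    #include <new>
    #include <vector>

    bool try_grow(std::vector<double>& data, std::size_t extra)
    {
        try {
            data.resize(data.size() + extra);   // may throw when the limit is hit
            return true;
        } catch (const std::bad_alloc&) {
            // Write intermediate results and terminate cleanly here.
            return false;
        }
    }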

Finally, I question the design of a program that uses unbounded memory. Are you sure there isn't a better algorithm? Users are going to find your program pretty irritating to use if, after investing significant time in a computation, it decides to try allocating more and then fails. Sometimes people ask this sort of question when they mistakenly think that allocating as much memory as possible will give them better IO buffering (which is naive with respect to the page cache, etc.).

markhahn
  • Yes, as you point out, I need to figure out which node (host) each process is running on - that's exactly what I'm stuck on! Any suggestions? I am not trying to allocate a chunk of memory for my program to run in. I can't go into too much detail on my algorithm, I'm afraid, but I'm essentially refining a triangular surface mesh based on certain criteria. However, I don't know a priori how fine the surface needs to be. I've implemented a limit on the number of faces, but I wanted to do something more intelligent, based on the available hardware resources. Anyway, thanks for your help. – GentleEarwig Jul 15 '13 at 16:31