In this application I have groups of N (POSIX) threads. The first group starts up, creates an object A, and winds down. A little bit later a new group with N threads starts up, uses A to create a similar object B, and winds down. This pattern is repeated. The application is highly memory-intensive (A and B have a large number of malloc'ed arrays). I would like local access to memory as much as possible. I can use numactl --localalloc to achieve this, but in order for this to work I also need to make sure that those threads from the first and second group that work on the same data are bound to the same NUMA node. I've looked into sched_setaffinity, but wonder if better approaches exist.
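To make the sched_setaffinity route concrete, here is a minimal sketch of what it might look like on Linux with glibc. The helper names cpu_for_worker and pin_self_to_cpu are hypothetical, as is the assumption that worker i of every group handles the same slice of the data. Each worker pins itself to a CPU derived only from its index, so worker i of the second group runs on the same CPU (and therefore the same NUMA node) as worker i of the first group.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: choose the CPU for worker `idx` as the
 * (idx mod count)-th CPU in the allowed set. The result depends only on
 * idx and the set, so every thread group gets the same mapping.
 * Returns -1 if the set is empty or idx is negative. */
int cpu_for_worker(int idx, const cpu_set_t *allowed)
{
    int count = CPU_COUNT(allowed);
    if (count <= 0 || idx < 0)
        return -1;

    int target = idx % count;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, allowed) && target-- == 0)
            return cpu;
    }
    return -1;
}

/* Hypothetical helper: pin the calling thread to a single CPU.
 * With pid 0, sched_setaffinity applies to the calling thread.
 * Returns 0 on success, -1 on failure (errno is set). */
int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

In this sketch the main thread would read the allowed set once with sched_getaffinity(0, sizeof(set), &set) before creating any group, and each worker would call pin_self_to_cpu(cpu_for_worker(i, &set)) as its first action, before touching any memory. Pinning to a single CPU is stricter than binding to a NUMA node, which is part of why I wonder whether something better exists.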
The logic of the application is such that a solution without separate thread groups would tear apart the program logic. That is, a solution where a single group of threads manages first object A and later object B (without winding down in between) would be extremely contrived and obliterate the object-oriented layout of the code.