3

I'm trying to launch my MPI application (Open MPI 1.4.5) with numactl. Since load balancing with --cpunodebind apparently doesn't distribute my processes round-robin among the available nodes, I want to restrict my processes to a fixed set of CPUs. This way I hope to balance the load between the nodes in terms of the number of threads running on each node. According to the numactl manual, --physcpubind seems to do the job.

The problem is that, from what I could gather from this post, processes bound with --physcpubind are still allowed to migrate within the CPU set. Another problem is that some CPUs in the set remain unused while others are assigned two or more processes, which then run at only 50% CPU usage or less. Why is this happening, and is there a workaround?

Kind regards

el_tenedor
  • I'm not sure I understand what you are trying to achieve. If you have several processes, why not bind each of them to a given node? If you have one process with several threads, you won't be able to specify something like a round-robin policy using numactl. You would need to use the numa library and do that from code (see http://linux.die.net/man/3/numa). – Alexandre de Champeaux Jun 10 '13 at 17:32
  • Open MPI supports processor and memory binding. See [this FAQ entry](http://www.open-mpi.de/faq/?category=tuning#using-paffinity-v1.4) for how to do it in your version of OMPI. – Hristo Iliev Jun 24 '13 at 22:06

1 Answer

0

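I think you can try this (it worked for me):

numactl --cpunodebind={numa-node} chrt -r 98 {your-app}

Note that --cpunodebind takes NUMA node numbers; to pin to specific cores, use --physcpubind={cpu-list} instead.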

The chrt command lets you set a scheduling policy. You can choose among the following:

Policy options:
 -b, --batch          set policy to SCHED_BATCH
 -d, --deadline       set policy to SCHED_DEADLINE
 -f, --fifo           set policy to SCHED_FIFO
 -i, --idle           set policy to SCHED_IDLE
 -o, --other          set policy to SCHED_OTHER
 -r, --rr             set policy to SCHED_RR (default)
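Since the question is about several MPI processes rather than one, the same idea can be applied per rank with a small wrapper. This is only a sketch: the script name bind_rank.sh and the helper core_for_rank are made up for illustration, and it assumes that Open MPI exports OMPI_COMM_WORLD_LOCAL_RANK to each process and that core IDs on every host run from 0 to ncores-1. How cores map to NUMA nodes differs between machines (numactl --hardware shows it), so the mapping may need adjusting to get a true round-robin over nodes.

```shell
#!/bin/sh
# bind_rank.sh (hypothetical name): pin each MPI rank to a single core.
# Assumptions: Open MPI exports OMPI_COMM_WORLD_LOCAL_RANK to every process
# it launches, and core IDs on each host run from 0 to ncores-1.

# Map a local rank to a core ID, wrapping around if ranks outnumber cores.
core_for_rank() {
    echo $(( $1 % $2 ))
}

# Only bind when started by mpirun with a command to run; otherwise do nothing.
if [ -n "${OMPI_COMM_WORLD_LOCAL_RANK:-}" ] && [ "$#" -gt 0 ]; then
    ncores=$(getconf _NPROCESSORS_ONLN)
    core=$(core_for_rank "$OMPI_COMM_WORLD_LOCAL_RANK" "$ncores")
    # One core per rank, so there is no other core in the set to migrate to.
    exec numactl --physcpubind="$core" --localalloc "$@"
fi
```

It would be started as something like mpirun -np 8 ./bind_rank.sh {your-app}. If a real-time policy is wanted as well, chrt -r 98 can be placed in front of "$@" in the exec line.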

EDIT: The number 98 is the priority; in my case I am running a time-critical process. You may also need to isolate the CPUs you are using (for example with the isolcpus kernel boot parameter) to prevent the scheduler from assigning other processes to them or moving yours away.
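To check whether a binding actually took effect, you can look at the CPUs a process is allowed to run on. A minimal sketch, assuming Linux, where /proc/<pid>/status contains a Cpus_allowed_list line; the function name allowed_cpus is made up for illustration:

```shell
#!/bin/sh
# Print the CPUs a process may run on, read from a /proc-style status file.
# Assumption: Linux, where /proc/<pid>/status has a Cpus_allowed_list line.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "$1"
}

# Example: show the affinity of the current shell, if /proc is readable.
if [ -r "/proc/$$/status" ]; then
    allowed_cpus "/proc/$$/status"
fi
```

For a running rank, call allowed_cpus /proc/{pid}/status with its process ID. chrt -p {pid} shows the scheduling policy and priority of the same process.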

Nicolasllk