Accurate benchmarks require suppressing external influences as much as possible. If your system has enough CPU cores, you can isolate some of them using kernel parameters, which prevents other processes and kernel tasks from using those cores:
... isolcpus=3,4,5 nohz_full=3,4,5 rcu_nocbs=3,4,5 ...
Those parameters will almost completely isolate CPUs 3, 4, and 5: isolcpus prevents the OS scheduler from running processes on them by default, rcu_nocbs keeps the kernel RCU system from running its callbacks on them, and nohz_full suppresses the periodic scheduler timer ticks. Make sure that you do not isolate all CPUs!
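After rebooting with these parameters, it can be useful to confirm that the isolation took effect. The following is a minimal sketch, assuming a Linux system that exposes the standard sysfs file /sys/devices/system/cpu/isolated; the helper names parse_cpu_list and read_isolated_cpus are hypothetical, and the parser handles only plain lists and ranges such as "3-5" or "0,2-3".

```python
import os
from pathlib import Path

# Linux sysfs file listing the CPUs isolated via isolcpus (empty if none).
ISOLATED_PATH = Path("/sys/devices/system/cpu/isolated")


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0,2-3" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or stray separator
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_isolated_cpus(path=ISOLATED_PATH):
    """Return the isolated CPUs, or an empty set if the file is unavailable."""
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return set()


if __name__ == "__main__":
    isolated = read_isolated_cpus()
    total = os.cpu_count() or 0
    print("isolated CPUs:", sorted(isolated))
    # Guard against the mistake of isolating every CPU.
    if total and len(isolated) >= total:
        print("warning: every CPU is isolated")
```

An empty result means that either no CPUs are isolated or the file is not available on this kernel.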
You can now explicitly assign a process to those cores using taskset -c 3-5 ... or the mechanism built into the OpenMP runtime, e.g., export GOMP_CPU_AFFINITY="3,4,5" for GCC. Note that, even if you do not use dedicated isolated CPUs, simply turning on thread pinning with export OMP_PROC_BIND=true or by setting GOMP_CPU_AFFINITY (KMP_AFFINITY for Intel) should reduce the run-to-run variation in run time.
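When benchmarks are launched from a script, the pinning can be applied there instead of in the shell. The following is a minimal sketch, assuming the taskset utility is installed and the benchmark uses an OpenMP runtime that honors OMP_PROC_BIND; the helper names and the ./benchmark executable are placeholders, not part of any existing tool.

```python
import os
import subprocess


def pinned_command(cmd, cpus):
    """Prefix cmd with taskset so it may only run on the given CPUs."""
    cpu_list = ",".join(str(c) for c in cpus)
    return ["taskset", "-c", cpu_list, *cmd]


def pinned_env(cpus, base=None):
    """Copy the environment and enable OpenMP thread pinning."""
    env = dict(os.environ if base is None else base)
    env["OMP_PROC_BIND"] = "true"            # keep threads where they start
    env["OMP_NUM_THREADS"] = str(len(cpus))  # one thread per allowed CPU
    return env


def run_pinned(cmd, cpus):
    """Run cmd once, restricted to cpus, and return its exit code."""
    result = subprocess.run(pinned_command(cmd, cpus), env=pinned_env(cpus))
    return result.returncode


if __name__ == "__main__":
    # "./benchmark" is a placeholder for your own executable.
    print(pinned_command(["./benchmark"], [3, 4, 5]))
```

Calling run_pinned(["./benchmark"], [3, 4, 5]) would then start the benchmark on CPUs 3 to 5 with one pinned thread per CPU. The sketch sets only OMP_PROC_BIND and leaves GOMP_CPU_AFFINITY and KMP_AFFINITY untouched, so that the portable and the vendor-specific settings do not interact.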