
I'm trying to run the Intel version of the HPL benchmark and I'm a bit confused by the options.

What I want to do (for now) is a single-node run. The node has 2x Xeon Platinum 8276 processors, so 56 cores total. So my PxQ should be 56.

However, the Intel docs say:

  • MPI_PROC_NUM should be equal to PxQ (i.e. 56) - this gets passed to mpirun -np
  • MPI_PER_NODE should be equal to the number of sockets in the system (i.e. 2) - this gets passed to mpirun -perhost

To me those don't seem consistent? And how does using OMP_NUM_THREADS fit into this?

  • It seems the doc suggests you run a hybrid MPI+OpenMP HPL. That means `OMP_NUM_THREADS=28` and `PxQ=2`. – Gilles Gouaillardet Oct 09 '20 at 14:08
  • Thank you. I was misled by expectations from the open-source HPL. In fact it turns out that with MKL it automatically picks the number of threads to use, using TBB rather than OpenMP, hence OMP_NUM_THREADS does nothing. Not obvious IMHO so might help someone! – lost Oct 13 '20 at 10:47
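
For anyone who wants to check the arithmetic behind the hybrid layout described in the comments, here is a rough Python sketch. It is only an illustration and not part of the Intel package: the helper names `hybrid_layout` and `grid_shapes` are made up, and it assumes one MPI rank per socket with the cores split evenly between sockets. As noted above, the Intel build seems to pick its own thread count, so the threads-per-rank figure describes the intended layout rather than something you have to set.

```python
def hybrid_layout(total_cores, sockets):
    """Return (mpi_ranks, threads_per_rank), assuming one MPI rank per socket."""
    if total_cores % sockets != 0:
        raise ValueError("cores must divide evenly across sockets")
    return sockets, total_cores // sockets


def grid_shapes(ranks):
    """List every (P, Q) pair with P * Q == ranks and P <= Q."""
    return [(p, ranks // p)
            for p in range(1, int(ranks ** 0.5) + 1)
            if ranks % p == 0]


# The node from the question: 2 sockets, 56 cores in total.
ranks, threads = hybrid_layout(total_cores=56, sockets=2)
print(ranks, threads)      # 2 ranks, 28 threads each
print(grid_shapes(ranks))  # grid choices for the hybrid run
print(grid_shapes(56))     # grid choices for a pure-MPI run, one rank per core
```

With 2 ranks the only grid is P=1, Q=2, which matches `PxQ=2` in the comment. The pure-MPI case with 56 ranks is what the open-source HPL habit would lead you to expect.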
