I am programming on a Knights Landing node, which has 68 cores and 4 hyperthreads per core. I am working on a hybrid MPI/OpenMP application. My question is whether the 4 hyperthreads per core are meant to be used as OpenMP threads and, if not, how I should use them. When I run my program using the following scheme:
export OMP_NUM_THREADS=1
mpirun -np 68 ./app
it runs much faster than when I use the scheme:
export OMP_NUM_THREADS=4
mpirun -np 68 ./app
Maybe the problem is that the threads of a given MPI rank are not placed close to each other. However, I don't know how to control the placement.
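For reference, this is a minimal sketch of the kind of setup I have in mind, not something I have confirmed works. It assumes my OpenMP runtime honors the standard affinity variables OMP_PLACES and OMP_PROC_BIND. The RANKS and TOTAL_THREADS variables are only there to illustrate the arithmetic. Whether mpirun also needs an explicit binding option depends on the MPI library, so I left the launch line as it was.

```shell
# Assumption: the OpenMP runtime supports the standard OpenMP 4.0 affinity variables.
export OMP_NUM_THREADS=4      # one OpenMP thread per hyperthread of a core
export OMP_PLACES=threads     # each place is a single hardware thread
export OMP_PROC_BIND=close    # keep a rank's threads on neighbouring places

# Illustrative bookkeeping: 68 ranks x 4 threads should cover all hardware threads.
RANKS=68
TOTAL_THREADS=$((RANKS * OMP_NUM_THREADS))
echo "ranks=$RANKS threads_per_rank=$OMP_NUM_THREADS total=$TOTAL_THREADS"

# Launch line, unchanged from above. If the MPI library binds each rank to a
# single hardware thread, the 4 OpenMP threads would share it, so a per-core
# binding option (library specific, not shown here) may be needed.
# mpirun -np "$RANKS" ./app
```

With 68 ranks and 4 threads each, this gives 272 threads in total, which matches the 68 x 4 hardware threads on the node.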
In summary, can I use the 4 hyperthreads/core as OpenMP threads?
Thanks.