I would like to run multiple MPI ranks on a single GPU (an NVIDIA K20c). I am aware of both MPS and Kepler's Hyper-Q.
My question is: is Hyper-Q by itself enough for this, or do I have to use MPS as well? According to the Hyper-Q link above, "No extra coding effort is necessary to enable Hyper-Q. All it takes is a Tesla K20 GPU with a CUDA 5 installation and setting an environment variable to let multiple MPI ranks share the GPU – Hyper-Q is then ready to use."
Does this mean that I don't need MPS at all?
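For reference, this is roughly the MPS setup I would be using otherwise. It is only a sketch of my understanding, not a verified recipe: the GPU index 0, the rank count of 4, and the binary name ./my_app are placeholders, and I am assuming an MPI launcher that provides mpirun.

```shell
# Sketch only: GPU index 0, 4 ranks, and ./my_app are placeholder assumptions.

# Make only the K20c visible to the MPS daemon and the ranks.
export CUDA_VISIBLE_DEVICES=0

# Put the GPU in exclusive-process mode so that only the MPS server owns it
# (requires root).
nvidia-smi -i 0 -c EXCLUSIVE_PROCESS

# Start the MPS control daemon in the background.
nvidia-cuda-mps-control -d

# Launch the MPI ranks; each rank then reaches the GPU through MPS.
mpirun -np 4 ./my_app

# Shut the MPS daemon down when finished.
echo quit | nvidia-cuda-mps-control
```

What I want to know is whether the daemon steps can be dropped entirely, so that the ranks simply share the GPU through Hyper-Q.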
P.S. I am also aware of the following question on a similar topic, but it does not seem to answer my question clearly: "Do I have to use the MPS (MULTI-PROCESS SERVICE) when using CUDA6.5 + MPI?"
Thanks.