Good morning!
I am looking for a Linux tool that can help to analyze bottlenecks caused by MPI communication calls.
The code (C++) is highly parallel and runs on many compute nodes connected by a fast network.
It doesn't use GPUs; all computations run on CPUs.
What I need is to find out if and when some MPI processes spend time waiting for data from other MPI processes. This could be caused, e.g., by differences in node hardware.
I am currently NOT trying to profile, say, single-core code efficiency; I'm only interested in the bottlenecks caused by MPI communication calls. In other words, I am trying to analyze and improve performance scaling for a large number of cores/nodes.
Thank you very much.