
How to find the available collective algorithms for Broadcast in Intel MPI

With OpenMPI, we can list all the available MPI collective algorithms for Broadcast using

ompi_info --param coll tuned -l 9 | grep 'bcast algorithm'

1. Binomial
2. Recursive doubling
3. Ring
4. Topology aware binomial
5. Topology aware recursive doubling
6. Topology aware ring
7. Shumilin's
8. Knomial
9. Topology aware SHM-based flat
10. Topology aware SHM-based Knomial
11. Topology aware SHM-based Knary
12. NUMA aware SHM-based (SSE4.2)
13. NUMA aware SHM-based (AVX2)
14. NUMA aware SHM-based (AVX512)

In the case of Intel MPI, impi_info only shows the range of preset ids available for each collective operation, not the algorithm behind each id.

impi_info -v I_MPI_ADJUST_BCAST

I_MPI_ADJUST_BCAST
  MPI Datatype:
    MPI_CHAR
  Description:
    Control selection of MPI_Bcast algorithm presets.
    Arguments
    <algid> - Algorithm identifier
    range: 0-18
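
Since impi_info reports only the id range, one possible workaround is to sweep every id and compare timings yourself. The following is a minimal sketch, not an official Intel procedure: it assumes the `range: LO-HI` line keeps the format shown above, and the helper name `algids_from_info` and the benchmark binary `./bcast_bench` are hypothetical.

```shell
#!/bin/sh
# Read impi_info output on stdin and print every algorithm id
# covered by its "range: LO-HI" line, one id per line.
# Assumption: the line format matches the output shown above.
algids_from_info() {
    awk '$1 == "range:" {
        split($2, r, "-")                          # "0-18" -> r[1]=0, r[2]=18
        for (i = r[1] + 0; i <= r[2] + 0; i++) print i
    }'
}

# Usage sketch (left commented out because it needs an Intel MPI install;
# ./bcast_bench is a hypothetical benchmark binary):
#
# for id in $(impi_info -v I_MPI_ADJUST_BCAST | algids_from_info); do
#     echo "algid=$id"
#     I_MPI_ADJUST_BCAST=$id mpirun -n 4 ./bcast_bench
# done
```

This does not reveal the algorithm names, but it at least shows which ids are accepted and how they perform for a given message size.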

Q) Is there any way to list all the algorithms for MPI_Bcast? (Even ompi_info --all does not show this information.)


Output of "impi_info -v I_MPI_ADJUST_ALLREDUCE"

I_MPI_ADJUST_ALLREDUCE
  MPI Datatype:
    MPI_CHAR
  Description:
    Control selection of MPI_Allreduce algorithm presets.
    Arguments
    <algid> - Algorithm identifier
    range: 0-26

Output of "impi_info -v I_MPI_ADJUST_ALLREDUCE -all"

I_MPI_ADJUST_ALLREDUCE
  MPI Datatype:
    MPI_CHAR
  Description:
    Control selection of MPI_Allreduce algorithm presets.
    Arguments
    <algid> - Algorithm identifier
    range: 0-26

Output of "impi_info -v I_MPI_ADJUST_ALLREDUCE -e"

I_MPI_ADJUST_ALLREDUCE
  MPI Datatype:
    MPI_CHAR
  Description:
    Control selection of MPI_Allreduce algorithm presets.
    Arguments
    <algid> - Algorithm identifier
    range: 0-26
Khalid Bin Huda
  • OpenMPI is based on a modular component architecture (MCA). Multiple implementations of a component are possible for the same type (as the provided list shows). That being said, not all MPI implementations are modular. In fact, AFAIK, OpenMPI is the only one to be so modular. Other MPI implementations can use one big algorithm composed of multiple parts corresponding to the different cases, with a hand-written selection of the "right" part based on the context (e.g. input size). This is less flexible but possibly a bit faster. IDK what Intel MPI actually does since it is closed-source. – Jérôme Richard Dec 23 '22 at 04:17
  • 1
    @JérômeRichard Intel MPI may be closed source, but they still list all the algorithms that are available. See my answer. – Victor Eijkhout Dec 23 '22 at 04:44

1 Answer


This page lists all the available Intel MPI collective variants:

https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/environment-variable-reference/i-mpi-adjust-family-environment-variables.html

Victor Eijkhout
  • This link does not list all the available variants. For example, "I_MPI_ADJUST_ALLREDUCE" has 28 variants but the page lists only 12 of them. (The 'impi_info -v I_MPI_ADJUST_ALLREDUCE' command shows the total number of variants for allreduce.) – Khalid Bin Huda Dec 23 '22 at 09:32
  • @KhalidBinHuda Can you add the output of `impi_info -v I_MPI_ADJUST_ALLREDUCE` to your question? Does adding `-all` or `-e` change anything? – Jérôme Richard Dec 23 '22 at 14:57
  • 1
    @JérômeRichard, I have added the outputs. The output of "impi_info -v I_MPI_ADJUST_ALLREDUCE" with -e or -all is the same. – Khalid Bin Huda Dec 23 '22 at 15:36