
I am trying to use the in-place MPI_Allreduce with the combination of MinGW-w64 gfortran (version 9.2, as provided by MSYS2) and Microsoft MPI (version 10):

call MPI_Allreduce(MPI_IN_PLACE, srcdst, n, MPI_REAL8, MPI_SUM, MPI_COMM_WORLD, ierr)

The standard MPI_Allreduce (with distinct source and destination) works well, as does the in-place variant when I use C instead of Fortran.

The complete test program test_allreduce.f90 is:

program test_allreduce

    use iso_fortran_env, only: real64
    use mpi

    implicit none

    integer, parameter :: mpiint = kind(MPI_COMM_WORLD)

    integer(mpiint) :: n = 10
    integer(mpiint) :: ierr1 = -1, ierr2 = -1, ierr3 = -1, ierr4 = -1

    real(real64) :: src(10) = (/ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 /)
    real(real64) :: dst(10) = 0

    call MPI_Init(ierr1)
    call MPI_Allreduce(src, dst, n, MPI_REAL8, MPI_SUM, MPI_COMM_WORLD, ierr2)
    call MPI_Allreduce(MPI_IN_PLACE, src, n, MPI_REAL8, MPI_SUM, MPI_COMM_WORLD, ierr3)
    call MPI_Finalize(ierr4)

    write (*, '(I4)') MPI_IN_PLACE
    write (*, '(4I4)') ierr1, ierr2, ierr3, ierr4
    write (*, '(10F4.0)') src
    write (*, '(10F4.0)') dst

end program

This is how I compile it:

set "PATH=C:\msys64\mingw64\bin;%PATH%"

x86_64-w64-mingw32-gfortran ^
    -fno-range-check ^
    "C:\Program Files (x86)\Microsoft SDKs\MPI\Include\mpi.f90" ^
    test_allreduce.f90 ^
    -I . ^
    -I "C:\Program Files (x86)\Microsoft SDKs\MPI\Include\x64" ^
    -o test_allreduce.exe ^
    C:\Windows\System32\msmpi.dll

And this is how I execute it (in a single process only so far):

test_allreduce.exe

Currently, it prints

   0
0   0   0   0
0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
1.  2.  3.  4.  5.  6.  7.  8.  9. 10.

Apparently, the src buffer gets overwritten by garbage (zeros, in this run) in the second (in-place) call to MPI_Allreduce.

I saw Intel-specific DLLIMPORT directives in the code of mpi.f90 and even attempted to add the analogous directive

!GCC$ ATTRIBUTES DLLIMPORT :: MPI_IN_PLACE

but without any effect.

  • Is the result any different if you use mpirun to launch the executable (as you should)? – Ian Bush Aug 17 '19 at 14:58
  • No, neither using `mpiexec -n 1 test_allreduce.exe` nor with larger process count makes any difference. – jacob Aug 17 '19 at 15:20
  • Works for me (gfortran7.4, open-mpi 2.1.1, Linux Mint 19) and I can't see anything that is likely to cause the problem in your code - though if it were me I would avoid using mpi_real8 and instead use something like MPI_Type_create_f90_real to get the handle for the real variables, but I will be amazed if this is the problem. – Ian Bush Aug 17 '19 at 15:27
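As an aside, the MPI_Type_create_f90_real alternative from Ian Bush's comment would look roughly like this (a sketch only; it queries the library for a datatype handle matching the real kind instead of hard-coding MPI_REAL8, and it is not expected to change the in-place behavior):

```fortran
program f90_real_kind
    use iso_fortran_env, only: real64
    use mpi

    implicit none

    integer :: rtype, ierr
    real(real64) :: src(10) = 1.0_real64, dst(10) = 0.0_real64

    call MPI_Init(ierr)

    ! Ask the MPI library for the datatype handle matching this real kind
    ! instead of hard-coding MPI_REAL8.
    call MPI_Type_create_f90_real(precision(1.0_real64), range(1.0_real64), &
                                  rtype, ierr)

    call MPI_Allreduce(src, dst, 10, rtype, MPI_SUM, MPI_COMM_WORLD, ierr)

    call MPI_Finalize(ierr)

    write (*, '(10F4.0)') dst
end program
```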

1 Answer


It turns out that the trouble is that in MS-MPI the variable MPI_IN_PLACE is contained in an internal COMMON block, /MPIPRIV1/, and there is a known gfortran bug whereby the compiler fails to properly import COMMON block variables from DLLs.

Nevertheless, broken things can be fixed. In the end, all that was needed was to apply a patch to the gfortran source, rebuild the compiler from scratch in MSYS2 (phew...), and add the directive

!GCC$ ATTRIBUTES DLLIMPORT :: MPI_BOTTOM, MPI_IN_PLACE

right after implicit none in the code presented above. (Both variables seem to be needed in the directive, because MPI_IN_PLACE comes second in the internal COMMON block, right after MPI_BOTTOM.) With that, the in-place MPI_Allreduce works flawlessly.
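For completeness, here is a condensed version of the question's test program with the directive in place (a sketch; it still requires the patched gfortran described above):

```fortran
program test_allreduce
    use iso_fortran_env, only: real64
    use mpi

    implicit none

    ! The fix: import the COMMON-block variables from msmpi.dll.
    ! MPI_BOTTOM must be listed as well, because it precedes MPI_IN_PLACE
    ! in the internal /MPIPRIV1/ COMMON block.
    !GCC$ ATTRIBUTES DLLIMPORT :: MPI_BOTTOM, MPI_IN_PLACE

    integer :: ierr
    real(real64) :: src(10) = (/ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 /)

    call MPI_Init(ierr)
    call MPI_Allreduce(MPI_IN_PLACE, src, 10, MPI_REAL8, MPI_SUM, &
                       MPI_COMM_WORLD, ierr)
    call MPI_Finalize(ierr)

    ! src now holds the reduced values instead of being clobbered.
    write (*, '(10F4.0)') src
end program
```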
