
Compiler: gfortran-4.8.5

MPI library: OpenMPI-1.7.2 (preinstalled OpenSuSE 13.2)

This program:

  use mpi
  implicit none

  real*16 :: x
  integer :: ierr, irank, type16

  call MPI_Init(ierr)

  call MPI_Comm_Rank(MPI_Comm_World, irank, ierr)

  if (irank+1==1) x = 2.1
  if (irank+1==8) x = 2.8
  if (irank+1==7) x = 5.2
  if (irank+1==4) x = 6.7
  if (irank+1==6) x = 6.5
  if (irank+1==3) x = 5.7
  if (irank+1==2) x = 4.0
  if (irank+1==5) x = 6.8

  print '(a,i0,a,f3.1)', "rank+1: ",irank+1," x: ",x

  call MPI_AllReduce(MPI_IN_PLACE, x, 1, MPI_REAL16, MPI_MAX, MPI_Comm_World, ierr)

  if (irank==0) print '(i0,a,f3.1)', irank+1," max x: ", x

  call MPI_Finalize(ierr)
end

I also tried real(16) and real(kind(1.q0)). real(real128) is actually equivalent to real*10 for this compiler.

The result is:

> mpif90 reduce16.f90 
> mpirun -n 8 ./a.out 
rank+1: 1 x: 2.1
rank+1: 2 x: 4.0
rank+1: 3 x: 5.7
rank+1: 4 x: 6.7
rank+1: 5 x: 6.8
rank+1: 6 x: 6.5
rank+1: 7 x: 5.2
rank+1: 8 x: 2.8
1 max x: 2.8

The program finds the true maximum for real*10 while keeping MPI_REAL16. The MPI specification (3.1, pages 628 and 674) is not clear about whether MPI_REAL16 corresponds to real*16 or to real(real128) when the two differ.
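One way to probe what the library actually maps MPI_REAL16 to (a diagnostic sketch, not from the original post; the program name is made up) is to compare the size MPI reports for the datatype with the compiler's storage size of real*16:

  program check_real16
    use mpi
    implicit none
    integer :: ierr, tsize
    real*16 :: x
    call MPI_Init(ierr)
    ! MPI_Type_size reports the extent the library assigns to MPI_REAL16;
    ! storage_size (Fortran 2008) gives the compiler's size in bits.
    call MPI_Type_size(MPI_REAL16, tsize, ierr)
    print '(a,i0,a,i0,a)', 'MPI_REAL16: ', tsize, ' bytes, real*16: ', &
          storage_size(x)/8, ' bytes'
    call MPI_Finalize(ierr)
  end program check_real16

If the two sizes differ, MPI_REAL16 clearly does not match real*16 on this system; note that equal sizes alone do not prove the representations agree (an 80-bit real*10 may be padded to 16 bytes in storage).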

Also, assuming MPI_REAL16 actually corresponds to real(real128), trying to use that kind in a program leads to a different problem:

Error: There is no specific subroutine for the generic 'mpi_recv' at (1)
Error: There is no specific subroutine for the generic 'mpi_send' at (1)

which does not happen for real*16. (Disregarding that one should be able to pass any bit pattern, so this check is arguably superfluous.)

What is the right way to use 16 byte reals? Is the OpenMPI library in error?

  • You could try to use mpi_type_create_f90_real to get the same datatype as you would get from the selected_real_kind. Though some MPI library versions didn't resolve that properly, I would still say this is the most portable and advised approach. See http://www.open-mpi.org/~jsquyres/www.open-mpi.org/doc/v1.7/man3/MPI_Type_create_f90_real.3.php – haraldkl Oct 13 '15 at 18:35
  • See also this ticket: https://github.com/open-mpi/ompi/issues/63 – haraldkl Oct 13 '15 at 18:46
  • @francescalus I am not sure if mpi_f08 is supported for gfortran at all, given the non-standard array descriptor. I definitely don't have it now, but I could try to compile it myself. The exact values of the constants are not important, they are just copied and pasted from an actual computation. – Vladimir F Героям слава Oct 13 '15 at 18:55
  • @haraldkl I am getting `MPI_ERR_ARG: invalid argument of some other kind` from `call MPI_TYPE_CREATE_F90_REAL(25, MPI_UNDEFINED, type16, ierr)` where `type16` and `ierr` are integers. – Vladimir F Героям слава Oct 13 '15 at 19:07
  • Yes, can confirm. Also not working with MPICH, there it complains that the precision is not available. I'd guess this is an unresolved issue, and we are stuck with the status in the ticket linked above. – haraldkl Oct 13 '15 at 19:34
  • Yes, the last message (on the top) in the github issue describes it well, it seems to use just the last value. – Vladimir F Героям слава Oct 13 '15 at 19:40
  • I observe the same behavior with gfortran 5.2.0 and OpenMPI 1.10.0 – casey Oct 14 '15 at 02:55
  • You should post that question to the ompi-devel mailing list. – Hristo Iliev Oct 14 '15 at 11:27
  • I don't know if people in OpenMPI actually care about Fortran given the documentation still shows `include 'mpif.h'` before every subroutine call when MPI-3 already brought `use mpi_f08`. – Vladimir F Героям слава Oct 14 '15 at 21:55
  • They do care. The library already implements the `use mpi_f08` interface in the release version and the `use mpi` and `use mpi_f08` syntaxes are already part of the man pages in the GitHub master. (Truth be told, the commit is from 2 days ago, but better late than never.) – Hristo Iliev Oct 14 '15 at 22:47
  • It would be good to follow the advice by HristoIliev, the developers are more inclined to look into issues if there are actually people showing interest in a given feature. If nobody shows interest, there is no motivation in working on improvements. As you can see in the trouble ticket @JeffSquyres has a repository dedicated to this issue: https://bitbucket.org/jsquyres/fortran-real16/, so even if you don't get feedback immediately, a raised awareness might help in the long run. – haraldkl Oct 15 '15 at 05:22
  • The development of Open MPI is extremely pragmatically-driven. They seem to be short on contributors and tend to implement features or fixes that benefit the largest and most vocal parts of their user base only. And having the discussions move away from their mailing lists to places like Stack Overflow is not helping in that regard too. – Hristo Iliev Oct 15 '15 at 08:24
  • This won't solve your problem, but shouldn't you write your long constants using `q` rather than `e`? Otherwise they will be truncated to single precision. I can also reproduce your problem on openmpi 1.10.0, with and without the `q`. – amaurea Oct 18 '15 at 17:38
  • The exact numbers are irrelevant. Besides, the `q` is completely non-standard. The kind notation could be used, if I cared about those numbers. – Vladimir F Героям слава Oct 18 '15 at 18:35
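The mpi_type_create_f90_real route suggested in the first comment can be sketched as follows (a hedged illustration, not from the original post; as the later comments note, this call fails with invalid-argument or precision-not-available errors on the OpenMPI and MPICH versions tried here):

  program f90_real_type
    use mpi
    implicit none
    integer, parameter :: qp = selected_real_kind(25)
    real(qp) :: x
    integer :: ierr, type16
    call MPI_Init(ierr)
    ! Ask MPI for a datatype matching this kind's decimal precision and
    ! exponent range; the result could then replace MPI_REAL16.
    call MPI_Type_create_f90_real(precision(x), range(x), type16, ierr)
    call MPI_Finalize(ierr)
  end program f90_real_type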

1 Answer


While this should just work correctly in every MPI implementation, a straightforward workaround is to implement a user-defined reduction for this type in Fortran. That sidesteps any issues with implementing it in C (MPICH and OpenMPI try to do everything in C, hence the problems when C cannot reproduce the behavior of Fortran).

Below is an attempt at such a user-defined reduction in Fortran. I am certain that experienced modern Fortran programmers can do it better.

  subroutine sum_real16(iv, iov, n)
    implicit none
    integer, intent(in) :: n
    real*16, intent(in) :: iv(n)
    real*16, intent(inout) :: iov(n)
    integer :: i
    do i = 1, n
      iov(i) = iov(i) + iv(i)
    enddo
  end subroutine sum_real16
  subroutine reduce_sum_real16(iv, iov, n, dt)
    use, intrinsic :: iso_c_binding, only : c_ptr, c_f_pointer
    use mpi_f08
    implicit none
    type(c_ptr), value :: iv, iov
    integer :: n
    type(MPI_Datatype) :: dt
    real*16, pointer :: iv_r(:), iov_r(:)
    if (dt .eq. MPI_REAL16) then
      ! The MPI_User_function interface passes C pointers; convert them
      ! back to Fortran arrays before summing.
      call c_f_pointer(iv, iv_r, [n])
      call c_f_pointer(iov, iov_r, [n])
      call sum_real16(iv_r, iov_r, n)
    endif
  end subroutine reduce_sum_real16
  program test_reduce_sum_real16
    use, intrinsic :: iso_c_binding
    use mpi_f08
    implicit none
    integer, parameter :: n = 10
    real*16 :: output(n)
    real*16 :: input(n)
    real*16 :: error
    integer :: me, np
    procedure(MPI_User_function) :: reduce_sum_real16
    type(MPI_Op) :: mysum
    integer :: i
    call MPI_Init()
    call MPI_Comm_rank(MPI_COMM_WORLD, me)
    call MPI_Comm_size(MPI_COMM_WORLD, np)
    output = 0.0
    ! Each rank contributes 1, so every reduced element should equal np.
    input  = 1.0
    call MPI_Op_create(reduce_sum_real16, .true., mysum)
    call MPI_Allreduce(input, output, n, MPI_REAL16, mysum, MPI_COMM_WORLD)
    error = 0.0
    do i = 1, n
      error = error + abs(output(i) - real(np))
    enddo
    if (error > 0.0) then
      print *, 'SAD PANDA = ', error
      call MPI_Abort(MPI_COMM_SELF, 1)
    endif
    call MPI_Op_free(mysum)
    call MPI_Finalize()
  end program test_reduce_sum_real16

This program returns without error with the Intel 16 Fortran compiler and MPICH 3.2+. Apparently I am not using I/O correctly, though, so my confidence in its correctness is not as high as it would be if I could write all the results to stdout.

Jeff Hammond