
This question is similar to this one, with the complication that the matrices being gathered have different numbers of rows but the same number of columns. (I have also looked at this question and this one but could not figure it out.)

Background

I am performing a calculation for which I do not know the number of rows of the resulting matrix until the end. Serially, I allocate a very large matrix, fill it up, and at the end of the calculation (once I know the final row count) I chop off the unused end of the array, leaving the result I want (a minimal sketch of this serial pattern is shown after the list below). Using MPI, I apply the same logic:

  1. On each process I have one large array that gets filled up
  2. At the end of the calculation I need to chop off each array at its respective limit (the limits differ between processes)
  3. Then I need to gather the resulting arrays from all processes into a single array on the root process, which continues with the rest of the program.
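
For reference, the serial "fill a big array, then chop" pattern looks roughly like this (a minimal sketch; maxRows, work and res are illustrative names, not my actual code):

integer, parameter :: maxRows = 100000, nCols = 5
real, allocatable :: work(:,:), res(:,:)
integer :: nUsed

allocate(work(maxRows,nCols))
nUsed = 0
! ... the calculation fills work(1:nUsed,:), incrementing nUsed ...
allocate(res(nUsed,nCols))
res = work(1:nUsed,:)   ! chop off the unused tail
deallocate(work)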

Attempt so far

In trying to understand how MPI_GATHERV works and how to use it in my case, I edited the code given in the answer to this question so that it accepts arrays of a different size from each process.

program main
use mpi
implicit none
integer :: ierr, myRank, nProcs
integer :: sendsubarray, recvsubarray, resizedrecvsubarray
integer, dimension(2) :: starts,sizes,subsizes
integer, dimension(:), allocatable :: counts, disps
integer, parameter :: nx_glb=20, ny_glb=5, ny=5
integer :: nx
integer, dimension(:), allocatable :: nx_all  
character, dimension(:,:), target, allocatable :: mat, matG
character :: c
integer :: i, j, p
integer(kind=mpi_address_kind) :: start, extent

call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world, myRank, ierr)
call mpi_comm_size(mpi_comm_world, nProcs, ierr)
allocate(nx_all(nProcs))
nx_all = (/5, 4, 5, 5/)
nx = nx_all(myRank+1)  
sizes(1)=nx; sizes(2)=ny
subsizes(1)=nx; subsizes(2)=ny
starts(1)=0; starts(2)=0
call mpi_type_create_subarray(2, sizes, subsizes, starts, mpi_order_fortran, &
                              mpi_character, sendsubarray, ierr)
call mpi_type_commit(sendsubarray,ierr)
allocate(mat(1:nx,1:ny))
mat='.'
forall (i=1:nx,j=1:ny) mat(i,j)=ACHAR(ICHAR('0')+myRank)

if(myRank.eq.0) then
 allocate(matG(nx_glb,ny_glb))
 matG='.'
 sizes(1)=nx_glb; sizes(2)=ny_glb
 subsizes(1)=nx; subsizes(2)=ny
 starts(1)=0; starts(2)=0
 call mpi_type_create_subarray(2, sizes, subsizes, starts, mpi_order_fortran, &
                               mpi_character, recvsubarray, ierr)
 call mpi_type_commit(recvsubarray, ierr)
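 ! shrink the extent of the receive type to one character so that the
 ! displacements given to mpi_gatherv (measured in units of this extent)
 ! can place each incoming block at any element offset within matG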
 extent = sizeof(c)
 start = 0
 call mpi_type_create_resized(recvsubarray, start, extent, resizedrecvsubarray, ierr)
 call mpi_type_commit(resizedrecvsubarray,ierr)
end if

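! one subarray from each rank (this example assumes exactly 4 processes);
! displacements are in units of the resized receive type's extent,
! i.e. single characters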
allocate(counts(4),disps(4))
counts(1:4) = (/1, 1, 1, 1/)
disps(1:4) = (/0, 5, 10, 15/)
call mpi_barrier(mpi_comm_world,ierr)
print *, mat, "process", myRank    
call mpi_gatherv(mat,1,sendsubarray,matG,counts,disps,resizedrecvsubarray, &
                 0,mpi_comm_world,ierr)
do p=0,nProcs-1
  if (myRank == p) then
     print *, 'Local array for rank ', myRank
     do i=1, nx
      print *, (mat(i,j),j=1,ny)
     end do
  endif
enddo
call MPI_Barrier(MPI_COMM_WORLD,ierr)
if(myRank.eq.0) then
 print * , matG, "process", myRank    
 print *, 'Global array: '
 do i=1, nx_glb
  print *, (matG(i,j),j=1, ny_glb)
 end do
end if

call mpi_finalize(ierr)

end program main

Desired result (note how rank 1 has one row less):

Local array for rank 0
00000
00000
00000
00000
00000
Local array for rank 1
11111
11111
11111
11111
Local array for rank 2
22222
22222
22222
22222
22222
Local array for rank 3
33333
33333
33333
33333
33333

Global array: 
00000
00000
00000
00000
00000
11111
11111
11111
11111
22222
22222
22222
22222
22222
33333
33333
33333
33333
33333

Actual result (locally I get the expected behaviour, but in the global matrix rank 1's block still occupies five rows, padded with dots):

Local array for rank 0
00000
00000
00000
00000
00000
Local array for rank 1
11111
11111
11111
11111
Local array for rank 2
22222
22222
22222
22222
22222
Local array for rank 3
33333
33333
33333
33333
33333

Global array: 
00000
00000
00000
00000
00000
1111.
1111.
1111.
1111.
1111.
22222
22222
22222
22222
22222
33333
33333
33333
33333
33333

I understand that in memory matrices are stored as linear arrays (Fortran is column-major), so the global array that I get looks like this:

0000011111222223333300000111112222233333000001111122222333330000011111222223333300000.....2222233333

Question(s)

How do I remove the dots (which represent the missing row from rank 1)? How do I get the result as a matrix with the correct number of rows?

Edit: The reason an extra row appears in the global array is that the recvsubarray created on the root process has dimensions 5x5, even though the sendsubarray from process 1 has dimensions 4x5. The question now is: how do I define a recvsubarray whose dimensions vary depending on the rank it is receiving from?

Mitsarien

2 Answers


You've made your life very hard by defining the global matrix to be larger in the first dimension (nx) rather than the second (ny). Because Fortran stores arrays column-major, making ny the larger dimension is much more natural: it corresponds to having all the sub-matrices stored in sequential order in memory.

If you are happy to swap nx and ny then you do not need any complicated derived types. In fact, I doubt you can do this pattern with Gatherv at all, as that function requires a single receive type while you have a different pattern for each incoming matrix (due to your choice of ordering for nx and ny).
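
For reference (this is the standard Fortran binding of mpi_gatherv, not code from the question), note the single recvtype argument: it is applied to every incoming block at the root, and only the counts and displacements may vary per sender.

call MPI_Gatherv(sendbuf, sendcount, sendtype,           &
                 recvbuf, recvcounts, displs, recvtype,  & ! one recvtype for all senders
                 root, comm, ierr)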

This code, with nx and ny swapped, seems to work OK. The dotted line is at the end - I guess you'll always have some dots, since you allocate more space than the submatrices occupy.

program main
use mpi
implicit none
integer :: ierr, myRank, nProcs
integer, dimension(:), allocatable :: counts, disps
integer, parameter :: ny_glb=20, nx_glb=5, nx=5
integer :: ny
integer, dimension(:), allocatable :: ny_all
character, dimension(:,:), target, allocatable :: mat, matG
integer :: i, j, p

call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world, myRank, ierr)
call mpi_comm_size(mpi_comm_world, nProcs, ierr)
allocate(ny_all(nProcs))
ny_all = (/5, 4, 5, 5/)
ny = ny_all(myRank+1)
allocate(mat(1:nx,1:ny))
mat='.'
forall (i=1:nx,j=1:ny) mat(i,j)=ACHAR(ICHAR('0')+myRank)

if(myRank.eq.0) then
 allocate(matG(nx_glb,ny_glb))
 matG='.'
end if

allocate(counts(4),disps(4))
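! each rank sends one contiguous block of nx*ny_all(rank) characters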
counts(:) = nx*ny_all(:)

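! each block is placed immediately after the previous rank's data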
disps(1)=0
do i = 2, 4
  disps(i) = disps(i-1)+counts(i-1)
end do

call mpi_barrier(mpi_comm_world,ierr)
print *, mat, "process", myRank
call mpi_gatherv(mat,nx*ny,MPI_CHARACTER,matG,counts,disps,MPI_CHARACTER, &
                 0,mpi_comm_world,ierr)
do p=0,nProcs-1
  if (myRank == p) then
     print *, 'Local array for rank ', myRank
     do i=1, nx
      print *, (mat(i,j),j=1,ny)
     end do
  endif
enddo
call MPI_Barrier(MPI_COMM_WORLD,ierr)
if(myRank.eq.0) then
 print * , matG, "process", myRank
 print *, 'Global array: '
 do i=1, nx_glb
  print *, (matG(i,j),j=1, ny_glb)
 end do
end if

call mpi_finalize(ierr)

end program main

Here's some of the output:

 Global array: 
 0000011112222233333.
 0000011112222233333.
 0000011112222233333.
 0000011112222233333.
 0000011112222233333.

00000000000000000000000001111111111111111111122222222222222222222222223333333333333333333333333.....

Hope this is useful.

David Henty
  • good to know that Fortran stores arrays based on the ny dimension. I couldn't upvote your answer because I don't have enough rep. – Mitsarien Jul 11 '16 at 08:40

This is how I solved the problem above (the answer was given to me by a colleague, so all credit to him):

Since one of the dimensions of the final matrix is fixed, and since matrices are stored linearly in memory anyway, it is better to use mpi_gatherv with mpi_type_vector instead of mpi_type_create_subarray. The program structure is as follows: a) determine the size of interest of the submatrix on each rank, b) flatten it into a vector, c) gather the vectors from each rank, and d) reshape the gathered vector into the final matrix. This way there is no need to gather the unwanted information (depicted by the dots in the question above), and thus no need to get rid of it afterwards.

Thus, for gathering subarrays of different lengths but constant widths into a global matrix, the following code does the trick:

program main
  use mpi
  implicit none
  integer :: ierr, myRank, iProcs, nProcs, master
  integer :: ix, iy, ip
  integer :: nx, nxSum, offset, newtype
  integer, parameter :: ny=5
  integer, allocatable:: vec(:), vecG(:), nxAll(:), displs(:), rcounts(:), matG(:,:)

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, myRank, ierr)
  call mpi_comm_size(mpi_comm_world, nProcs, ierr)

  master = 0

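  ! give each rank a different number of rows for this demo and fill its
  ! local block, stored row by row in a 1-D work vector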
  nx = myRank+1
  allocate(vec(nx*ny))
  do ix = 1,nx
     do iy = 1,ny
        ip = (ix-1)*ny + iy
        vec(ip) = myRank
     enddo
  enddo

  call mpi_barrier(mpi_comm_world,ierr)

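  ! gather each rank's row count to the root so it can size the receive
  ! buffer and build rcounts/displs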
  allocate(nxAll(nProcs))
  call mpi_gather(nx, 1, mpi_integer, nxAll, 1, mpi_integer, &
       master, mpi_comm_world, ierr)

  if (myRank == master) then
     ! print *, 'nxAll = ', nxAll, 'sum(nxAll) = ',sum(nxAll)
     nxSum = sum(nxAll)        
     allocate(vecG(nxSum*ny))

  endif

  ! rcounts/displs are only read at the root, but allocate them on every
  ! rank so mpi_gatherv below is never passed an unallocated array
  allocate(displs(nProcs),rcounts(nProcs))

  if (myRank == master) then
     offset = 0
     do iProcs = 1,nProcs
        displs(iProcs) = offset
        rcounts(iProcs) = nxAll(iProcs)*ny
        offset = offset + rcounts(iProcs)
        ! print *,'iProcs',iProcs,'displs = ',displs(iProcs),'rcounts',rcounts(iProcs)
     enddo

  endif

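  ! nx*ny blocks of one integer with unit stride - equivalent to one
  ! contiguous block of nx*ny integers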
  call mpi_type_vector(nx*ny, 1, 1, mpi_integer,newtype,ierr)
  call mpi_type_commit(newtype,ierr)

  call mpi_gatherv(vec,1,newtype,vecG,rcounts,displs,mpi_integer, &
       master,mpi_comm_world,ierr)    


  if (myRank == master) then
     print *, 'Global vector, vecG = ',vecG

     ! Reshape into matrix
     print *, 'Global matrix'
     allocate(matG(nxSum,ny))
     do ix = 1,nxSum
        do iy = 1,ny
           ip = (ix-1)*ny + iy
           matG(ix,iy) = vecG(ip)
        enddo
        print *, (matG(ix,iy),iy=1,ny)
     enddo

  endif

  call mpi_finalize(ierr)

end program main
Mitsarien