
Consider the following simple Fortran program:

program test_vec_allocation
    use mpi
    implicit none
    integer(kind=8)             :: N
    ! =========================BLACS and MPI=======================
    integer                     :: ierr, size, rank,dims(2)
    ! -------------------------------------------------------------
    integer, parameter          :: block_size = 100
    integer                     :: context, nprow, npcol, local_nprow, local_npcol
    integer                     :: numroc, indxl2g, descmat(9),descvec(9)
    integer                     :: mloc_mat ,nloc_mat ,mloc_vec ,nloc_vec

    call blacs_pinfo(rank,size)
    dims=0
    call MPI_Dims_create(size, 2, dims, ierr)
    nprow = dims(1);npcol = dims(2)
    call blacs_get(0,0,context)
    call blacs_gridinit(context, 'R', nprow, npcol)
    call blacs_gridinfo(context, nprow, npcol, local_nprow,local_npcol)

    N = 700

    mloc_vec = numroc(N,block_size,local_nprow,0, nprow)
    nloc_vec = numroc(1,block_size,local_npcol,0, npcol)
    print *,"Rank", rank, mloc_vec, nloc_vec

    call blacs_gridexit(context)
    call blacs_exit(0)

end program test_vec_allocation

When I run it with 11 MPI ranks I get:

 Rank           0         100           1
 Rank           4         100           1
 Rank           2         100           1
 Rank           1         100           1
 Rank           3         100           1
 Rank          10           0           1
 Rank           6         100           1
 Rank           5         100           1
 Rank           9           0           1
 Rank           8           0           1
 Rank           7           0           1

which is how I would expect ScaLAPACK to divide this array. However, for an even number of ranks I get:

 Rank           0         200           1
 Rank           8         200           0
 Rank           9         100           1
 Rank          10         100           0
 Rank           1         200           0
 Rank           6         200           1
 Rank          11         100           0
 Rank           3         200           1
 Rank           4         200           0
 Rank           2         200           0
 Rank           7         200           0
 Rank           5         200           0

which makes no sense. Why would rank 0 get 200 elements when the block size is 100 and ranks * block_size > N? Because of this my program works for 1, 2, 3, 5, 7 and 11 MPI ranks, but fails for 4, 6, 8, 9, 10, 12, etc. (I don't know why it fails for 9 ranks!). Can anyone explain what is wrong with my approach?

GFortran version: 6.1.0

ScaLAPACK version: 2.1.0

MacOS version: 10.11

ipcamit

1 Answer


There are a number of things wrong with your code

1) Firstly, don't use Integer( 8 ). As Vladimir put it, please unlearn this. Not only is it not portable and therefore very bad practice (please see the many examples here, e.g. Fortran 90 kind parameter), here it is actually wrong, as numroc expects an integer of default kind as its first argument (see e.g. https://software.intel.com/content/www/us/en/develop/documentation/mkl-developer-reference-fortran/top/scalapack-routines/scalapack-utility-functions-and-routines/numroc.html)
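
A minimal sketch of the change, keeping the rest of your declarations exactly as they are (N = 700 fits comfortably in a default integer anyway):

    integer :: N    ! default kind, which is what numroc expects for its first argument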

2) You call an MPI routine before you call MPI_Init. With a handful of exceptions (and this isn't one of them) this results in undefined behaviour. Note that the description at https://www.netlib.org/blacs/BLACS/QRef.html#BLACS_PINFO makes no reference to it actually calling MPI_Init for you. As such I also prefer to call MPI_Finalize explicitly myself.
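
A minimal sketch of the ordering I mean; passing a non-zero argument to blacs_exit tells BLACS not to shut MPI down itself, so the explicit MPI_Finalize is then legal:

    call MPI_Init( ierr )
    call blacs_pinfo( rank, size )
    ! ... set up the grid and do the actual work ...
    call blacs_gridexit( context )
    call blacs_exit( 1 )        ! non-zero: BLACS does not call MPI_Finalize itself
    call MPI_Finalize( ierr )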

3) You have misunderstood MPI_Dims_create. You seem to assume you will get a one-dimensional distribution, but you have actually asked it for a two-dimensional one. Quoting from the standard at https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report.pdf:

The entries in the array dims are set to describe a Cartesian grid with ndims dimensions and a total of nnodes nodes. The dimensions are set to be as close to each other as possible, using an appropriate divisibility algorithm. The caller may further constrain the operation of this routine by specifying elements of array dims. If dims[i] is set to a positive number, the routine will not modify the number of nodes in dimension i; only those entries where dims[i] = 0 are modified by the call.

You set dims equal to zero, so the routine is free to set both dimensions. Thus for 11 processes you will get a 1x11 or 11x1 grid, which is what you seem to expect. However for 12 processes, as "the dimensions are set to be as close to each other as possible", you will get either a 3x4 or a 4x3 grid, NOT 12x1. In your run it was 4x3 (4 process rows, 3 process columns): the 700 elements are split down the 4 process rows as 200, 200, 200 and 100 (i.e. 2, 2, 2 and 1 blocks of 100), and since each process row holds 3 processes you expect 3x3=9 processes to report 200 elements and 1x3=3 to report 100. This is exactly what you see. Also try 15 processes - you will see an odd number of processes that, according to you, "does not work"; this is because (advanced maths alert) 15=3x5. Incidentally on my machine 9 processes does NOT return a 3x3 grid - this looks like a bug in openmpi to me.
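
If the one-dimensional distribution really is what you want, the quoted text also shows how to get it: a positive entry in dims is left untouched, so you can fix the second dimension to 1 before the call. A minimal sketch:

    dims(1) = 0      ! let MPI_Dims_create choose this one
    dims(2) = 1      ! fixed, so the routine may not change it
    call MPI_Dims_create(size, 2, dims, ierr)
    nprow = dims(1)  ! = size
    npcol = dims(2)  ! = 1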

Ian Bush
  • Thank you for your insight. I use MPI_Dims_create to get nprow and npcol for blacs_gridinit. Can you point to a source where I can achieve the same functionality in the correct way? Also this paves the way for a second question for me: I want to do a matrix-vector multiplication followed by calculating the norm and deciding on it; the problem is that those nodes with no vector elements assigned have a norm of 0.0000 and hence fail the test. Shall I modify the current question or ask another? Would you be kind enough to provide an opinion on it too? – ipcamit May 12 '20 at 10:11
  • `call pdnrm2(N,norm,tmp_vec,1,1,descvec,1); if (norm < tol) exit; call pdgemv('N', N, N, 1.0, A, 1, 1, descmat, s, 1, 1, descvec, 1, 0.0,t, 1, 1, descvec, 1); ` Basically, the above call to pdnrm2 assigns 0.000 to the non-participating processes in the other rows. However, in the case of 11 ranks etc., the norm is assigned correctly even to the non-participating nodes – ipcamit May 12 '20 at 10:13
  • If you have a separate question please put it in a separate question – Ian Bush May 12 '20 at 10:26
  • Why do you need to call mpi_dims_create if you want a 1*nproc grid? All mpi_dims_create does is find appropriate factors for the number of processes, and if you know you have 2 factors, and you know you want one of them to be 1, isn't the other easy to work out? – Ian Bush May 12 '20 at 10:28
  • Ok. I have raised a separate question detailing my problem here: https://stackoverflow.com/questions/61749736/efficient-way-to-find-norm-of-distributed-vector-in-scalapack – ipcamit May 12 '20 at 10:47
  • I am not trying to allocate a 1-dim proc array, I just wanted to know the best way to arrange rows and columns of procs before calling the gridinit function. Essentially I would like to get the nprow and npcol of the ScaLAPACK grid before the grid is initialized so I can take a few decisions based on it. – ipcamit May 12 '20 at 10:50
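
Following Ian Bush's comment above about not needing MPI_Dims_create at all for a one-dimensional grid, a minimal sketch that keeps the row-major ordering and the rest of the set up from the original program:

    call blacs_pinfo( rank, size )
    nprow = size     ! one process column, so the vector is spread down the process rows
    npcol = 1
    call blacs_get( 0, 0, context )
    call blacs_gridinit( context, 'R', nprow, npcol )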