
I want to extend this example by Jonathan Dursi to unequal 2D arrays using MPI_Scatterv and MPI_Gatherv. Basically, I have 4 processes, each holding a local array of a different size:

Process = 0

|00000| 
|00000| 
|00000|

Process = 1

|1111| 
|1111| 
|1111|

Process = 2

|22222|
|22222|

Process = 3

|33333| 
|33333|

I gathered them to the master process, and I expected to get:

Master = 0

|000001111|
|000001111|
|000001111|
|222223333|
|222223333|

However, below is what I got from my code.

Master process

|000001111|
|100000111|
|110000011|
|222223333|
|322222333|

I think there is something wrong with my MPI_Type_vector. Any suggestions on how to fix this?

Below is my code:

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;        // rank of current process and no. of processes
    int domain_x, domain_y;
    int global_x, global_y;
    int topx = 2, topy = 2;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
    {
        domain_x = 5;
        domain_y = 3;
    }

    if (rank == 1)
    {
        domain_x = 4;
        domain_y = 3;
    }

    if (rank == 2)
    {
        domain_x = 5;
        domain_y = 2;
    }

    if (rank == 3)
    {
        domain_x = 4;
        domain_y = 2;
    }

    global_x = 9;
    global_y = 5;

    int *glob_data = new int[global_x*global_y];

    // Initialize local data   
    int *local_data = new int[domain_x*domain_y]; 
    for (int i = 0; i < domain_x; ++i)
    {
        for (int j = 0; j < domain_y; ++j)
        {
            local_data[j*domain_x+i] = rank;
        }
    }

    for (int p=0; p<size; p++) {
        if (p == rank) {
            printf("rank = %d\n", rank);
            for (int j=0; j<domain_y; j++) {
                for (int i=0; i<domain_x; i++) {
                    printf("%3d ",(int)local_data[j*domain_x+i]);
                }
                printf("\n");
            }
            printf("\n");
        }
        MPI_Barrier(MPI_COMM_WORLD);
    }


    // Datatype intended to describe one process's block within the global array
    MPI_Datatype blocktype, blocktype2;
    MPI_Type_vector(domain_y, domain_x, topx*domain_x, MPI_INT, &blocktype2);
    MPI_Type_create_resized(blocktype2, 0, sizeof(int), &blocktype);
    MPI_Type_commit(&blocktype);

    // One block per process; displacements are in units of the resized extent (one int)
    int *displs = new int[topx*topy];
    int *counts = new int[topx*topy];
    for (int j=0; j<topy; j++) {
        for (int i=0; i<topx; i++) {
            displs[j*topx+i] = j*global_x*domain_y + i*domain_x;
            counts [j*topx+i] = 1;
        }
    }

    MPI_Gatherv(local_data, domain_x*domain_y, MPI_INT, glob_data, counts, displs, blocktype, 0, MPI_COMM_WORLD);

    if (rank == 0)
    {
        printf("Master process = %d\n", rank);
        for (int j=0; j<global_y; j++) {
            for (int i=0; i<global_x; i++) {
                printf("%d  ", glob_data[j*global_x+i]);
            }
            printf("\n");
        }
    }

    MPI_Type_free(&blocktype);
    MPI_Type_free(&blocktype2);
    MPI_Finalize();

    return 0;
}
  • Note: this Q was linked from the very answer you referred to in your post. While this is technically a duplicate, a few more remarks about your code: 1) Your third argument to `MPI_Type_vector` is wrong; it should be `global_x`. 2) The *signatures* of the types in the collective must match. Only the displacements may differ. 3) The MPI correctness checking tool [MUST](https://doc.itc.rwth-aachen.de/display/CCP/Project+MUST) can help you debug such issues. It will give you a helpful and specific error message in your case. – Zulan Jun 07 '16 at 07:22
  • The blocks you receive overlap because the receiver uses the same datatype for all four blocks. MUST could really help in detecting such cases: it issues errors such as _"Two collective calls use (datatype,count) pairs that span type signatures of different length!"_ and produces graphs such as https://i.imgur.com/YpZ9D3s.png (both are from your program) – Hristo Iliev Jun 07 '16 at 12:09
  • @Zulan: I agree that the third argument in `MPI_Type_vector` should be `global_x`, but it doesn't solve the problem. Hristo is right: the same `blocktype` from rank 0 was used for every block, and that caused the problem. MUST will be helpful in this case. Thank you. – PLe Jun 07 '16 at 15:23
  • @Hristo: Nice graph. I will check it out to find a solution. – PLe Jun 07 '16 at 15:29
  • @PLe, did you look at the duplicate? MUST shows you the problem, but the solution is complicated... – Zulan Jun 07 '16 at 16:10
  • @Zulan: I looked at it; it is very similar to my problem. Yep, the solution is much more complicated than in the uniform case. I will play with it to see if I can simplify it a little bit. Thanks. – PLe Jun 08 '16 at 03:36
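
Based on the diagnosis in the comments above (the receive type's signature must match each sender's contribution, but MPI_Gatherv only allows a single receive type for all blocks), one possible workaround is to skip the collective: each rank sends its contiguous block, and the root posts one non-blocking receive per rank with a per-rank MPI_Type_create_subarray that places that block in the global array. The sketch below is only an illustration for the fixed 4-process layout from the question; the helper arrays dom_x, dom_y, off_x and off_y are hypothetical and not part of the original code, and the linked duplicate discusses other approaches.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (size != 4) MPI_Abort(MPI_COMM_WORLD, 1);   // sketch assumes exactly 4 processes

    // Block sizes and offsets in the 9x5 global array, hard-coded to the
    // layout from the question (hypothetical helper arrays, not from the original code)
    const int global_x = 9, global_y = 5;
    const int dom_x[4] = {5, 4, 5, 4};   // block widths
    const int dom_y[4] = {3, 3, 2, 2};   // block heights
    const int off_x[4] = {0, 5, 0, 5};   // column offset of each block
    const int off_y[4] = {0, 0, 3, 3};   // row offset of each block

    int domain_x = dom_x[rank], domain_y = dom_y[rank];

    // Fill the local block with the rank number (row-major)
    int *local_data = new int[domain_x*domain_y];
    for (int j = 0; j < domain_y; j++)
        for (int i = 0; i < domain_x; i++)
            local_data[j*domain_x+i] = rank;

    int *glob_data = NULL;
    if (rank == 0) glob_data = new int[global_x*global_y];

    // Every rank (including the root) sends its contiguous block
    MPI_Request sreq;
    MPI_Isend(local_data, domain_x*domain_y, MPI_INT, 0, 0, MPI_COMM_WORLD, &sreq);

    if (rank == 0) {
        // The root describes each incoming block with its own subarray type,
        // so the (count, datatype) signature matches that sender's contribution
        MPI_Request rreq[4];
        MPI_Datatype blk[4];
        for (int p = 0; p < 4; p++) {
            int sizes[2]    = {global_y, global_x};   // full array: rows, cols
            int subsizes[2] = {dom_y[p], dom_x[p]};   // this block's extent
            int starts[2]   = {off_y[p], off_x[p]};   // where the block starts
            MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                     MPI_ORDER_C, MPI_INT, &blk[p]);
            MPI_Type_commit(&blk[p]);
            MPI_Irecv(glob_data, 1, blk[p], p, 0, MPI_COMM_WORLD, &rreq[p]);
        }
        MPI_Waitall(4, rreq, MPI_STATUSES_IGNORE);
        for (int p = 0; p < 4; p++) MPI_Type_free(&blk[p]);

        printf("Master process\n");
        for (int j = 0; j < global_y; j++) {
            for (int i = 0; i < global_x; i++) printf("%d", glob_data[j*global_x+i]);
            printf("\n");
        }
        delete[] glob_data;
    }

    MPI_Wait(&sreq, MPI_STATUS_IGNORE);
    delete[] local_data;
    MPI_Finalize();
    return 0;
}

Because every block gets its own receive type, the (count, datatype) signatures match pairwise; the trade-off is that this uses point-to-point messages instead of a single MPI_Gatherv call.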
