I am trying to use MPI_Gather to collect individual two-dimensional arrays on the master process so that it can print the contents of the entire matrix. I split the workload across num_processes processes and have each one work on its own private matrix. Below is a snippet of my code, along with some pseudo-code, to show what I'm doing. Note that I created my own MPI_Datatype because I am transferring struct types.

Point **matrix = malloc(sizeof(Point *) * (num_rows/ num_processes));
for (i = 0; i < num_rows/num_processes; i++)
    matrix [i] = malloc(sizeof(Point) * num_cols);

for( i = 0; i< num_rows/num_processes; i++)
    for (j = 0; j < num_cols; j++)
        matrix[i][j] = *Point type*

if (processor == 0) {
    full_matrix = malloc(sizeof(Point *) * num_rows);
    for (i = 0; i < num_rows/num_processes; i++)
        matrix [i] = malloc(sizeof(Point) * num_cols);

    MPI_Gather(matrix, num_rows/num_processes*num_cols, MPI_POINT_TYPE, full_matrix, num_rows/num_processes*num_cols, MPI_POINT_TYPE, 0, MPI_COMM_WORLD);
} else {
    MPI_Gather(matrix, num_rows/num_processes*num_cols, MPI_POINT_TYPE, NULL, 0, MPI_DATATYPE_NULL, 0, MPI_COMM_WORLD);
}
// ...print full_matrix...

My own testing showed that the double for-loop before the gather computes the correct values, but after the gather full_matrix contains only the data from its own process, i.e. the master process, as the later printing showed.

I'm having trouble figuring out why this happens, given that the master process transfers its data correctly. Is the problem how I allocate memory for each process?

almater

1 Answer

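The problem is that MPI_Gather expects each buffer to be one contiguous block of memory. Your matrix is an array of row pointers, so passing it to MPI_Gather hands MPI the pointer table rather than the Point data, and the rows themselves come from separate malloc calls, each of which can return a pointer to an arbitrary memory position. The same applies to full_matrix on the receiving side.

The solution is to store the matrix in a single contiguous block of memory, like so: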

Point *matrix = malloc(sizeof(Point) * (num_rows / num_processes) * num_cols);

With this layout, element (i, j) is accessed as matrix[i * num_cols + j]. If you want to keep the matrix[i][j] syntax of your current code, you can allocate the contiguous block as before and use a second array to hold a pointer to the start of each row:

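int local_rows = num_rows / num_processes;
Point *matrixAdj = malloc(sizeof(Point) * local_rows * num_cols);
Point **matrix = malloc(sizeof(Point *) * local_rows);

/* Row i starts num_cols elements after row i - 1. */
for (int i = 0; i < local_rows; ++i) {
    matrix[i] = &matrixAdj[i * num_cols];
}
/* Pass matrixAdj (not matrix) as the buffer to MPI_Gather. */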
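For completeness, here is a small sketch that wraps the row-pointer layout in a pair of helpers. `alloc_matrix`, `free_matrix` and the two-double `Point` are hypothetical names chosen for illustration, not part of the original code. The point to notice is that the contiguous block is reachable as matrix[0], so that is the pointer to hand to MPI_Gather, and only two calls to free are needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the question's Point struct; the real fields are not shown. */
typedef struct { double x, y; } Point;

/* Allocate rows * cols Points in one contiguous block plus a table of
   row pointers into it. The block itself is matrix[0].
   Returns NULL if either allocation fails. */
Point **alloc_matrix(int rows, int cols) {
    assert(rows > 0 && cols > 0);

    Point *block = malloc(sizeof(Point) * (size_t)rows * (size_t)cols);
    Point **matrix = malloc(sizeof(Point *) * (size_t)rows);
    if (block == NULL || matrix == NULL) {
        free(block);
        free(matrix);
        return NULL;
    }

    /* Row i starts i * cols elements into the block. */
    for (int i = 0; i < rows; ++i) {
        matrix[i] = &block[(size_t)i * (size_t)cols];
    }
    return matrix;
}

/* Release the data block first (via matrix[0]), then the pointer table. */
void free_matrix(Point **matrix) {
    if (matrix != NULL) {
        free(matrix[0]);
        free(matrix);
    }
}
```

With this, matrix[i][j] and matrix[0][i * num_cols + j] name the same element, so existing double-indexed loops keep working while MPI sees a single buffer.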
  • This has already been explained [here](https://stackoverflow.com/questions/5104847/mpi-bcast-a-dynamic-2d-array). Please, look through the old questions first before posting an answer. – Hristo Iliev Apr 24 '20 at 17:54