
I'm having trouble sending and receiving columns of a 2-d array.

I have 2 processes. The first process has a 2-d array and I want to send parts of it to the second process. So say each rank has a 9x9 array, I'd like rank 0 to send to rank 1 just certain columns:

Example:

-1--2--3-
-2--3--4-
-5--6--7- ...

I want to send "1,2,5,..." and "3,4,7,...".

I've written code to just send the first column, and I've read through this answer and I believe I've correctly defined an MPI_Type_vector for the column:

MPI_Type_vector(dime,1,dime-1,MPI_INT,&LEFT_SIDE);

Where dime here, 9, is the size of the array; I'm sending 9 blocks of 1 MPI_INT, each separated by a stride of 8 - but even just sending this one column is giving me invalid results.

My code follows:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define dime 9

int main (int argc, char *argv[])
{
    int size,rank;
    const int ltag=2;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);       // Get the number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);       // Get the rank of the process

    int table[dime][dime];
    for (int i=0; i<dime; i++)
        for (int j=0; j<dime; j++)
            table[i][j] = rank;

    int message[dime];

    MPI_Datatype LEFT_SIDE;
    MPI_Type_vector(dime,1,dime-1,MPI_INT,&LEFT_SIDE);
    MPI_Type_commit(&LEFT_SIDE);

    if(rank==0) {
        MPI_Send(table, 1, LEFT_SIDE, 1, ltag, MPI_COMM_WORLD);
    } else if(rank==1){
        MPI_Status status;
        MPI_Recv(message, 1, LEFT_SIDE, 0, ltag, MPI_COMM_WORLD, &status);
    }

    if(rank == 1 ){
        printf("Rank 1's received data: ");

        for(int i=0;i<dime;i++)
            printf("%6d ",*(message+i));

        printf("\n");
    }

    MPI_Finalize();
    return 0;

}

But when I run it and look at the data I'm receiving, I get either all zeros or gibberish:

$ mpicc -o datatype datatype.c -Wall -g -O3 -std=c99 
$ mpirun -np 2 datatype
Rank 1's received data:      0  32710 64550200      0 1828366128  32765 11780096      0      0 

Where the numbers change each time. What am I doing wrong?

alfC
GomuGomuNoRocket
  • [This answer](http://stackoverflow.com/questions/10788180/sending-columns-of-a-matrix-using-mpi-scatter) talks at some length about selecting columns from a 2d matrix. – Jonathan Dursi Aug 13 '15 at 13:15
  • Yeah, I have seen it, but I can't solve the problem. My code changes the numbers that I don't want to keep to zero. – GomuGomuNoRocket Aug 13 '15 at 13:32
  • There's a lot of irrelevant stuff in the original post - cartesian topologies, non-blocking send/recvs, random number generation. I've tried to strip this problem to its essentials so that an answer is meaningful. Note that trimming a problem down to its essential elements is crucial to finding a solution. – Jonathan Dursi Aug 13 '15 at 16:51

2 Answers


@Mort's answer is correct and was first; I just want to expand on it with some ASCII-art diagrams to try to drive home his messages.

An MPI Datatype describes how data is laid out in memory. Let's take a look at your 2d array for a smaller dime (say 4) and the corresponding MPI_Type_vector:

 MPI_Type_vector(count=dime, blocksize=1, stride=dime-1, type=MPI_INT ...
                      = 4             =1        = 3

 data = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14,15 };
 Vector:  X  -  -  X  -  -  X  -  -  X  -  -

Note that the stride in an MPI type is the distance between the starts of successive blocks, not the size of the gap between them; so you actually want stride=dime here, not dime-1. That's easily fixed, but isn't the actual problem:

 MPI_Type_vector(count=dime, blocksize=1, stride=dime, type=MPI_INT ...
                      = 4             =1        = 4

 data = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14,15 };
 Vector:  X  -  -  -  X  -  -  -  X  -  -  -  X  -  -  -
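
To make the stride semantics concrete, here is a minimal plain-C sketch (no MPI calls) that lists the element offsets a vector layout with a given count, block length, and stride would touch. The helper name vector_offsets is hypothetical and only mimics the layout rule described above; it is not how an MPI library implements MPI_Type_vector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MPI): write the element offsets that a
 * vector layout with `count` blocks of `blocklen` elements, whose block
 * starts are `stride` elements apart, would touch. Returns the number of
 * offsets written; `out` must have room for count*blocklen entries. */
static size_t vector_offsets(int count, int blocklen, int stride, int *out)
{
    assert(count >= 0 && blocklen >= 0);
    size_t n = 0;
    for (int b = 0; b < count; b++)          /* one block per repetition */
        for (int k = 0; k < blocklen; k++)   /* elements inside the block */
            out[n++] = b * stride + k;
    return n;
}
```

For dime = 4, vector_offsets(4, 1, 3, out) fills out with 0, 3, 6, 9, which is not a column, while vector_offsets(4, 1, 4, out) gives 0, 4, 8, 12, the first column of a 4x4 row-major array.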

Ok, so far so good, we're selecting out the correct elements. But we're not receiving them properly; the code is trying to receive the data into an array of size dime, using the same layout:

int message[dime];
MPI_Recv(message, 1, LEFT_SIDE, 0, ...

message = { 0, 1, 2, 3 };
Vector:     X  -  -  -  X  -  -  -  X  -  -  -  X  -  -  -

The vector goes well outside the range of message, which (a) leaves uninitialized data in message, which is the source of the gibberish, and (b) possibly causes segmentation errors for going outside the bounds of the array.
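
The overrun can be checked with a little arithmetic. The sketch below, again plain C with hypothetical helper names (vector_span and layout_fits are not MPI functions), computes how many elements lie between the first and last element a strided layout touches and compares that with the length of the receive buffer.

```c
#include <assert.h>

/* Hypothetical helper (not part of MPI): number of elements from the first
 * to the last element touched by a strided layout, inclusive. A buffer
 * must be at least this long to hold the layout. */
static int vector_span(int count, int blocklen, int stride)
{
    assert(count >= 1 && blocklen >= 1 && stride >= blocklen);
    return (count - 1) * stride + blocklen;
}

/* Returns 1 if a buffer of buf_len elements can hold the layout, else 0. */
static int layout_fits(int buf_len, int count, int blocklen, int stride)
{
    return vector_span(count, blocklen, stride) <= buf_len;
}
```

With dime = 9, the layout spans (9-1)*9 + 1 = 73 elements: it fits inside the 81-element table but not inside the 9-element message array, so layout_fits(9, 9, 1, 9) is 0 while layout_fits(81, 9, 1, 9) is 1.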

Crucially, one of these MPI_Type_vectors describes the layout of the desired data in the 2d matrix, but does not describe the layout of the same data as it's received into a compact 1d array.

There are two choices here. Either receive the data into the message array simply as dime x MPI_INT:

// ....
} else if(rank==1){
    MPI_Status status;
    MPI_Recv(message, dime, MPI_INT, 0, ltag, MPI_COMM_WORLD, &status);
}

//...

$ mpirun -np 2 datatype
Rank 1's received data:      0      0      0      0      0      0      0      0      0 
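
Conceptually, sending with the strided type and receiving dime plain MPI_INTs behaves like copying one column into a compact array. The following is a rough plain-C model of that effect, assuming a row-major square table; pack_column is a hypothetical name used for illustration and is not an MPI call.

```c
#include <assert.h>

/* Hypothetical helper (not part of MPI): copy column `col` of a row-major
 * dime x dime table (passed as a pointer to its first element) into the
 * compact array `message`, which must hold `dime` ints. */
static void pack_column(const int *table, int dime, int col, int *message)
{
    assert(dime > 0 && col >= 0 && col < dime);
    for (int i = 0; i < dime; i++)
        message[i] = table[i * dime + col];   /* stride of dime between rows */
}
```

For the 3x3 corner shown in the question (rows 1 2 3, 2 3 4, 5 6 7), pack_column(&t[0][0], 3, 0, m) yields 1, 2, 5 and pack_column(&t[0][0], 3, 2, m) yields 3, 4, 7.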

Or directly receive the data right into the 2d matrix on Rank 1, overwriting the appropriate columns:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define dime 9

int main (int argc, char *argv[])
{
    int size,rank;
    const int ltag=2;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);       // Get the number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);       // Get the rank of the process

    int table[dime][dime];
    for (int i=0; i<dime; i++)
        for (int j=0; j<dime; j++)
            table[i][j] = rank;

    MPI_Datatype LEFT_SIDE;
    MPI_Type_vector(dime,1,dime,MPI_INT,&LEFT_SIDE);
    MPI_Type_commit(&LEFT_SIDE);

    if(rank==0) {
        MPI_Send(table, 1, LEFT_SIDE, 1, ltag, MPI_COMM_WORLD);
    } else if(rank==1){
        MPI_Status status;
        MPI_Recv(table, 1, LEFT_SIDE, 0, ltag, MPI_COMM_WORLD, &status);
    }

    if(rank == 1 ){
        printf("Rank 1's new array:\n");

        for(int i=0;i<dime;i++) {
            for(int j=0;j<dime;j++) 
                printf("%6d ",table[i][j]);
            printf("\n");
        }

        printf("\n");
    }

    MPI_Type_free(&LEFT_SIDE);
    MPI_Finalize();
    return 0;

}

Running gives

$ mpicc -o datatype datatype.c -Wall -g -O3 -std=c99 
$ mpirun -np 2 datatype
Rank 1's new array:
     0      1      1      1      1      1      1      1      1 
     0      1      1      1      1      1      1      1      1 
     0      1      1      1      1      1      1      1      1 
     0      1      1      1      1      1      1      1      1 
     0      1      1      1      1      1      1      1      1 
     0      1      1      1      1      1      1      1      1 
     0      1      1      1      1      1      1      1      1 
     0      1      1      1      1      1      1      1      1 
     0      1      1      1      1      1      1      1      1 

(after correcting the MPI_Type_vector)
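
The effect of receiving with the strided type straight into the matrix can be modelled in plain C as the reverse copy: contiguous incoming values are written into one column and everything else is left untouched. This is only a sketch of the resulting memory layout under the assumption of a row-major square table; unpack_column is a hypothetical helper, not an MPI function.

```c
#include <assert.h>

/* Hypothetical helper (not part of MPI): write `dime` contiguous values
 * from `message` into column `col` of a row-major dime x dime table,
 * leaving all other entries unchanged. */
static void unpack_column(const int *message, int dime, int col, int *table)
{
    assert(dime > 0 && col >= 0 && col < dime);
    for (int i = 0; i < dime; i++)
        table[i * dime + col] = message[i];
}
```

Starting from a table filled with 1s and a message of 0s, unpack_column(message, dime, 0, &table[0][0]) reproduces the pattern printed above: a first column of 0s with 1s everywhere else.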

The remaining bit about how to extend this to multiple columns is probably best left to another question.

Jonathan Dursi
  • Thanks for being so fair, Jonathan. – mort Aug 13 '15 at 17:49
  • @mort - Hey, you got it right first, I just wanted to add more words. Thanks for being too polite to point out that my updated version of the code still lacked the MPI_Type_free(). – Jonathan Dursi Aug 13 '15 at 18:30

I'm not quite sure what exactly your problem is (please make this clear in your questions, you'll get much better answers! See also How do I ask good questions.), but your code has several issues.

  • You need to use MPI_Type_vector(dime,1,dime,MPI_INT,&LEFT_SIDE); since you are sending every dime-th element of the matrix. In C, a 2-d array is simply stored as a standard array, with element [i][j] being stored at index [i*dime+j]. You want to send the elements at indices 0, dime, 2*dime, 3*dime,...

  • If you use your LEFT_SIDE datatype to receive the data, MPI will store your data items with a stride of dime elements - analogously to the sender. However, your receive buffer message is a simple array. You need to receive the data like this: MPI_Recv(message, dime, MPI_INT, 0, LTAG, newcomm,&status);. This operation will receive dime integers and put them into your message array.
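
The row-major indexing rule from the first point can be written down as a one-line helper. This is a small sketch in plain C; flat_index is a hypothetical name chosen for illustration.

```c
#include <assert.h>

/* Hypothetical helper: position of element [i][j] in the flat, row-major
 * storage of a 2-d array with `ncols` columns. */
static int flat_index(int i, int j, int ncols)
{
    assert(i >= 0 && j >= 0 && j < ncols);
    return i * ncols + j;
}
```

With ncols = dime = 9, the first column sits at flat_index(0, 0, 9) = 0, flat_index(1, 0, 9) = 9, flat_index(2, 0, 9) = 18 and so on, which is why the stride has to be dime.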

Edit: I updated my answer to match the significantly changed question.

Community
mort
  • The problem is that the message table contains 1,0,3 and not 1,2,5; it replaces the number 2 with 0 in the first line. – GomuGomuNoRocket Aug 13 '15 at 12:23
  • And this is after you applied my suggestions? Can you update your question with a full example that highlights your issue? Also, please upvote my answer if it helped you and accept it if it solved your problem. – mort Aug 13 '15 at 12:27
  • I updated my post. If you look, my message has only the first line of the table and replaces numbers with 0. – GomuGomuNoRocket Aug 13 '15 at 13:05