
The displs argument of the MPI_Scatterv() function is said to be an "integer array (of length group size). Entry i specifies the displacement (relative to sendbuf) from which to take the outgoing data to process i". Let's say then that I have the sendcounts argument

int sendcounts[7] = {3, 3, 3, 3, 4, 4, 4};

The way I'm reasoning this out is that the displs array should always start with value of 0 since the first entry's displacement is 0 relative to sendbuf, so in my example above, displs should look like:

int displs[7] = {0, 3, 6, 9, 12, 16, 20};

Is that correct? I know this is a trivial question, but for some reason the web does not help at all. There are no good examples out there, hence my question.
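
To make that reasoning concrete, here is a rough sketch (just my own working, using the same sendcounts) of building displs as a running sum of the preceding counts:

int sendcounts[7] = {3, 3, 3, 3, 4, 4, 4};
int displs[7];

displs[0] = 0;                                   // first chunk starts at the beginning of sendbuf
for (int i = 1; i < 7; i++)
    displs[i] = displs[i-1] + sendcounts[i-1];   // each chunk starts right after the previous one
// displs ends up as {0, 3, 6, 9, 12, 16, 20}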

Morteza Jalambadani
bcsta

2 Answers


Yes, the displacements give the root the information as to which items to send to a particular task - the offset of the starting item. So in most simple cases (e.g., where you'd use MPI_Scatter but the counts don't evenly divide) this can be calculated immediately from the counts information:

displs[0] = 0;              // offsets into the global array
for (size_t i=1; i<comsize; i++)
    displs[i] = displs[i-1] + counts[i-1];

But it doesn't need to be that way; the only restriction is that the data you're sending can't overlap. You could count from the back just as well:

displs[0] = globalsize - counts[0];                 
for (size_t i=1; i<comsize; i++)
    displs[i] = displs[i-1] - counts[i];

or any arbitrary order would work as well.

And in general the calculations can be more complicated, because the types of the send and receive buffers have to be consistent but not necessarily the same - you often get this if you're sending slices of multidimensional arrays, for instance.
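
For instance, here is a minimal sketch of that situation (my own illustration, not part of the example further below; ncols, rowtype and the other names are made up): the root describes its data with a derived row datatype and scatters one row per rank, while every rank receives the same data as plain MPI_INTs. Note that the displacements are counted in extents of the send type, so displs[i] = i here means "start at row i":

#include <iostream>
#include <vector>
#include "mpi.h"

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int ncols = 4;                          // assumed row length
    std::vector<int> matrix;                      // only root fills the size x ncols matrix
    if (rank == 0) {
        matrix.resize(size * ncols);
        for (int i = 0; i < size * ncols; i++)
            matrix[i] = i;
    }

    MPI_Datatype rowtype;                         // one row = ncols contiguous ints
    MPI_Type_contiguous(ncols, MPI_INT, &rowtype);
    MPI_Type_commit(&rowtype);

    std::vector<int> counts(size, 1);             // every rank gets one rowtype...
    std::vector<int> displs(size);
    for (int i = 0; i < size; i++)
        displs[i] = i;                            // ...starting i row-extents into the matrix

    std::vector<int> row(ncols);
    MPI_Scatterv(matrix.data(), counts.data(), displs.data(), rowtype, // root sends whole rows
                 row.data(), ncols, MPI_INT,                           // everyone receives plain ints
                 0, MPI_COMM_WORLD);

    std::cout << "rank " << rank << " got " << row.front() << ".." << row.back() << std::endl;

    MPI_Type_free(&rowtype);
    MPI_Finalize();
}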

As an example of the simple cases, the code below does the forward and backward cases:

#include <iostream>
#include <vector>
#include "mpi.h"

int main(int argc, char **argv) {
    const int root = 0;             // the processor with the initial global data

    size_t globalsize;
    std::vector<char> global;       // only root has this

    const size_t localsize = 2;     // most ranks will have 2 items; one will have localsize+1
    char local[localsize+2];        // everyone has this
    int  mynum;                     // how many items 

    MPI_Init(&argc, &argv); 

    int comrank, comsize;
    MPI_Comm_rank(MPI_COMM_WORLD, &comrank);
    MPI_Comm_size(MPI_COMM_WORLD, &comsize);

    // initialize global vector
    if (comrank == root) {
        globalsize = comsize*localsize + 1;
        for (size_t i=0; i<globalsize; i++) 
            global.push_back('a'+i);
    }

    // initialize local
    for (size_t i=0; i<localsize+1; i++) 
        local[i] = '-';
    local[localsize+1] = '\0';

    int counts[comsize];        // how many pieces of data everyone has
    for (size_t i=0; i<comsize; i++)
        counts[i] = localsize;
    counts[comsize-1]++;

    mynum = counts[comrank];
    int displs[comsize];

    if (comrank == 0) 
        std::cout << "In forward order" << std::endl;

    displs[0] = 0;              // offsets into the global array
    for (size_t i=1; i<comsize; i++)
        displs[i] = displs[i-1] + counts[i-1];

    MPI_Scatterv(global.data(), counts, displs, MPI_CHAR, // For root: proc i gets counts[i] MPI_CHARs from displs[i]
                 local, mynum, MPI_CHAR,                  // I'm receiving mynum MPI_CHARs into local
                 root, MPI_COMM_WORLD);                   // Task (root, MPI_COMM_WORLD) is the root

    local[mynum] = '\0';
    std::cout << comrank << " " << local << std::endl;

    std::cout.flush();
    if (comrank == 0) 
        std::cout << "In reverse order" << std::endl;

    // globalsize is set only on root, but that's fine - MPI_Scatterv only reads displs
    // (and the other send-side arguments) on the root rank
    displs[0] = globalsize - counts[0];
    for (size_t i=1; i<comsize; i++)
        displs[i] = displs[i-1] - counts[i];

    MPI_Scatterv(global.data(), counts, displs, MPI_CHAR, // For root: proc i gets counts[i] MPI_CHARs from displs[i]
                 local, mynum, MPI_CHAR,                  // I'm receiving mynum MPI_CHARs into local
                 root, MPI_COMM_WORLD);                   // Task (root, MPI_COMM_WORLD) is the root

    local[mynum] = '\0';
    std::cout << comrank << " " << local << std::endl;

    MPI_Finalize();
}

Running with 4 processes gives:

In forward order
0 ab
1 cd
2 ef
3 ghi

In reverse order
0 hi
1 fg
2 de
3 abc
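
(For reference, the output above is from a 4-rank run; with a typical MPI installation that corresponds to something like the following, though the compiler wrapper, launcher, and file name here are only placeholders:)

mpicxx scatterv_example.cpp -o scatterv_example
mpiexec -n 4 ./scatterv_example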
Jonathan Dursi
  • I am noticing that you are inputting values into `global` only on the root rank. After doing that, won't you need to broadcast `global` to all ranks? I am thinking this way because if `global` is then being used in `Scatterv`, don't all ranks need to know the data inside `global`? – bcsta Dec 24 '16 at 22:43
  • 1
    The proper term for the matching datatypes is *congruent*. Consistent sounds strange to me, but then, I'm not a native English speaker. – Hristo Iliev Dec 25 '16 at 22:54
  • 1
    @user7331538, the first four arguments of `MPI_Scatterv` are only used by the root rank. The rest of the ranks receive the chunks in their receive buffers and they need not have access to the data in `global` (it comes from the root). – Hristo Iliev Dec 25 '16 at 22:56
  • Oh yes, of course, that makes sense. Thanks a lot! I posted a related question about a scatter problem a few days ago and I would really like some help and review on it. If you have the time I would appreciate it if you could take a look @HristoIliev. Here it is: http://stackoverflow.com/questions/41317672/data-changes-after-passing-from-mpi-scatter – bcsta Dec 26 '16 at 08:47

Yes, your reasoning is correct - for contiguous data. The point of the displacements parameter in MPI_Scatterv is to also allow strided data, meaning that there are unused gaps of memory in the sendbuf between the chunks.

Here is an example for contiguous data. The official documentation actually contains good examples for strided data.
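
As a rough illustration of the strided case mentioned above (my own sketch, not taken from the documentation; chunk and stride are illustrative names), each rank receives chunk elements, but consecutive chunks start stride elements apart, so the stride - chunk elements in between are simply never sent:

#include <iostream>
#include <vector>
#include "mpi.h"

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk  = 2;            // elements actually sent to each rank
    const int stride = 3;            // distance between chunk starts in sendbuf

    std::vector<int> sendbuf;        // only root needs the full buffer
    std::vector<int> counts(size, chunk), displs(size);
    for (int i = 0; i < size; i++)
        displs[i] = i * stride;      // leaves stride - chunk unused elements between chunks

    if (rank == 0) {
        sendbuf.resize(size * stride);
        for (int i = 0; i < (int)sendbuf.size(); i++)
            sendbuf[i] = i;
    }

    int recv[chunk];
    MPI_Scatterv(sendbuf.data(), counts.data(), displs.data(), MPI_INT,
                 recv, chunk, MPI_INT, 0, MPI_COMM_WORLD);

    std::cout << "rank " << rank << " got " << recv[0] << " " << recv[1] << std::endl;

    MPI_Finalize();
}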

Zulan