
I am trying to split up a 2-dimensional array's rows between "n" processes using MPI_Scatterv. The two arguments that are stopping me are "send_counts" and "displacements". I know the textbook definition of what these arrays do, but I need a way to build them dynamically so they work for a 2D array of any size, especially when the number of rows is not evenly divisible by the number of processes.

The inspiration for this approach comes from here (building the send_counts and displacement arrays): https://gist.github.com/ehamberg/1263868. I understand the approach, but I wonder whether this implementation only works for 2D arrays (matrices) whose rows divide evenly among the processes.
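To make concrete what I am after, here is a rough sketch (not my actual code) of the kind of send_counts/displacements computation I think I need. It assumes comm_sz, ROW and COL from the code below, gives each rank a whole number of rows, and counts in elements rather than bytes:

// Illustrative sketch: distribute whole rows of a ROW x COL float matrix
// over comm_sz ranks, counting in MPI_FLOAT elements.
int *send_counts = (int *)malloc(comm_sz * sizeof(int));
int *displs      = (int *)malloc(comm_sz * sizeof(int));

int base  = ROW / comm_sz;   // rows every rank gets
int extra = ROW % comm_sz;   // the first 'extra' ranks get one more row
int offset = 0;

for (int i = 0; i < comm_sz; i++) {
    int rows       = base + (i < extra ? 1 : 0);
    send_counts[i] = rows * COL;   // counts are in elements of the send type
    displs[i]      = offset;       // so are displacements (not bytes)
    offset        += send_counts[i];
}

The first ROW % comm_sz ranks would each absorb one extra row, so row counts that do not divide evenly are still covered.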

Questions: Could the problem have something to do with the 2D array not being contiguous in memory?

Should the displacements be expressed in terms of the memory size of the data type in question (i.e. should my displacements be multiples of 4 because a float is 4 bytes of memory)?

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>    // std::string, getline
#include <cstdlib>   // strtof, malloc
#include "mpi.h"
#include <stdio.h>


#define ROW 75 
#define COL 5

void importData(std::string str, float (*dest)[75][5], int length) {

    std::ifstream infile(str);

    int i = 0;
    int j = 0;

    std::string a;

    while (getline(infile, a)) {

        std::stringstream ss(a);
        std::string token;
        i = 0;

        while (getline(ss, token, ',')) {

            if (i < length) {
                (*dest)[i][j] = strtof(token.c_str(), NULL);
            }
            else {
                i++;
            }

            j++;
        }
    }
}


int main(int argc, char **argv)
{
    float iris[75][5] = { {} };

    importData("Iris.test", &iris, 5);

    int rank, comm_sz;
    int sum = 0;
    int rem = (ROW*COL) % comm_sz;

    int * send_counts;
    int * displs;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    int row[1000];

    send_counts = (int *)malloc(sizeof(float)*comm_sz);
    displs = (int *)malloc(sizeof(float)*comm_sz);

    // calculate send counts and displacements
    for (int i = 0; i < comm_sz; i++) {
        send_counts[i] = (ROW*ROW) / comm_sz;
        if (rem > 0) {
            send_counts[i]++;
            rem--;
        }

        displs[i] = sum;
        sum += send_counts[i];
    }

    if (rank == 0){

    }

    // Scatter the big table to everybody's little table, scattering the rows
    MPI_Scatterv(iris, send_counts, displs, MPI_FLOAT, row, 100, MPI_FLOAT, 0,
                 MPI_COMM_WORLD);
    //                              displacements      recv buffer, recv count
    std::cout << "%d: " << rank << std::endl;

    for (int i = 0; i < send_counts[rank]; i++) {
        std::cout << "%f\t" << row[i] << std::endl;
    }

    MPI_Finalize();
}

I expect each of the "n" processes to print out its portion of the rows of the passed array.

This is the error I get:

An error occurred in MPI_Scatterv reported by process [2187067393,0] on communicator MPI_COMM_WORLD MPI_ERR_TRUNCATE: message truncated MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)

Note: the data file is 75 lines with 5 float numbers on each line, comma-delimited.

comfychair
  • A 2D array that you are sending via MPI should be implemented as a 1D array. Also that's not a [mcve], please reduce this to the actual problem (even with 0 values, it's VERY easy to write such things). – Matthieu Brucher Apr 03 '19 at 09:02
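As an aside, a statically sized `float iris[75][5]` is already one contiguous block, but a minimal sketch of the flat 1D storage that comment suggests (illustrative names, not taken from the thread) could look like this:

// Illustrative sketch only: keep the matrix in one flat, contiguous buffer
// so the whole thing can be handed to MPI_Scatterv directly.
float *data = (float *)malloc(ROW * COL * sizeof(float));
// element (r, c) of the logical 2D array lives at data[r * COL + c]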

1 Answer


The issue is that you have non-matching signatures between what you send (e.g. sendcounts and sendtype) and what you receive (e.g. recvcount and recvtype).

In your case, since you receive (hard coded) 100 MPI_FLOAT and you send MPI_FLOAT, it can only work if all send_counts[i] == 100.

I guess the right fix is to change the value of recvcount. On rank i, it should have the same value as send_counts[i] on the root rank (e.g. rank 0 in your case).
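A minimal sketch of that change (not part of the original answer), assuming send_counts and displs are computed identically on every rank, as in the posted code:

// recvcount now matches what the root actually sends to this rank
MPI_Scatterv(iris, send_counts, displs, MPI_FLOAT,
             row, send_counts[rank], MPI_FLOAT,   // was a hard-coded 100
             0, MPI_COMM_WORLD);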

Gilles Gouaillardet
  • I can replace the hard coded 100 with the statement: send_counts[0]. Would that do it? – comfychair Apr 03 '19 at 01:09
  • that would only work on rank `0`. `send_counts[rank]` should do the trick. – Gilles Gouaillardet Apr 03 '19 at 01:19
  • That resulted in a segmentation fault (address not mapped to object). I checked the contents of send_counts and I have ridiculous numbers in there (5626, 10972, etc.). It seems this algorithm for send_counts isn't working for me – comfychair Apr 03 '19 at 01:30
  • if you post a [MCVE], I will have a look – Gilles Gouaillardet Apr 03 '19 at 02:09
  • @GillesGouaillardet I'm curious - do you happen to know why type signatures in `Scatter` & `Gather` are explicitly required to match even though the operations are described in terms of point-to-point operations, which explicitly allow larger receive buffers than send counts? – Zulan Apr 03 '19 at 09:26
  • I think the rationale is to allow some optimizations by the MPI implementor. For example with `MPI_Scatter()`, one task might decide to issue 2 `MPI_Recv()` based on `recvcount&recvtype`, but the sender might decide to issue a single `MPI_Send()` based on non matching `sendcount&sendtype`. – Gilles Gouaillardet Apr 03 '19 at 10:02
  • 1
    I found a question that addresses my issue in part https://stackoverflow.com/questions/9269399/sending-blocks-of-2d-array-in-c-using-mpi I think my answer is in there. If I get something together I will post – comfychair Apr 04 '19 at 01:13
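For readers following that link, a rough sketch of the row-based variant it leads to: define a derived datatype for one row, so that send_counts and displs are expressed in rows rather than individual floats. Here local_rows is a hypothetical receive buffer and the counts are assumed to hold rows per rank, not element counts as in the posted code:

// Illustrative sketch: one send/recv unit = one row of COL floats
MPI_Datatype row_type;
MPI_Type_contiguous(COL, MPI_FLOAT, &row_type);
MPI_Type_commit(&row_type);

// send_counts[i] = number of rows for rank i, displs[i] = starting row index
MPI_Scatterv(iris, send_counts, displs, row_type,
             local_rows, send_counts[rank], row_type,
             0, MPI_COMM_WORLD);

MPI_Type_free(&row_type);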