I'm trying to combine two arrays (each of length n) into the receive buffer on the root process (rank 0) to form an array of length 2*n, i.e. a single array containing all the values.
For brevity, my code resembles the following:
#define ROOT 0
int myFunction(int* rBuf, int n) {
    int* sBuf = malloc(n*sizeof(int));
    // Do work, calculate offset, count etc.
    MPI_Reduce(sBuf, rBuf+offset[rank], counts[rank],
               MPI_INT, MPI_SUM, ROOT, MPI_COMM_WORLD);
}
// where offset[rank] is the position in rBuf at which that rank's data should be received
// offset[0] = 0, offset[1] = n
// counts contains the length of the array on each process
However, when I check rBuf on the root, the values have been reduced into the start of rBuf with no offset applied, for example:
// Rank 0: sBuf = {3, 2}
// Rank 1: sBuf = {5, 1}
// Should be rBuf = {3, 2, 5, 1}
rBuf = {8, 3, 0, 0}
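For what it's worth, the values I get look like a plain element-wise sum landing at the start of rBuf. Here is a minimal plain-C sketch with no MPI in it (sum_into and observed_result are made-up names; it only mimics the arithmetic and is not a claim about how MPI_Reduce works internally):

```c
#include <stddef.h>

/* Hypothetical helper, NOT an MPI call: adds count elements of src
   into dst, element by element. */
static void sum_into(int *dst, const int *src, int count) {
    for (int i = 0; i < count; ++i) {
        dst[i] += src[i];
    }
}

/* Fills out[0..3] with the result of summing both example arrays
   element-wise into the start of a zeroed buffer (no offset). */
static void observed_result(int out[4]) {
    const int rank0[2] = {3, 2};   /* sBuf on rank 0 in the example */
    const int rank1[2] = {5, 1};   /* sBuf on rank 1 in the example */
    for (size_t i = 0; i < 4; ++i) {
        out[i] = 0;                /* rBuf starts as all zeros */
    }
    sum_into(out, rank0, 2);
    sum_into(out, rank1, 2);
}
```

Calling observed_result on a 4-element array gives 8, 3, 0, 0, which is exactly what I see in rBuf.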
Additional info:
- rBuf is allocated at the correct size and filled with 0s before the reduce
- All processes have the offset array
- The reason for using MPI_Reduce was that, with rBuf zeroed beforehand, I expected a reduce with MPI_SUM to give the needed result
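To make that last point concrete, this is the arithmetic I had in mind, written as a plain-C sketch with no MPI (padded_contribution, elementwise_sum and padded_sum_demo are made-up names). It assumes every process contributes a full-length, zero-padded array with its own values placed at its offset, which is not what my actual call does:

```c
#include <string.h>

/* Hypothetical helper: writes a zero-padded array of length total into dst,
   with count values from src copied in starting at offset. */
static void padded_contribution(int *dst, int total,
                                const int *src, int count, int offset) {
    memset(dst, 0, (size_t)total * sizeof *dst);
    memcpy(dst + offset, src, (size_t)count * sizeof *src);
}

/* Hypothetical helper: element-wise acc[i] += contrib[i] over total elements. */
static void elementwise_sum(int *acc, const int *contrib, int total) {
    for (int i = 0; i < total; ++i) {
        acc[i] += contrib[i];
    }
}

/* Sums the two padded contributions from the example into out[0..3]. */
static void padded_sum_demo(int out[4]) {
    const int rank0[2] = {3, 2};
    const int rank1[2] = {5, 1};
    int contrib[4];

    memset(out, 0, 4 * sizeof *out);           /* zeroed result buffer */
    padded_contribution(contrib, 4, rank0, 2, 0);  /* offset[0] = 0 */
    elementwise_sum(out, contrib, 4);
    padded_contribution(contrib, 4, rank1, 2, 2);  /* offset[1] = n = 2 */
    elementwise_sum(out, contrib, 4);
}
```

With the example values this yields 3, 2, 5, 1, because each position receives exactly one non-zero contribution.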
I've looked through the documentation, some online tutorials/guides and, of course, SO, and I still can't figure out what I'm doing wrong.
For an answer, I'm specifically looking for:
- Is this technically possible using MPI_Reduce?
- Is my MPI_Reduce call correct? (error in pointer arithmetic?)
- Is this feasible/good practice with MPI, or is there a better approach?
Thanks