I'm running into strange behavior when trying to use MPI_Gatherv, and I must be misunderstanding something basic about the function. To demonstrate, I've simplified my code down to a toy problem that is hopefully easy to understand without much context.

  int * gatherv_counts = (int *)malloc(2*sizeof(int)); 
  int * gatherv_displacements = (int *)malloc(2*sizeof(int)); 
  double * receive_buffer = (double *)malloc(10*sizeof(double)); 

  for(int i = 0; i < 2; i++)
  {
    gatherv_counts[i] = 5;
    gatherv_displacements[i] = 0;
  }

  double * data = (double *)malloc(5*sizeof(double));
  for(int i = 0; i < 5; i++)
  {
      data[i] = (double)(mpirank + 2);
  }

  int mpiret = MPI_Gatherv( data, 5, MPI_DOUBLE, receive_buffer, gatherv_counts, gatherv_displacements, MPI_DOUBLE, 0, mpicomm); 

  if (mpirank == 0) {

    FILE *file = fopen("output.txt", "a");
    
    for (int i = 0; i < 10; i++) {
      fprintf (file, "%16.15e \n", receive_buffer[i]);
    }        
    
    fflush(file);
    fclose(file);
  }

I'm running this with 2 MPI processes. Each process sends just 5 doubles (for a combined total of 10), with every value set to rank+2. The counts are hardcoded to 5, and the displacements are all 0. It's about as simple a usage of MPI_Gatherv as I can imagine, and I'd expect receive_buffer to be [2,2,2,2,2,3,3,3,3,3] after completion.

Instead, the receive_buffer is [3,3,3,3,3, 0,0,4.940656458412465e-324, 1.182342568274937e-316, 1.146400746224656e+248 ]. It seems to have completely skipped the data from rank 0 (the root of MPI_Gatherv), taken the data from rank 1, and left the remaining space filled with garbage. Could someone please explain what is going wrong here?

Also, for the record, I've seen a number of similarly titled questions, but these do not appear to be the same problem:

MPI_Gatherv is not collecting data correctly

MPI_Gatherv: Garbage values received in root's array

Use of MPI_GATHERV where the "root" has no send buffer

Elmo
  • `displacements[1] = displacements[0] + counts[0]` – Victor Eijkhout Jul 27 '21 at 03:28
  • @VictorEijkhout oh of course. I'm an idiot. For some reason I was assuming displacements were relative to the data from the previous process. Thanks so much! If you write this as an answer I'll accept it. – Elmo Jul 27 '21 at 03:56

1 Answer

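All of your displacements are zero. Each displacement is measured from the start of the receive buffer, in units of the receive type, not from the end of the previous rank's data. With every displacement at 0, both ranks' blocks land at offset 0, and the second half of the buffer is never written. Do:

displacements[1] = displacements[0] + counts[0]

and similarly for higher ranks: displacements[i] = displacements[i-1] + counts[i-1].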
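The rule above generalizes to any number of ranks as a running sum over the counts. The following is a minimal sketch of that idea in plain C, with no MPI calls so it can be checked on its own; `compute_displacements` and its parameter names are hypothetical and not part of MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of MPI.
 * Fills displs[] with the exclusive prefix sum of counts[], so the block
 * from rank i starts right after the block from rank i-1.
 * Returns the total element count, which is the minimum length the
 * receive buffer needs when the blocks are packed back to back. */
int compute_displacements(const int *counts, int *displs, int nranks)
{
    assert(counts != NULL && displs != NULL && nranks >= 0);

    int offset = 0;
    for (int i = 0; i < nranks; i++) {
        displs[i] = offset;   /* rank i writes starting at this element */
        offset += counts[i];  /* the next rank starts after rank i's block */
    }
    return offset;
}
```

For the two ranks in the question, each with a count of 5, this yields displacements 0 and 5 and a total of 10. Those values would take the place of the all-zero `gatherv_displacements` passed to `MPI_Gatherv` on the root.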

Victor Eijkhout