
I have written an MPI code in C++ for my Raspberry Pi cluster, which generates an image of the Mandelbrot Set. Each node (excluding the master, processor 0) calculates part of the Mandelbrot Set, so each node ends up with a 2D array of ints that indicates whether each xy point is in the set.

It appears to work well on each node individually, but when all the arrays are gathered to the master using this command: MPI_Gather(&inside, 1, MPI_INT, insideFull, 1, MPI_INT, 0, MPI_COMM_WORLD); the data gets corrupted and the result is an array full of garbage. (inside is each node's 2D array holding its part of the set; insideFull is also a 2D array, but it holds the whole set.) Why would it be doing this?

(This led me to wonder whether it is corrupting the data because the master isn't sending its own array to itself (or at least I don't want it to). So part of my question is also: is there an MPI_Gather variant that doesn't send anything from the root process and just collects from everything else?)

Thanks

EDIT: here's the whole code. If anyone can suggest a better way of transferring the arrays, please say.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>


// ONLY USE MULTIPLES OF THE NUMBER OF SLAVE PROCESSORS
#define ImageHeight  128
#define ImageWidth   128

double MinRe = -1.9;
double MaxRe = 0.5;
double MinIm = -1.2;
double MaxIm = MinIm + (MaxRe - MinRe)*ImageHeight / ImageWidth;

double Re_factor = (MaxRe - MinRe) / (ImageWidth - 1);
double Im_factor = (MaxIm - MinIm) / (ImageHeight - 1);
unsigned n;

unsigned MaxIterations = 50;

int red;
int green;
int blue;

// MPI variables ****
int processorNumber;
int processorRank;
//*******************//

int main(int argc, char** argv) {

  // Initialise MPI
  MPI_Init(NULL, NULL);

  // Get the number of processors
  MPI_Comm_size(MPI_COMM_WORLD, &processorNumber);

  // Get the rank of this processor
  MPI_Comm_rank(MPI_COMM_WORLD, &processorRank);

  // Get the name of this processor
  char processorName[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processorName, &name_len);

  // A barrier just to sync all the processors, make timing more accurate
  MPI_Barrier(MPI_COMM_WORLD);

  // Make an array that stores whether each point is in the Mandelbrot Set
  int inside[ImageWidth / processorNumber][ImageHeight / processorNumber];

  if(processorRank == 0) {
    printf("Generating Mandelbrot Set\n");
  }

  // We don't want the master to process the Mandelbrot Set, only the slaves
  if(processorRank != 0) {
    // Determine which coordinates to test on each processor
    int xMin = (ImageWidth / (processorNumber - 1)) * (processorRank - 1);
    int xMax = ((ImageWidth / (processorNumber - 1)) * (processorRank - 1)) - 1;
    int yMin = (ImageHeight / (processorNumber - 1)) * (processorRank - 1);
    int yMax = ((ImageHeight / (processorNumber - 1)) * (processorRank - 1)) - 1;

    // Check each value to see if it's in the Mandelbrot Set
    for (int y = yMin; y <= yMax; y++) {
      double c_im = MaxIm - y*Im_factor;
      for (int x = xMin; x <= xMax; x++) {
        double c_re = MinRe + x*Re_factor;
        double Z_re = c_re, Z_im = c_im;
        int isInside = 1;
        for (n = 0; n <= MaxIterations; ++n) {
          double Z_re2 = Z_re * Z_re, Z_im2 = Z_im * Z_im;
          if (Z_re2 + Z_im2 > 10) {
            isInside = 0;
            break;
          }
          Z_im = 2 * Z_re * Z_im + c_im;
          Z_re = Z_re2 - Z_im2 + c_re;
        }
        if (isInside == 1) {
          inside[x][y] = 1;
        }
        else{
          inside[x][y] = 0;
        }
      }
    }
  }

  // Wait for all processors to finish computing
  MPI_Barrier(MPI_COMM_WORLD);

  int insideFull[ImageWidth][ImageHeight];

  if(processorRank == 0) {
    printf("Sending parts of set to master\n");
  }

  // Send all the arrays to the master
  MPI_Gather(&inside[0][0], 1, MPI_INT, &insideFull[0][0], 1, MPI_INT, 0, MPI_COMM_WORLD);

  // Output the data to an image
  if(processorRank == 0) {
    printf("Generating image\n");
    FILE * image = fopen("mandelbrot_set.ppm", "wb");
    fprintf(image, "P6 %d %d 255\n", ImageHeight, ImageWidth);
    for(int y = 0; y < ImageHeight; y++) {
      for(int x = 0; x < ImageWidth; x++) {
        if(insideFull[x][y]) {
          putc(0, image);
          putc(0, image);
          putc(255, image);
        }
        else {
          putc(0, image);
          putc(0, image);
          putc(0, image);
        }
        // Just to see what values return, no actual purpose
        printf("%d, %d, %d\n", x, y, insideFull[x][y]);
      }
    }
    fclose(image);
    printf("Complete\n");
  }

  MPI_Barrier(MPI_COMM_WORLD);

  // Finalise MPI
  MPI_Finalize();
}
  • Looks like you're looking for some documentation. Google a little bit and have a look at this: http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node70.html. Afterwards, post a little bit more code if it still doesn't work. – Christophe Jul 06 '14 at 10:27
  • MPI_Gatherv? But isn't that just to allow an offset in the resultant array between each processor's array? I don't need any of that, just the root process not sending to itself. –  Jul 06 '14 at 10:38
  • If `inside` is a 2D array, you could try `MPI_Gather(&inside[0][0], nx*ny, MPI_INT, &insideFull[0][0], nx*ny, MPI_INT, 0, MPI_COMM_WORLD);`. If it does not work, could you tell us more about the way you allocated these arrays? If nx or ny are different on each node, you will have to use `MPI_Gatherv()`. Bye – francis Jul 06 '14 at 12:51
  • No, it still doesn't work... I'll post the whole code so you can see. –  Jul 06 '14 at 12:57
  • Interesting... I just wrote another unrelated program that has to move arrays, and it resulted in the same garbage output. –  Jul 06 '14 at 14:04
  • There are a number of Q&As here which refer to using gather/scatter for 2D subarrays - [here is a fairly lengthy one](http://stackoverflow.com/a/9271753/463827) to get you started. – Jonathan Dursi Jul 06 '14 at 21:39

1 Answer


You call MPI_Gather with the following parameters:

  const void* sendbuf : &inside[0][0]      Starting address of send buffer
  int sendcount : 1                        Number of elements in send buffer
  const MPI::Datatype& sendtype : MPI_INT  Datatype of send buffer elements
  void* recvbuf : &insideFull[0][0]        Starting address of receive buffer (significant only at root)
  int recvcount : 1                        Number of elements for any single receive
  const MPI::Datatype& recvtype : MPI_INT  Datatype of receive buffer elements
  int root : 0                             Rank of receiving process
  MPI_Comm comm : MPI_COMM_WORLD           Communicator (handle).

Sending/receiving only one element is not sufficient. Instead of 1, use

 (ImageWidth / processorNumber)*(ImageHeight / processorNumber) 
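
For example, with the chunk dimensions pulled out into named variables (chunkW and chunkH are just illustrative names introduced here, and this sketch assumes every rank, including the root, contributes a chunk of that size), the corrected call could look roughly like:

  int chunkW = ImageWidth / processorNumber;   // width of each rank's piece
  int chunkH = ImageHeight / processorNumber;  // height of each rank's piece

  // every rank sends chunkW*chunkH ints; the root receives that many from each
  // rank, packed one after another in rank order in the receive buffer
  MPI_Gather(&inside[0][0], chunkW * chunkH, MPI_INT,
             &insideFull[0][0], chunkW * chunkH, MPI_INT,
             0, MPI_COMM_WORLD);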

Then think about the different memory layout of your source and target 2D arrays:

 int inside[ImageWidth / processorNumber][ImageHeight / processorNumber];

vs.

int insideFull[ImageWidth][ImageHeight];

As the copy is a plain memory block copy, and not an intelligent 2D array copy, all your source integers will be transferred contiguously to the target address, regardless of the different row lengths of the two arrays.

I'd recommend sending the data first into an array of the same size as the source, and then, in the receiving process, copying the elements to the right rows and columns of the full array, for example with a small function like:

// assemble2D():
// copies a source int sarr[sli][sco] into a destination int darr[dli][dco],
// starting at darr[doffli][doffco].
// Elements that fall out of bounds are ignored. Negative offsets are possible.
void assemble2D(int *darr, int dli, int dco, int *sarr, int sli, int sco, int doffli = 0, int doffco = 0)
{
    for (int i = 0; i < sli; i++)
        for (int j = 0; j < sco; j++)
            if ((i + doffli >= 0) && (j + doffco >= 0) && (i + doffli < dli) && (j + doffco < dco))
                darr[(i + doffli) * dco + (j + doffco)] = sarr[i * sco + j];
}
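
On the root, usage might then look roughly like the fragment below. This is only a sketch: staging, chunkW and chunkH are illustrative names, the other variables come from the question's main(), and it assumes rank r's tile is placed at row offset r*chunkW, column offset 0 of the full image (adjust the offsets to whatever mapping of ranks to regions you actually use):

  // root-side sketch: gather every rank's tile into a flat staging buffer,
  // then place each tile into the full image with assemble2D()
  int chunkW = ImageWidth / processorNumber;
  int chunkH = ImageHeight / processorNumber;
  int *staging = (int*) malloc(processorNumber * chunkW * chunkH * sizeof(int));

  // all ranks must take part in the collective call
  MPI_Gather(&inside[0][0], chunkW * chunkH, MPI_INT,
             staging, chunkW * chunkH, MPI_INT, 0, MPI_COMM_WORLD);

  if (processorRank == 0) {
      for (int r = 0; r < processorNumber; r++) {
          int *tile = staging + r * chunkW * chunkH;   // rank r's chunk in the staging buffer
          assemble2D(&insideFull[0][0], ImageWidth, ImageHeight,
                     tile, chunkW, chunkH,
                     r * chunkW, 0);
      }
  }
  free(staging);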
Christophe
  • Yes, that sounds like a better idea. –  Jul 06 '14 at 14:45
  • Using MPI scatter and gather to work with 2D subarrays is definitely possible by carefully constructing a special MPI datatype. This is a somewhat involved procedure - see the comment by Jonathan Dursi that contains a reference to his excellent answer on how to do it. – Hristo Iliev Jul 07 '14 at 11:10
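
For reference, the subarray-datatype approach mentioned in the comments by Jonathan Dursi and Hristo Iliev would look roughly like the sketch below. It is only an illustration, not something from the thread: chunkW and chunkH are invented names, and it assumes rank r's chunkW x chunkH tile belongs at row offset r*chunkW, column offset 0 of the full image, the same placement as in the assemble2D() example; a different decomposition only changes the subsizes and displacements.

  // sketch: gather each rank's contiguous tile straight into its place in the
  // full array, using a subarray datatype on the receive side
  int chunkW = ImageWidth / processorNumber;
  int chunkH = ImageHeight / processorNumber;

  int fullsizes[2] = { ImageWidth, ImageHeight };   // shape of insideFull
  int subsizes[2]  = { chunkW, chunkH };            // shape of one tile
  int starts[2]    = { 0, 0 };

  MPI_Datatype block, tiletype;
  MPI_Type_create_subarray(2, fullsizes, subsizes, starts, MPI_ORDER_C, MPI_INT, &block);
  // resize the extent to one int so displacements can be given in units of ints
  MPI_Type_create_resized(block, 0, sizeof(int), &tiletype);
  MPI_Type_commit(&tiletype);

  int *counts = (int*) malloc(processorNumber * sizeof(int));
  int *displs = (int*) malloc(processorNumber * sizeof(int));
  for (int r = 0; r < processorNumber; r++) {
      counts[r] = 1;                          // one tile from each rank
      displs[r] = r * chunkW * ImageHeight;   // flat index where rank r's tile starts
  }

  // each rank sends its tile as plain ints; the root receives each one as a
  // 2D block placed at the right position inside insideFull
  MPI_Gatherv(&inside[0][0], chunkW * chunkH, MPI_INT,
              &insideFull[0][0], counts, displs, tiletype,
              0, MPI_COMM_WORLD);

  MPI_Type_free(&tiletype);
  MPI_Type_free(&block);
  free(counts);
  free(displs);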