
I am trying to distribute the rows of two 2D matrices using MPI_Iscatter(), but I am getting this error: mpirun noticed that process rank 1 with PID 0 on node ***PC exited on signal 11 (Segmentation fault).

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <mpi.h>

int P;
int N = 1024;
int main(int argc, char *argv[]){

  MPI_Init(&argc, &argv);
  int i, j, k, rank, size;
  double start, end, total;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Request request[2];
  P = size;
  float A_row [N];
  float B_col [N];

  float matrix_A[N][N];
  float matrix_BT[N][N];

  if(rank == 0){
    double wall_time;
    for(i = 0; i < N; i++)
      for (j = 0; j < N; j++)
        matrix_A[i][j] = -1+2*((float)rand())/RAND_MAX;

    for(i = 0; i < N; i++)
      for (j = 0; j < N; j++)
        matrix_BT[i][j] = -1+2*((float)rand())/RAND_MAX;

  }
  start = MPI_Wtime();

  printf("Root processor Scatter is started for diagonal elements...\n");
  for(i = 0; i < N/P ; i += P){
    MPI_Iscatter(matrix_A[2+rank + i], N, MPI_FLOAT, A_row, N, MPI_FLOAT, 0, MPI_COMM_WORLD, &request[0]);
    MPI_Iscatter(matrix_BT[2+rank + i], N, MPI_FLOAT, B_col, N, MPI_FLOAT, 0, MPI_COMM_WORLD, &request[1]);
    MPI_Waitall(2,request, MPI_STATUSES_IGNORE);
    printf("Processor %d has recived the Scatter A & B elements...\n", rank);
  }

  end = MPI_Wtime();
  printf("Total Time: %f\n", end - start);



  MPI_Finalize();
}

1 Answer


If I compile and run your code, I get a segmentation fault even if I comment out everything except:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <mpi.h>

int P;
int N = 1024;

int main(int argc, char *argv[]){


  MPI_Init(&argc, &argv);
  int i, j, k, rank, size;
  double start, end, total;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Request request[2];
  P = size;
  float A_row [N];
  float B_col [N];

  float matrix_A[N][N];
  float matrix_BT[N][N];

  MPI_Finalize();
}

Do you get the same? I think you are hitting the stack memory limit: the two N×N float matrices alone take 2 × 1024 × 1024 × 4 bytes = 8 MiB, which already matches the common 8 MiB default stack limit before A_row, B_col, and the rest of the stack frame are counted. You should allocate your matrices on the heap instead.
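
For illustration, a minimal sketch of what that could look like. Each matrix becomes one contiguous malloc'd block indexed as i*N + j, so row i still starts at matrix_A + i*N and can be handed to MPI exactly like matrix_A[i] before; the names match your code, and error handling is kept minimal:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int N = 1024;

int main(int argc, char *argv[]){

  MPI_Init(&argc, &argv);
  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  /* Per-row buffers are only 4 KiB each, but heap is used throughout
     so the stack footprint no longer depends on N. */
  float *A_row = malloc(N * sizeof *A_row);
  float *B_col = malloc(N * sizeof *B_col);

  /* One contiguous block per matrix; element (i, j) is matrix_A[i*N + j],
     and row i starts at matrix_A + i*N, which is what MPI_Iscatter needs. */
  float *matrix_A  = malloc((size_t)N * N * sizeof *matrix_A);
  float *matrix_BT = malloc((size_t)N * N * sizeof *matrix_BT);

  if (!A_row || !B_col || !matrix_A || !matrix_BT) {
    fprintf(stderr, "rank %d: allocation failed\n", rank);
    MPI_Abort(MPI_COMM_WORLD, 1);
  }

  /* ... fill and scatter as before, passing matrix_A + row*N ... */

  free(A_row);
  free(B_col);
  free(matrix_A);
  free(matrix_BT);
  MPI_Finalize();
}

With N = 1024 each matrix is only 4 MiB, which is trivial for malloc, and the program's stack usage stays small no matter how large N grows.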

Joe Todd
  • Hi, did you try allocating dynamically to see what happens? – dreamcrash Nov 30 '20 at 16:57
  • I think I did, and got the same result, but my malloc usage could be wrong. The weird thing is that when I run it on my PC I get the segmentation fault, but when I run it on the remote HPC system this part doesn't segfault. – Mahan Agha Zahedi Nov 30 '20 at 23:24
  • @joetodd would you explain where dynamic vs static memory allocations are physically located? – Mahan Agha Zahedi Nov 30 '20 at 23:26
  • Lots of info online about stack vs heap. As to why you get different behaviour on different systems, that's probably because the stack size limit varies between machines; it may well be that your HPC system has a larger stack size. If you want to quickly confirm that this bug is due to the stack limit, try dropping N to 256 or so, or print the limit directly (see the sketch below). – Joe Todd Dec 01 '20 at 10:48
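
If it helps, here is a small standalone check of the limit in question, assuming a POSIX system (the POSIX getrlimit call; ulimit -s in the shell reports the same soft limit in KiB):

#include <stdio.h>
#include <sys/resource.h>

int main(void){
  struct rlimit rl;
  /* RLIMIT_STACK is the per-process stack size limit. */
  if (getrlimit(RLIMIT_STACK, &rl) != 0) {
    perror("getrlimit");
    return 1;
  }
  if (rl.rlim_cur == RLIM_INFINITY)
    printf("stack limit: unlimited\n");
  else
    printf("stack limit: %llu bytes\n", (unsigned long long)rl.rlim_cur);
  return 0;
}

If this prints 8388608 (8 MiB) on your PC but a larger value, or unlimited, on the HPC system, that would explain the difference in behaviour.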