
The code allocates memory for a matrix and uses non-blocking calls to send the matrix from rank 0 to rank 1. It works fine for a smaller matrix size (1024), but it results in a segmentation fault with a larger size (16384). Below is the code:

    double **A;
    int i,j,size,rankid,rankall;
    size = 16384;
    MPI_Request reqr,reqs;
    MPI_Status star,stas;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD,&rankall);
    MPI_Comm_rank(MPI_COMM_WORLD,&rankid);
    A = (double**)calloc(size,sizeof(double*));
    for(i=0;i<size;i++){
            A[i] = (double *)calloc(size,sizeof(double));
            for(j=0;j<size;j++){
                    if(rankid ==0){
                            A[i][j] = 1;
                    }
            }
    }
    if(rankid ==0){
            MPI_Isend(&A[0][0],size*size,MPI_DOUBLE,1,1,MPI_COMM_WORLD,&reqs);
            MPI_Wait(&reqs,&stas);
    }
    if(rankid ==1){
            MPI_Irecv(&A[0][0],size*size,MPI_DOUBLE,0,1,MPI_COMM_WORLD,&reqr);
            MPI_Wait(&reqr,&star);
    }

    MPI_Finalize();

The debugger backtrace showed:

    #0 0x00007FFFF7947093 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
    #1 0x000000000043a5B0 in MPID_Segment_contig_m2m ()
    #2 0x00000000004322cb in MPID_Segment_manipulate ()
    #3 0x000000000043a?Ba in MPID_Segment_pack ()
    #4 0x000000000042BB99 in lmt_shm_send_progress ()
    #5 0x000000000042?e1F in MPID_nem_lmt_shm_start_send ()
    #6 0x0000000000425aFF in pkt_CTS_handler ()
    #7 0x000000000041Fb52 in MPIDI_CH3I_Progress ()
    #8 0x0000000000405Bc1 in MPIR_Wait_impl ()
    #9 0x000000000040594e in PMPI_Wait ()
    #10 0x0000000000402ea5 in main (argc=1,argv=0x7fffffffe4a8)
        at ./simpletest.c:26
  • 1
    Have you tried checking that the pointers returned by your calloc calls are not NULL? Each rank is allocating 2 GB for a 16k x 16k matrix, and you may be running out of memory (I don't know what your system is). – Dr.Tower Apr 10 '14 at 20:46
  • 2
    This comes up frequently here with C "multidimensional arrays" and MPI. (See for instance an answer of mine - http://stackoverflow.com/questions/5104847/mpi-bcast-a-dynamic-2d-array/5107489#5107489 - but there are many other examples). The problem is you're trying to send/receive `size*size` `MPI_DOUBLE`s from/to `&A[0][0]`, but it's extremely unlikely that this is how the allocated memory is actually laid out. You'll have to allocate the size*size doubles in one chunk, and then allocate the pointers into it - this gives you the memory layout you normally want anyway in numeric code. – Jonathan Dursi Apr 10 '14 at 21:09
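
The two comments above can be combined into a minimal sketch. It allocates all `size*size` doubles in one chunk, points the rows into that chunk, and checks every allocation for NULL. The helper names `alloc_matrix` and `free_matrix` are hypothetical and do not appear in the original post, and the sketch leaves the MPI calls out so that it compiles on its own.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): allocate a rows x cols
 * matrix whose elements live in ONE contiguous block, so that &A[0][0]
 * really points at rows*cols consecutive doubles. Returns NULL on failure. */
double **alloc_matrix(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0 || rows > SIZE_MAX / cols)
        return NULL;                        /* empty or overflowing request */

    double *data = calloc(rows * cols, sizeof *data);  /* all elements, zeroed */
    double **A = malloc(rows * sizeof *A);             /* row pointers */
    if (data == NULL || A == NULL) {
        free(data);                         /* free(NULL) is a no-op */
        free(A);
        return NULL;
    }
    for (size_t i = 0; i < rows; i++)
        A[i] = data + i * cols;             /* row i starts i*cols elements in */
    return A;
}

/* Release a matrix obtained from alloc_matrix. */
void free_matrix(double **A)
{
    if (A != NULL) {
        free(A[0]);                         /* the contiguous data block */
        free(A);                            /* the row-pointer array */
    }
}
```

With `A = alloc_matrix(size, size)` and a non-NULL result, the question's `MPI_Isend(&A[0][0], size*size, MPI_DOUBLE, ...)` and matching `MPI_Irecv` would address memory that actually holds `size*size` doubles. In the original code each row is a separate `calloc`, so only the first `size` doubles after `&A[0][0]` are guaranteed to belong to the program. For `size = 16384` the data block is 16384 × 16384 × 8 bytes = 2 GiB per rank, so the NULL check matters.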
