
I have a nested loop, and from inside it I call MPI_Send to send a specific value to a receiver. The receiver should take that data and, in turn, send MPI messages to another set of CPUs. I used something like the code below, but there seems to be a problem in the receive and I can't see where I went wrong: the program gets stuck in what looks like an infinite loop somewhere.

I am trying to make it work like this: master CPU >> send to other CPUs >> send to slave CPUs

 . 
 . 
 . 

 int currentCombinationsCount; 
 int mp; 

 if (rank == 0)
 {


     for (int pr = 0; pr < combinationsSegmentSize; pr++)
     {
         int CblockBegin = CombinationsSegementsBegin[pr];
         int CblockEnd   = CombinationsSegementsEnd  [pr];
         currentCombinationsCount = numOfCombinationsEachLoop[pr]; 
         prossessNum = 1; //specify which processor we are sending to 

         // now substitute and send to the main Processors 
         for (mp = CblockBegin; mp <= CblockEnd; mp++)
         {

             MPI_Send(&mp , 1, MPI_INT   , prossessNum, TAG, MPI_COMM_WORLD);

             prossessNum ++; 
         }

     }//this loop goes through all the specified blocks for the combinations  
 } // end of rank 0
 else if (rank > currentCombinationsCount)
 {
       // here I want to put other receives that will take values from the else below 
 }
 else 
 {
     MPI_Recv(&mp , 1, MPI_INT   , 0, TAG, MPI_COMM_WORLD, &stat);
     // the code stuck here in infinite loop 
 }
SOSO

1 Answer


You've only initialised currentCombinationsCount within the if(rank==0) branch so all other procs will see an uninitialised variable. That will result in undefined behaviour and the outcome depends on your compiler. Your program may crash or the value may be set to 0 or an undetermined value.

If you're lucky, the value may be set to 0 in which case your branch reduces to:

if (rank == 0) {  /* rank == 0 will enter this */ }
else if (rank > 0) { /* all other procs enter this */ }
else { /* never entered! Recvs are never called to match the sends */ }

You therefore end up with sends that are not matched by any receives. Since MPI_Send is potentially blocking, the sending proc may stall indefinitely. With procs blocking on sends, it can certainly look as though "...the machine goes to infinite loop somewhere...".
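
To see how many receives would be needed to match the sends, the master's nested loop can be replayed without MPI. The following is only a sketch: `tally_sends` and its parameters are hypothetical names, and it assumes the destination counter is reset to 1 for every segment, as `prossessNum` is in the question.

```c
#include <assert.h>

/*
 * Replays the master's nested loop from the question without MPI and
 * tallies how many messages each destination rank would be sent.
 * sends_per_rank must have room for nprocs entries. Returns the number
 * of sends aimed at ranks >= nprocs, i.e. ranks that do not exist.
 * (Hypothetical helper, not part of the original program.)
 */
int tally_sends(const int *begin, const int *end, int nsegments,
                int *sends_per_rank, int nprocs)
{
    int out_of_range = 0;

    assert(nprocs >= 1);
    for (int r = 0; r < nprocs; r++)
        sends_per_rank[r] = 0;

    for (int pr = 0; pr < nsegments; pr++) {
        int dest = 1;   /* mirrors prossessNum = 1 at the top of each segment */
        for (int mp = begin[pr]; mp <= end[pr]; mp++) {
            if (dest < nprocs)
                sends_per_rank[dest]++;
            else
                out_of_range++;
            dest++;
        }
    }
    return out_of_range;
}
```

With made-up segments 0..2 and 3..4 on 4 procs, ranks 1 and 2 would each be sent two messages and rank 3 one, so each of those ranks would have to post that many receives for every send to be matched.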

If currentCombinationsCount is given an arbitrary value (instead of 0) then rank!=0 procs will enter arbitrary branches (with a higher chance of all entering the final else). You then end up with a second set of receives not being called, resulting in the same issue as above.
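
Both cases can be checked without MPI by replaying the branch conditions for each rank. This is a minimal sketch; the enum labels and the helpers `branch_for_rank` and `count_receivers` are hypothetical names introduced for illustration, and the count is passed in explicitly to stand for whatever value a given proc happens to see.

```c
#include <assert.h>

/* Hypothetical labels for the three branches in the question's code. */
enum branch { BRANCH_MASTER, BRANCH_SECOND_STAGE, BRANCH_RECEIVER };

/* Mirrors the question's if / else-if / else chain for a single rank. */
enum branch branch_for_rank(int rank, int currentCombinationsCount)
{
    if (rank == 0)
        return BRANCH_MASTER;
    else if (rank > currentCombinationsCount)
        return BRANCH_SECOND_STAGE;
    else
        return BRANCH_RECEIVER;     /* the only branch that calls MPI_Recv */
}

/* Counts how many of the ranks 0..nprocs-1 would reach the MPI_Recv branch. */
int count_receivers(int nprocs, int currentCombinationsCount)
{
    int n = 0;

    assert(nprocs >= 1);
    for (int rank = 0; rank < nprocs; rank++)
        if (branch_for_rank(rank, currentCombinationsCount) == BRANCH_RECEIVER)
            n++;
    return n;
}
```

For 8 procs, a count of 0 leaves no rank in the receiving branch, while a count of 3 puts ranks 1 to 3 there. The branches only line up with the sends if every proc evaluates the condition with the same, intended value.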

Shawn Chin
  • Thank you Shawn for the answer, but I am not using currentCombinationsCount in anything yet ... I want to make sure that mp arrives with the desired value in each iteration, so I commented out currentCombinationsCount inside the loop and the program still gets stuck somewhere ... do you think there could be a logical problem? – SOSO Jan 27 '13 at 09:16
  • You are using it in your branch conditions, so yes it appears to be a problem with your program logic. – Shawn Chin Jan 29 '13 at 19:08