When I try to run this code, I get the message "DEADLOCK: attempting to send a message to the local process without a prior matching receive".

#include "pch.h"
#include <iostream>
#include <mpi.h>
using namespace std;

void main(int argc, char* argv[])
{   int ierr, procid, numprocs;
    ierr = MPI_Init(&argc, &argv);
    ierr = MPI_Comm_rank(MPI_COMM_WORLD, &procid);
    ierr = MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    // All procids send the value - procid to procid 0
    double val = -1.0 * procid;
    MPI_Send(&val, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    cout << "ProciD " << procid << " send value " << val << " to procid 0.\n";
    if (procid == 0)
    {
        // procid 0 must receive numprocs values
        int i; double val, sum = 0; MPI_Status status;
        for (i = 0; i != numprocs; ++i)
        {
            ierr = MPI_Recv(&val, 1, MPI_DOUBLE, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
            if (ierr == MPI_SUCCESS)
            {
                cout << "Procid " << procid << " recieve value " << val;
                sum = sum + val;
            }
            else
                MPI_Abort(MPI_COMM_WORLD, 1);
        }
        cout << " The Total is " << sum << "\n";
    }
    ierr = MPI_Finalize();
}

I don't understand why this error happened.

1 Answer

How much clearer do you want an error message to be? It talks about a send to the local process. So that would be that first send in your code. Which is to process zero. Therefore "local process" refers to zero.

And apparently process zero is doing a send to itself without first doing a receive. Again, true. The problem is that a send can only succeed if there is a receive waiting for it. If there is no receive, like here, then the send will sit and wait forever. So your code deadlocks.

Ok, that's the theory. In practice, MPI implementations sometimes allow you to do a send in a deadlocking scenario and get no deadlock. That's called an "eager send". I conclude that your MPI does not allow eager sends, and @jjramsey's MPI does, which is why your problem is not totally reproducible.

Now for the solution. The error message says that you first need to do a receive, but a blocking receive posted first would also deadlock. So you have three options: 1. do not send to yourself, only copy the value locally; 2. post a bunch of MPI_Irecv calls before you do the sends; or 3. use MPI_Isend for the sends, so that they can come first.

Victor Eijkhout
  • There is a wrinkle in how the MPI standard defines a blocking send. As discussed in the section [Communication Modes](https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node57.htm), "it does not return until the message data and envelope have been safely stored away so that the sender is free to modify the send buffer." That does not *necessarily* require a matching receive to have been executed, so "eager send" is a valid but optional implementation. – jjramsey May 12 '22 at 17:55
  • @jjramsey point granted. – Victor Eijkhout May 12 '22 at 18:12
  • Regardless of the implementation, this program is not valid w.r.t. the MPI standard since it **might** block. FWIW, the right fix here is to use `MPI_Reduce()` – Gilles Gouaillardet May 12 '22 at 23:17
  • @GillesGouaillardet the program could also just print out `p*(p-1)/2` :-) – Victor Eijkhout May 12 '22 at 23:57
  • Maths definitely beat MPI :-) – Gilles Gouaillardet May 13 '22 at 03:58