
This is a follow-up to a previous question of mine, where the conclusion was that the program was erroneous and its behavior was therefore undefined.

What I'm trying to create here is a simple error-handling mechanism. I use the Irecv request for the empty message as an "abort handle": I attach it to my normal MPI_Wait call (turning it into MPI_Waitany) so that process 1 can be unblocked if an error occurs on process 0 and it can no longer reach the point where it is supposed to post the matching MPI_Recv.

What's happening is that, due to internal message buffering, the MPI_Isend may complete right away, before the other process has posted the matching MPI_Recv. At that point it can no longer be cancelled.

I was hoping that once all processes call MPI_Comm_free I could forget about that message once and for all, but, as it turns out, that's not the case. Instead, it's delivered to the receive posted on the next communicator.

So my questions are:

  1. Is this also an erroneous program, or is it a bug in the MPI implementation (Intel MPI 4.0.3)?
  2. If I turn my MPI_Isend calls into MPI_Issend, the program works as expected - can I at least in that case rest assured that the program is correct?
  3. Am I reinventing the wheel here? Is there a simpler way to achieve this?

Again, any feedback is much appreciated!


#include <stdio.h>
#include <unistd.h>
#include <mpi.h>
#include <time.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    int rank, size;
    MPI_Group group;
    MPI_Comm my_comm;

    srand(time(NULL));
    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_group(MPI_COMM_WORLD, &group);

    MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
    if (rank == 0) printf("created communicator %d\n", my_comm);

    if (rank == 1) {
        MPI_Request req[2];
        int msg = 123, which;

        MPI_Isend(&msg, 1, MPI_INT, 0, 0, my_comm, &req[0]);
        MPI_Irecv(NULL, 0, MPI_INT, 0, 0, my_comm, &req[1]);

        MPI_Waitany(2, req, &which, MPI_STATUS_IGNORE);

        MPI_Barrier(my_comm);

        if (which == 0) {
            printf("rank 1: send succeed; cancelling abort handle\n");
            MPI_Cancel(&req[1]);
            MPI_Wait(&req[1], MPI_STATUS_IGNORE);
        } else {
            printf("rank 1: send aborted; cancelling send request\n");
            MPI_Cancel(&req[0]);
            MPI_Wait(&req[0], MPI_STATUS_IGNORE);
        }
    } else {
        MPI_Request req;
        int msg, r = rand() % 2;
        if (r) {
            printf("rank 0: receiving message\n");
            MPI_Recv(&msg, 1, MPI_INT, 1, 0, my_comm, MPI_STATUS_IGNORE);
        } else {
            printf("rank 0: sending abort message\n");
            MPI_Isend(NULL, 0, MPI_INT, 1, 0, my_comm, &req);
        }

        MPI_Barrier(my_comm);

        if (!r) {
            MPI_Cancel(&req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }
    }

    if (rank == 0) printf("freeing communicator %d\n", my_comm);
    MPI_Comm_free(&my_comm);

    sleep(2);

    MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
    if (rank == 0) printf("created communicator %d\n", my_comm);

    if (rank == 0) {
        MPI_Request req;
        MPI_Status status;
        int msg, cancelled;

        MPI_Irecv(&msg, 1, MPI_INT, 1, 0, my_comm, &req);
        sleep(1);

        MPI_Cancel(&req);
        MPI_Wait(&req, &status);
        MPI_Test_cancelled(&status, &cancelled);

        if (cancelled) {
            printf("rank 0: receive cancelled\n");
        } else {
            printf("rank 0: OLD MESSAGE RECEIVED!!!\n");
        }
    }

    if (rank == 0) printf("freeing communicator %d\n", my_comm);
    MPI_Comm_free(&my_comm);

    MPI_Finalize();
    return 0;
}

outputs:

created communicator -2080374784
rank 0: sending abort message
rank 1: send succeed; cancelling abort handle
freeing communicator -2080374784
created communicator -2080374784
rank 0: OLD MESSAGE RECEIVED!!!
freeing communicator -2080374784
i.adri
  • The MPI standard says that when an MPI_Isend is canceled, it must either complete normally or not be received at all on the receiving side. Since your message is apparently small enough to be delivered immediately, the MPI_Cancel is ignored by the implementation. That is why you are seeing it have no effect. – kraffenetti May 27 '14 at 16:09
  • I assume you're referring to the last `MPI_Cancel` call. It would have been totally fine to receive that message if it was in the same communicator, but, as you can see, that's a different one. – i.adri May 28 '14 at 08:13
  • Your program is still erroneous if it has unmatched sends/recvs. – kraffenetti May 28 '14 at 12:25
  • possible duplicate of [MPI message received in different communicator](http://stackoverflow.com/questions/23807958/mpi-message-received-in-different-communicator) – Wesley Bland May 28 '14 at 18:58

2 Answers

As @kraffenetti mentioned in one of the comments above, this is an erroneous program because the sent messages are not matched by receives. Even if a send is cancelled, it still needs a matching receive on the remote side, because the cancel may not succeed: the message may already have been sent before the cancel could complete (which is the case here).

This question started a discussion on an MPICH ticket, which has more details.

Wesley Bland
  • Ok, thanks for answering the first part of my question. What I'd also like to know is whether your statement above still holds true if I replace `MPI_Isend` with `MPI_Issend`? Or is the program correct in that case? – i.adri May 29 '14 at 08:41
  • Yes. However, I think it will run anyway. – Wesley Bland May 29 '14 at 12:11
  • So are you suggesting that even [this very simple MPI program](http://pastebin.com/5mJ8Dyqz) is erroneous and its expected behaviour is undefined? If so, then I'm totally confused - what is the purpose of MPI_Cancel() then? Could you please point me to the relevant section of the MPI standard, or any other resource that should clarify this for me? I would really appreciate it! – i.adri May 29 '14 at 20:07
  • I'm not sure I can point to one. You might be right that doing an `MPI_ISSEND` would be correct. It would certainly be sufficient. – Wesley Bland May 30 '14 at 22:03

I tried to build your code using Open MPI and it did not compile. mpicc complained about status.cancelled:

  error: ‘MPI_Status’ has no member named ‘cancelled’

I suppose this field is specific to Intel MPI. What happens if you switch to:

    ...
    int flag;
    MPI_Test_cancelled(&status, &flag);
    if (flag) {
    ...

This gives the expected output using Open MPI (and it makes your code less dependent on the implementation). Is that also the case with Intel MPI?

We need an expert to tell us what status.cancelled is in Intel MPI, because I don't know anything about it!

Edit: I tested my answer many times and found that the output was random, sometimes correct, sometimes not. Sorry about that... It looks as if something in status was not set. Part of the answer may be in the documentation of MPI_Wait(), http://www.mpich.org/static/docs/v3.1/www3/MPI_Wait.html:

"The MPI_ERROR field of the status return is only set if the return from the MPI routine is MPI_ERR_IN_STATUS. That error class is only returned by the routines that take an array of status arguments (MPI_Testall, MPI_Testsome, MPI_Waitall, and MPI_Waitsome). In all other cases, the value of the MPI_ERROR field in the status is unchanged. See section 3.2.5 in the MPI-1.1 specification for the exact text." If MPI_Test_cancelled() relies on the MPI_ERROR field, things might go wrong.

So here is the trick: use MPI_Waitall(1, &req, &status)! The output is correct at last!

francis
  • Thanks for your remark! I updated the code to use MPI_Test_cancelled(). Unfortunately, it doesn't make any difference - the behavior is still the same. – i.adri May 27 '14 at 07:12
  • I tried the trick you suggested, but it doesn't seem to change anything for me. – i.adri May 28 '14 at 08:14