I tried to run an MPI program, but it failed with this message:

    job aborted:
    [ranks] message
    [0] process exited without calling finalize
    [1-3] terminated

The error analysis reports exit code 0xc0000005.

I googled it, and someone suggested using MPI_Init_thread instead, but with that change the exit code was 255.

How can I fix this? What is wrong with the rank 0 process?

Here is the code fragment using MPI to send and receive data:

    // MPI things
    MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
    // master
    if (taskid == 0)
    {
        //printf("taskid: %d", taskid);
        average = Nchunk / Nworkers;
        extra = Nchunk % Nworkers;
        mtype = FROM_MASTER;
        offset = 0;

        // store volume[Itemp[n]]
        for (int i = 0; i < Nchunk; i++)
        {
            volumeTemp[i] = volume[Itemp[i]];
        }

        // send to slave
        for (int dest = 1; dest <= Nworkers; dest++)
        {

            Nelements = (dest <= extra) ? average + 1 : average;
            MPI_Send(&Nelements, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
            MPI_Send(&offset, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
            MPI_Send(&Itemp[offset], Nelements, MPI_INT, dest, mtype, MPI_COMM_WORLD);
            MPI_Send(&SMtemp[offset], Nelements, MPI_FLOAT, dest, mtype, MPI_COMM_WORLD);
            MPI_Send(&volumeTemp[offset], Nelements, MPI_FLOAT, dest, mtype, MPI_COMM_WORLD);
            offset = offset + Nelements;
        }


        // receive result from slave
        mtype = FROM_WORKERS;
        for (int source = 1; source <= Nworkers; source++)
        {
            //MPI_Recv(&average, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
            //MPI_Recv(&offset, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
            MPI_Recv(&sinogram[ns], 1, MPI_FLOAT, source, mtype, MPI_COMM_WORLD, &status);
        }


    }
    //printf("taskid: %d", taskid);

    // slave
    if (taskid > 0)
    {
        //printf("taskid: %d", taskid);
        mtype = FROM_MASTER;
        MPI_Recv(&Nelements, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD, &status);
        MPI_Recv(&offset, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD, &status);
        MPI_Recv(&Itemp[offset], Nelements, MPI_INT, MASTER, mtype, MPI_COMM_WORLD, &status);
        MPI_Recv(&SMtemp[offset], Nelements, MPI_INT, MASTER, mtype, MPI_COMM_WORLD, &status);
        MPI_Recv(&volumeTemp, Nelements, MPI_FLOAT, MASTER, mtype, MPI_COMM_WORLD, &status);

        for (int i = 0; i < average; i++)
        {
            if (fabs(volumeTemp[i]) > 1.0e-14)
                sinogram[ns] = sinogram[ns] + volumeTemp[i] * SMtemp[i];
        }

        //send to master
        mtype = FROM_WORKERS;
        MPI_Send(&sinogram[ns], 1, MPI_FLOAT, MASTER, mtype, MPI_COMM_WORLD, &status);
    }
Heran

1 Answer

The exit codes of MPI rarely mean anything since you have multiple processes that are all returning their own error codes. It's much more helpful to rely on the error messages that the program spits out. Luckily for you, your program did!

[0] process exited without calling finalize

That could mean one of two things:

  1. Your program finished, but didn't call MPI_Finalize. That's a pretty easy fix. Check to make sure that everywhere your program can terminate normally, it calls MPI_Finalize. That may or may not be your problem though...
  2. Your program terminated abnormally. This is usually harder to track down and will probably require some of the usual MPI debugging tricks. We probably can't diagnose it here unless your code is trivially small or you follow the guidelines on creating a good example.
Wesley Bland
  • Thank you. I'll try to debug. I do MPI_Init and MPI_Finalize in main() and do MPI things in another function which is called in main. – Heran Oct 18 '15 at 20:03