
I want to parallelize a numerical integration function. I want to use this function in the middle of a calculation, and the work before it should be done in the root process. Is this possible to do with MPI?

 double integral_count_MPI(double (*function)(double), double beginX, double endX, int count)
 {
    double step, result;
    int i;
    if (endX - beginX <= 0) return 0;
    step = (endX - beginX) / count;
    result = 0;

    double *input = (double*)malloc((count+1) *sizeof(double));
    for (i = 0; i <= count; i ++)
    {
       input[i] = beginX + i*step;
    }
    // Calculate and gather
 }

EDIT

algorithm:

1 process calculation;
while:
  1 process calculation;
  integration of a very complex function with many processes;
  1 process calculation;
end while;
1 process calculation;
T_T

2 Answers

3

MPI provides various means to build libraries that use it "behind the scenes". For starters, you can initialise MPI on demand. MPI-2 modified the requirements for calling MPI_Init so every compliant implementation should be able to correctly initialise with NULL arguments to MPI_Init (because the actual program arguments might not be available to the library). Since MPI should only be initialised once, the library must check if it was already initialised by calling MPI_Initialized. The code basically looks like this:

void library_init(void)
{
   int flag;

   MPI_Initialized(&flag);
   if (!flag)
   {
      MPI_Init(NULL, NULL);
      atexit(library_onexit);
   }
}

The initialisation code also registers an exit handler by calling atexit() from the C standard library. Within this exit handler it finalises the MPI library if it has not already been finalised. Failing to do so might result in mpiexec terminating the whole MPI job with a message that at least one process has exited without finalising MPI:

void library_onexit(void)
{
   int flag;

   MPI_Finalized(&flag);
   if (!flag)
      MPI_Finalize();
}

This arrangement allows you to write your integral_count_MPI function simply like:

double integral_count_MPI(...)
{
    library_init();

    ... MPI computations ...
}

integral_count_MPI will demand-initialise the MPI library on the first call. Later calls will not result in reinitialisation because of the way library_init is written. No explicit finalisation is necessary either; the exit handler will take care of it.
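For illustration, here is a minimal sketch (not part of the original answer) of how the body of integral_count_MPI might look on top of library_init(). The midpoint quadrature rule and the round-robin work distribution are assumptions made for this sketch, and MPI_Allreduce is used so that every rank ends up with the same result, which matters because every rank also executes the serial parts of the program:

double integral_count_MPI(double (*function)(double),
                          double beginX, double endX, int count)
{
    int rank, size, i;
    double step, local = 0.0, result = 0.0;

    library_init();                      /* demand-initialise MPI */

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (endX - beginX <= 0) return 0;
    step = (endX - beginX) / count;

    /* each rank sums every size-th subinterval (midpoint rule) */
    for (i = rank; i < count; i += size)
        local += function(beginX + (i + 0.5) * step) * step;

    /* combine the partial sums; every rank receives the total */
    MPI_Allreduce(&local, &result, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    return result;
}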

Note that you will still need to launch the code via the MPI process launcher (mpirun, mpiexec, etc.) and will have to be careful with I/O, since the serial part of the code executes in every instance. Many MPI-enabled libraries provide their own I/O routines for that purpose, which filter on the process rank and allow only rank 0 to perform the actual I/O, as sketched below. You can also use the dynamic process management facilities of MPI to spawn additional processes on demand, but that would require abstracting a huge portion of the process management into the library that performs the integration, which would make it quite complex (and the code of your main program would look awkward).
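As an illustration of such rank-filtered I/O, a hypothetical helper could look like the following (the name library_printf and its behaviour are assumptions for this sketch, not the API of any particular library):

#include <stdarg.h>
#include <stdio.h>

void library_printf(const char *format, ...)
{
    int rank;
    va_list args;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank != 0)
        return;                 /* only rank 0 performs the actual I/O */

    va_start(args, format);
    vprintf(format, args);
    va_end(args);
}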

Hristo Iliev
  • It's amazing!! I'll try this solution in the evening. Hope it works. – T_T Dec 16 '12 at 13:38
  • @T_T, also note another part of the beauty of this concept: the library would not install an exit handler if MPI was already initialised in the main program. It means that you can safely put your own `MPI_Init() ... MPI_Finalize()` calls in `main` and it would still function correctly (but not if you call `MPI_Finalize()` and then again call `integral_count_MPI`). – Hristo Iliev Dec 16 '12 at 13:46
  • Are there other side effects besides I/O? – T_T Dec 16 '12 at 16:49
  • One more question. The processes created with MPI continue executing the code outside the function. Can this be fixed? – T_T Dec 16 '12 at 17:15
  • @T_T, well, I/O is the only true side effect that a program can have, sans using some kernel magic to crash or trash the system :) You could either query the library for "master mode" (basically a boolean function that returns the comparison of the current rank with `0`) and skip parts of the code if not master. – Hristo Iliev Dec 16 '12 at 22:08
  • Another solution would be to write two completely separate programs - one for the master and one for the workers, then launch them like `mpiexec -n 1 master.exe : -n 20 worker.exe`. Rank 0 would execute `master.exe`, all the other twenty ranks would execute `worker.exe`. Data can be exchanged as usual using normal MPI messaging. This is the so-called MPMD (Multiple Programs Multiple Data) mode of MPI as opposed to the Single Program Multiple Data mode where all MPI processes execute the same program (i.e. have the same source code). – Hristo Iliev Dec 16 '12 at 22:10
  • This looks very nice! But I got a bit confused. I think the hard part is to do parallel computation inside a loop in `main`. Even using your code, it looks as if `main` is still something like the other answer to this question? Or I misunderstand anything? – xiaohuamao May 30 '19 at 07:39
  • @xiaohuamao My answer concerns libraries that make use of MPI behind the scenes in general. The point is to not have any MPI calls in your `main` function or anywhere else in the "driver" part of the program, which allows for a parallel library to be a drop-in replacement of a sequential one. – Hristo Iliev May 31 '19 at 19:51
  • Yes, that's right. But since we call MPI somewhere, we still need to specify `if (rank == 0) else ...` and modify things in other parts in `main` or elsewhere, right? It doesn't seem to be a simple drop-in replacement. Actually, I learned from [another answer of you](https://stackoverflow.com/a/40103367/3262115) to write my program for doing parallel computation in a subfunction called many times by a loop. Please kindly let me know if I misunderstood. – xiaohuamao May 31 '19 at 20:12
  • @xiaohuamao The point is to have all the `if (rank == 0) ... else ...` logic in the library and not expose the dependency on MPI in any other way besides the need to link with an MPI implementation library and use MPI's process launcher to start the program. From the user's perspective there is a call to `do_something()` and it somehow runs in parallel. – Hristo Iliev Jun 03 '19 at 10:58
  • Thanks for your reply. Probably what I mean is the following. Let's say we work in a single source code file with `main` and all other functions in it or maybe calling library is the same. If we use your codes presented here only in the routine that we want to parallelise and we modify nothing else in order to hide/encapsulate the MPI dependence, then for instance, codes in `main` will also run in parallel automatically, right? Instead, if we specify `if (rank == 0) else ...` and the like, then we force the originally serial part to run serially. Is this understanding correct? – xiaohuamao Jun 03 '19 at 17:59
  • I am a complete novice to MPI. Sometimes I have the concern that the former might underperform the latter because it does extra unnecessary, albeit parallel, things. But from your experience, maybe it's not the case and we don't need to worry about this? – xiaohuamao Jun 03 '19 at 17:59
  • @xiaohuamao, it boils down to programming style. If your computational logic is spread all over the program, it might require lots of changes, including to your `main` function, and the use of MPI won't be that transparent. If it is concentrated in a set of library functions, e.g., like in BLAS and LAPACK, and your program is mostly a "driver" feeding data in those library routines, then the changes can be well isolated and the use of MPI could be well contained to the point of being almost hidden. – Hristo Iliev Jun 04 '19 at 09:42
  • @xiaohuamao, of course, given that MPI programs are usually running multiple copies of the same executable, it might be necessary to have some MPI logic outside the library. For example, if your main function writes the results to a file, you most definitely don't want all copies writing uncoordinated to the same file. Thus, the answer to your question is, same as always, that it depends. – Hristo Iliev Jun 04 '19 at 09:46
  • @xiaohuamao, in high-performance computing it is common to dedicate a separate CPU core for each MPI rank, therefore having extra work done in parallel has low to no impact on the performance (but can greatly affect the energy consumption). But then again, if most of your computations are contained in some parallel kernel functions, the extra (repeated) serial work will be just a small fraction. – Hristo Iliev Jun 04 '19 at 10:47
  • Thank you so much for your kind and helpful clarification! – xiaohuamao Jun 04 '19 at 17:37
2

You can find the MPI documentation here

Basically, the logic is the following:

int main()
{
    MPI_Init(...);
    MPI_Comm_size(...);    //get the number of processes
    MPI_Comm_rank(...);    //get my rank

    if (rank == 0)     //master process
    {
        for (i = 1; i < n; i++)
            MPI_Send(...) //Send interval data specific to i process

        double result = 0;
        for (i = 1; i < n; i++)
        {
            double part_result;

            MPI_Recv(&part_result, ...) //Receive partial results from slaves

            result += part_result;
        }

        // Print result
    }
    else               //slave process
    {
        MPI_Recv(...)  //Receive interval data from master (rank 0 process)

        double result = integral_count_MPI(...);

        MPI_Send(...)  // Send results to master
    }

    MPI_Finalize();
}
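A fuller, compilable sketch of the same master/worker scheme might look like this (the concrete interval splitting, the example integrand f and the midpoint rule used here are illustrative assumptions, not part of the original answer):

#include <stdio.h>
#include <mpi.h>

static double f(double x) { return x * x; }      /* example integrand */

/* midpoint rule on [a, b] with n subintervals */
static double integrate(double (*fn)(double), double a, double b, int n)
{
    double step = (b - a) / n, sum = 0.0;
    int i;
    for (i = 0; i < n; i++)
        sum += fn(a + (i + 0.5) * step) * step;
    return sum;
}

int main(int argc, char **argv)
{
    int rank, size, i;
    const double a = 0.0, b = 1.0;               /* whole integration interval */
    const int count = 1000000;                    /* subintervals per worker */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2)                                 /* need at least one worker */
    {
        if (rank == 0)
            fprintf(stderr, "run with at least two processes\n");
        MPI_Finalize();
        return 1;
    }

    if (rank == 0)                                /* master: hand out pieces */
    {
        double width = (b - a) / (size - 1);
        double result = 0.0;

        for (i = 1; i < size; i++)
        {
            double bounds[2] = { a + (i - 1) * width, a + i * width };
            MPI_Send(bounds, 2, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
        }

        for (i = 1; i < size; i++)                /* collect partial results */
        {
            double part;
            MPI_Recv(&part, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            result += part;
        }
        printf("integral = %f\n", result);
    }
    else                                          /* worker: compute one piece */
    {
        double bounds[2];
        double part;

        MPI_Recv(bounds, 2, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        part = integrate(f, bounds[0], bounds[1], count);
        MPI_Send(&part, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}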
kaspersky
  • Thank you for your answer. I need to call this function more than once. Is it normal to initialize and finalize MPI several times? – T_T Dec 16 '12 at 11:36
  • No, it is not. Once you have called MPI_Finalize, no other MPI calls are allowed (save for a few exceptions, see the documentation). – suszterpatt Dec 16 '12 at 11:38
  • No. This is your program source code. When running `mpirun -np x ./program`, MPI will spawn x processes from it. So you design your code from one process's perspective, and you don't need to call integral_count_MPI multiple times (from one process). integral_count_MPI will be called in each process, but each process will call it with different interval parameters (those that you'll pass to them), so it will compute a specific part of the task. That's the whole point. – kaspersky Dec 16 '12 at 11:40
  • The documentation describes one-file examples. I have a complex program, and I can't initialize MPI in the main function. That's the problem. – T_T Dec 16 '12 at 11:43
  • Please describe what your complex program does. – kaspersky Dec 16 '12 at 11:54
  • @T_T, that kind of scenario isn't possible in MPI. Check this link: http://stackoverflow.com/questions/2015673/mpi-usage-in-windows/2291105#2291105. In MPI, several processes are created right from the start, and they run in parallel until the end, so you must rethink your program from this perspective and redesign the solution. However, there are other ways to parallelize only a portion of the program (threads, openmp, forks). – kaspersky Dec 16 '12 at 12:17
  • @T_T, I believe you can redesign your program. Just instead of "1 process calculation" use "if (rank == 0) { // process calculation }". – kaspersky Dec 16 '12 at 12:48