
In my MPI program, I need to write the results of some computation to a single (shared) file, where each MPI process writes its portion of the data at a different offset. Simple enough. I have implemented it as follows:

offset = rank * sizeof(double) * N;   /* start of this rank's block in the file */

for (i = 0; i < N; i++) {
    data = ...;                       /* the i-th value computed by this rank */
    MPI_File_write_at(file, offset, &data, 1, MPI_DOUBLE, &status);
    offset += sizeof(double);         /* advance past the value just written */
}

This is a bit simplified, as `data` is actually an array, but let's assume it is a single value for brevity. All the processes call this routine at the same time, and it works correctly. However, I am not sure whether this is the most efficient way to perform I/O with MPI. Would using `MPI_File_write_at_all` or `MPI_File_write_ordered` lead to better performance?
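If I understand the collective variant correctly, it would be a near drop-in replacement in the simplified loop above, something like the sketch below (assuming every rank really does execute exactly `N` iterations, since the call is collective):

/* Same simplified loop, but with the collective call: every rank in the
 * communicator must reach MPI_File_write_at_all() the same number of times. */
offset = rank * sizeof(double) * N;

for (i = 0; i < N; i++) {
    data = ...;   /* the per-iteration computation, as before */
    MPI_File_write_at_all(file, offset, &data, 1, MPI_DOUBLE, &status);
    offset += sizeof(double);
}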

Unfortunately, I have very limited time on the cluster (which has Lustre), so I cannot test all possible implementations extensively, and testing on my laptop will obviously not give me a good measure of I/O performance.

user3452579
  • In theory, the `_all` versions of the I/O routines should be favoured whenever possible. However, since these are collectives, you have to make sure that all processes in the communicator will reach them, or you risk a deadlock. In your case, for example, you have to make sure that all processes have the same loop count `N`. Now, that said, I'm sure there's a far better solution for your I/O than doing it in a loop like this. Setting a view of the file with `MPI_File_set_view()` and doing one single call to `MPI_File_write_all()` is likely the most effective way (a minimal sketch of this approach follows these comments). What do you really want to do? – Gilles Mar 20 '16 at 10:58
  • `N` is the same for all processes, so I can use the `_all` versions of the routines. I need the loop because `data` is computed in that loop. I can't move the I/O routines outside of the loop, because then `data` would take up too much memory. I can (and in fact do) find a middle ground by preparing several iterations of `data` and only writing to the file after a specified number of iterations (but I removed that part for brevity). OK, so we can conclude that collective I/O routines are preferable to non-collective ones, but what about `_all` vs. `_ordered`? Would that help even more? – user3452579 Mar 20 '16 at 11:23
  • I honestly never tried to use an MPI shared file pointer (although I certainly did quite a lot of MPI-IO), so I don't know how well it would behave. I'd say that, considering the "shared" and "ordered" aspects of it, I would suspect the complexity of maintaining a coherent view within the library might just get in the way of performance, but I might be completely wrong about that... Still, this wouldn't be (and as a matter of fact never has been) my preferred choice. Just give it a try and see what it gives you. – Gilles Mar 21 '16 at 08:30
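
For reference, here is a minimal sketch of the approach suggested in the comments: buffer a whole chunk of values, set a file view with `MPI_File_set_view()`, and issue a single collective `MPI_File_write_all()`. The file name, the value of `N`, and the "computation" are placeholders rather than the actual code from the question.

#include <mpi.h>
#include <stdlib.h>

#define N 1024   /* assumed identical on every rank, as required for collectives */

int main(int argc, char **argv)
{
    int rank, i;
    double *data;
    MPI_File file;
    MPI_Offset disp;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Buffer the whole chunk instead of writing one value per iteration. */
    data = malloc(N * sizeof(double));
    for (i = 0; i < N; i++)
        data[i] = rank * N + i;   /* placeholder for the real computation */

    MPI_File_open(MPI_COMM_WORLD, "results.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &file);

    /* Each rank views the file starting at its own byte displacement; with
       etype and filetype both MPI_DOUBLE the view is simply contiguous. */
    disp = (MPI_Offset) rank * N * sizeof(double);
    MPI_File_set_view(file, disp, MPI_DOUBLE, MPI_DOUBLE,
                      "native", MPI_INFO_NULL);

    /* One collective write per rank; the library can aggregate the I/O. */
    MPI_File_write_all(file, data, N, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&file);
    free(data);
    MPI_Finalize();
    return 0;
}

On a Lustre-backed file system, a single collective write also gives the MPI-IO layer a chance to use collective buffering (a subset of aggregator ranks issuing large writes), which is usually where the gain over many small independent `MPI_File_write_at` calls comes from.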

0 Answers