2

I wanted to use boost::mpi communicators in my class, because I want my class to take care of all MPI calls. I used this style to make them static members of my class.

// works.cpp
// mpic++ -o works works.cpp -lboost_mpi
#include <boost/mpi.hpp>
#include <iostream>

class Example {
  static boost::mpi::environment env;
  static boost::mpi::communicator world; 
 public:
  Example() {
    std::cout << world.rank() << std::endl;
  }
};

boost::mpi::environment Example::env;
boost::mpi::communicator Example::world;

int main() {
  auto e = Example();
}

This works just fine, e.g., mpirun -n 4 ./works prints the numbers 0 to 3, as expected. I later wanted to template my class. I was at first worried about how to initialize my static member variables, but read this answer, which suggested it was all fine

// fails.cpp
// mpic++ -o fails fails.cpp -lboost_mpi
#include <boost/mpi.hpp>
#include <iostream>

template<typename T>
class Example {
  static boost::mpi::environment env;
  static boost::mpi::communicator world; 
 public:
  Example() {
    std::cout << world.rank() << std::endl;
  }
};

template <typename T>
boost::mpi::environment Example<T>::env;

template <typename T>
boost::mpi::communicator Example<T>::world;

int main() {
  auto e = Example<double>();
}

However, this in fact gives me

$ mpirun -n 4 ./fails
*** The MPI_Comm_rank() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
...

Something is going wrong here, but what? and why? I think I am misunderstanding a few things here.

innisfree
  • 2,044
  • 1
  • 14
  • 24

3 Answers3

2

Any particular reason you need a boost::mpi::environment instance in your classes? It is just a horrible OO wrapper around the currently singleton nature of MPI - you may initialise MPI only once by calling MPI_Init() and finalise it only once afterwards by calling MPI_Finalize() (at least, until the MPI sessions make it into the future versions of the standard). If you try to do anything related to MPI before MPI_Init(), you get an error (except for a handful of information query calls that work). If you try to do anything related to MPI after MPI_Finalize(), you get an error. If a single rank exits without calling MPI_Finalize() after calling MPI_Init(), the entire MPI job usually crashes as the launcher assumes something wrong has happened and kills the rest of the ranks.

The sole purpose of the boost::mpi::environment instance is to ensure that MPI was initialised and not left un-finalised and all that is controlled by the lifetime of the object. Therefore, you have to place an instance of it where that instance will predate and outlive any MPI activity in your program, and the best place for that is the main() function. Besides, although MPI-2 implementations are prevalent nowadays and you can safely use the parameterless variant of environment's constructor, it's better to use the one that takes argc and argv from main() for compatibility sake. Unless you are writing a library, of course.

Note that environment objects check if MPI was already initialised by the time they are created and if so, they won't finalise it when destroyed, but they will finalise it if they were to one to initialise it. So, you should not have two instances with non-overlapping lifetimes. Therefore, you have to ensure proper constructor and destructor call order when you deal with global instances of the class. So, keep it simple, and just create one instance in the main() function.

This is not the case with boost::mpi::communicator. It's default constructor merely wraps the MPI_COMM_WORLD communicator handle, which is a run-time constant referring to the singleton MPI world communicator. Since the wrapping mode of the default constructor is comm_attach, the instance will not try to free MPI_COMM_WORLD upon destruction, therefore you may have as many of those as you like and in any part of your code. Just keep in mind that most of the instance methods only work after initialisation and before finalisation of the MPI environment, which is what defines the lifetime of the actual world communicator object (the one in the MPI implementation, not the instance of boost::mpi::communicator).

You don't even need a static instance of communicator as along as you only need it for access to the world communicator. You are only saving one call to new and one to delete.

I would write your code simply like this:

#include <boost/mpi.hpp>
#include <iostream>

template<typename T>
class Example {
  boost::mpi::communicator world; 
public:
  Example() {
    std::cout << world.rank() << std::endl;
  }
};

int main(int argc, char **argv) {
  boost::mpi::environment(argc, argv);
  auto e = Example<double>();
}

It works as expected:

$ mpic++ -o works works.cc -lboost_mpi
$ mpiexec -n 4 ./works
2
0
3
1
Hristo Iliev
  • 72,659
  • 12
  • 135
  • 186
  • I agree. I s it only the env that is problematic? Constructing communicators elsewhere is OK? – innisfree May 07 '20 at 11:51
  • @innisfree There is a singleton world communicator `MPI_COMM_WORLD`, but it is not constructed by the `communicator` class - it is merely wrapped when the default constructor is called. Which means you cannot construct such objects before MPI was initialised since that's when `MPI_COMM_WORLD` gets its actual value (at least with Open MPI; it has a special constant value in MPICH-based implementations). – Hristo Iliev May 07 '20 at 11:59
  • @innisfree I stand corrected in regard to Open MPI - the `MPI_COMM_WORLD` is the address of a static structure, so its value won't change after initialisation. And indeed, the standard guarantees that the handles are constants. – Hristo Iliev May 07 '20 at 12:24
1

Both versions of your program cause undefined behaviour as told in boost::mpi documentation. The fact that first works is apparently an accident. What is wrong? Exactly this:

Declaring an mpi::environment at global scope is undefined behavior.

So in short, mpi::environment should be created at the start of main.

bartop
  • 9,971
  • 1
  • 23
  • 54
  • ah, so I should interpret global scope to include my private static member variable? – innisfree May 07 '20 at 08:49
  • It technically is not global scope, but since practical differences between `static` members and global variables are almost none, I would interprete it that way. Especially if it is told that `environment` uses `argc` and `argv` – bartop May 07 '20 at 08:59
  • Yeah, that makes sense, thanks. Any thoughts on the (partial) answer I've just written to myself would be appreciated. E.g., is the behaviour of that code still undefined? – innisfree May 07 '20 at 09:12
  • @bartop, the documentation is misleading. It doesn't use `argc` and `argv` unless the overloaded constructor that takes them is called. That has to do with pre-MPI-2 implementations where passing `NULL` to `MPI_Init()` is not allowed. – Hristo Iliev May 07 '20 at 12:51
0

Well, following this answer, it's possible by making separate instances of world and env inside main, but whether it's advisable, I don't know. E.g., this appears to work as expected:

#include <boost/mpi.hpp>
#include <iostream>

template<typename T>
class Example {
  static boost::mpi::environment env;
  static boost::mpi::communicator world; 
 public:
  Example() {
    std::cout << world.rank() << std::endl;
  }
};

template <typename T>
boost::mpi::environment Example<T>::env;

template <typename T>
boost::mpi::communicator Example<T>::world;

int main() {
  boost::mpi::environment env_in_main;
  boost::mpi::communicator world_in_main; 
  auto e = Example<double>();
}

Actually, though, most functionality can be achieved using the communicator world. There's little need to make env a class member, so perhaps the best solution is to keep world as a class member, but not env.

innisfree
  • 2,044
  • 1
  • 14
  • 24
  • I'd say it's still UB or at least usage of undocumented features which all in all can cause UB easily. I'd stick to what is documented – bartop May 07 '20 at 09:28