I was inspired by Святослав Павленко's answer: using blocking MPI communications to enforce serial-in-time output. That said, Wesley Bland has a point about MPI not being built for serial output. So if we want to output data, it makes sense to have each processor output its own (non-colliding) data. Alternatively, if the order of the data is important (and the data is not too big), the recommended approach is to send it all to one CPU (say rank 0), which then formats the data correctly.
To me, this seems a bit of overkill, especially when the data consists of variable-length strings, which is all too often what
std::cout << "a=" << some_variable << " b=" << some_other_variable
produces. So if we want some quick-and-dirty in-order printing, we can exploit Святослав Павленко's answer to build a serial output stream. This solution works fine, but its performance scales badly with many CPUs, so don't use it for data output!
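For comparison, the gather-to-rank-0 alternative mentioned above typically needs two collective calls when the strings have different lengths: MPI_Gather for the per-rank lengths, then MPI_Gatherv for the characters themselves. The MPI calls are left out here. The sketch below only shows, in plain C++, the bookkeeping rank 0 would have to do around them. The helper names displacements and split_gathered are made up for illustration and are not part of MPI.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: given the per-rank string lengths (as MPI_Gather
// would deliver them to rank 0), compute the offset of each rank's
// characters in the gathered buffer (the "displs" array MPI_Gatherv expects).
std::vector<int> displacements(const std::vector<int> & counts) {
    std::vector<int> displs(counts.size(), 0);
    for (std::size_t i = 1; i < counts.size(); ++i) {
        displs[i] = displs[i - 1] + counts[i - 1];
    }
    return displs;
}

// Hypothetical helper: cut the gathered character buffer back into one
// string per rank, in rank order, so rank 0 can print them in sequence.
std::vector<std::string> split_gathered(const std::string & buffer,
                                        const std::vector<int> & counts) {
    const std::vector<int> displs = displacements(counts);
    int total = 0;
    for (int c : counts) {
        total += c;
    }
    // The buffer must hold exactly the characters announced by the counts.
    assert(static_cast<std::size_t>(total) == buffer.size());

    std::vector<std::string> parts;
    parts.reserve(counts.size());
    for (std::size_t i = 0; i < counts.size(); ++i) {
        parts.push_back(buffer.substr(static_cast<std::size_t>(displs[i]),
                                      static_cast<std::size_t>(counts[i])));
    }
    return parts;
}
```

Even without the MPI calls, this is noticeably more machinery than a single operator<<, which is the overkill referred to above.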
#include <iostream>
#include <sstream>
#include <mpi.h>
MPI House-keeping:
int mpi_size;
int mpi_rank;
void init_mpi(int argc, char * argv[]) {
MPI_Init(& argc, & argv);
MPI_Comm_size(MPI_COMM_WORLD, & mpi_size);
MPI_Comm_rank(MPI_COMM_WORLD, & mpi_rank);
}
void finalize_mpi() {
MPI_Finalize();
}
General-purpose class which enables MPI message-chaining:
template<class T, MPI_Datatype MPI_T> class MPIChain{
// Uses a chained MPI message (T) to coordinate serial execution of code (the content of the message is irrelevant).
private:
T message_out; // The messages aren't really used here
T message_in;
int size;
int rank;
public:
void next(){
// Send message to next core (if there is one)
if(rank + 1 < size) {
// MPI_Send - Performs a standard-mode blocking send.
MPI_Send(& message_out, 1, MPI_T, rank + 1, 0, MPI_COMM_WORLD);
}
}
void wait(int & msg_count) {
// Waits for message to arrive. Message is well-formed if msg_count = 1
MPI_Status status;
// MPI_Probe - Blocking test for a message.
MPI_Probe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, & status);
// MPI_Get_count - Gets the number of top level elements.
MPI_Get_count(& status, MPI_T, & msg_count);
if(msg_count == 1) {
// MPI_Recv - Performs a standard-mode blocking receive.
MPI_Recv(& message_in, msg_count, MPI_T, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, & status);
}
}
MPIChain(T message_init, int c_rank, int c_size): message_out(message_init), size(c_size), rank(c_rank) {}
int get_rank() const { return rank;}
int get_size() const { return size;}
};
We can now use our MPIChain class to create a class which manages the output stream:
class ChainStream : public MPIChain<int, MPI_INT> {
// Uses the MPIChain class to implement an ostream-like wrapper with a serial operator<< implementation.
private:
std::ostream & s_out;
public:
ChainStream(std::ostream & os, int c_rank, int c_size)
: MPIChain<int, MPI_INT>(0, c_rank, c_size), s_out(os) {};
ChainStream & operator<<(const std::string & os){
if(this->get_rank() == 0) {
this->s_out << os;
// Initiate chain of MPI messages
this->next();
} else {
int msg_count;
// Wait until a message arrives (MPIChain::wait uses a blocking test)
this->wait(msg_count);
if(msg_count == 1) {
// If the message is well-formed (i.e. only one message is received): output string
this->s_out << os;
// Pass onto the next member of the chain (if there is one)
this->next();
}
}
// Ensure that the chain is resolved before returning the stream
MPI_Barrier(MPI_COMM_WORLD);
// Don't return the ostream! That would break the serial-in-time execution.
return *this;
};
};
Note the MPI_Barrier at the end of operator<<. This prevents the code from starting a second output chain before the current one has finished. Even though the barrier could be moved outside operator<<, I figured that I would put it here, since this is supposed to be serial output anyway.
Putting it all together:
int main(int argc, char * argv[]) {
init_mpi(argc, argv);
ChainStream cs(std::cout, mpi_rank, mpi_size);
std::stringstream str_1, str_2, str_3;
str_1 << "FIRST: " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;
str_2 << "SECOND: " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;
str_3 << "THIRD: " << "MPI_SIZE = " << mpi_size << " RANK = " << mpi_rank << std::endl;
cs << str_1.str() << str_2.str() << str_3.str();
// Equivalent to:
//cs << str_1.str();
//cs << str_2.str();
//cs << str_3.str();
finalize_mpi();
}
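The "Equivalent to" comment in main holds because operator<< returns *this, so a chained expression is just a sequence of separate calls, evaluated left to right. A minimal sketch of that behaviour, using a made-up stand-in class (RecordingStream is hypothetical and does no MPI work at all, it only records each call):

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for ChainStream: instead of printing and passing
// an MPI message along, it simply records every operator<< invocation.
struct RecordingStream {
    std::vector<std::string> calls;

    RecordingStream & operator<<(const std::string & s) {
        calls.push_back(s);
        return *this;  // returning *this is what makes chaining possible
    }
};

// Runs one chained expression and returns what was recorded.
std::vector<std::string> chained_calls() {
    RecordingStream rs;
    rs << "first" << "second" << "third";
    return rs.calls;
}
```

With the real ChainStream, each of those three calls runs through every rank and ends in a barrier, which is why all the FIRST lines appear before any SECOND line.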
Note that we build each of the strings str_1, str_2, str_3 in full before we send them to the ChainStream instance. Normally one would do something like:
std::cout << "a" << "b" << "c" << std::endl
but this applies operator<< from left to right, so each << would trigger its own pass through all the processes. We want each string to be ready for output before sequentially running through each process.
g++-7 -O3 -lmpi serial_io_obj.cpp -o serial_io_obj
mpirun -n 10 ./serial_io_obj
Outputs:
FIRST: MPI_SIZE = 10 RANK = 0
FIRST: MPI_SIZE = 10 RANK = 1
FIRST: MPI_SIZE = 10 RANK = 2
FIRST: MPI_SIZE = 10 RANK = 3
FIRST: MPI_SIZE = 10 RANK = 4
FIRST: MPI_SIZE = 10 RANK = 5
FIRST: MPI_SIZE = 10 RANK = 6
FIRST: MPI_SIZE = 10 RANK = 7
FIRST: MPI_SIZE = 10 RANK = 8
FIRST: MPI_SIZE = 10 RANK = 9
SECOND: MPI_SIZE = 10 RANK = 0
SECOND: MPI_SIZE = 10 RANK = 1
SECOND: MPI_SIZE = 10 RANK = 2
SECOND: MPI_SIZE = 10 RANK = 3
SECOND: MPI_SIZE = 10 RANK = 4
SECOND: MPI_SIZE = 10 RANK = 5
SECOND: MPI_SIZE = 10 RANK = 6
SECOND: MPI_SIZE = 10 RANK = 7
SECOND: MPI_SIZE = 10 RANK = 8
SECOND: MPI_SIZE = 10 RANK = 9
THIRD: MPI_SIZE = 10 RANK = 0
THIRD: MPI_SIZE = 10 RANK = 1
THIRD: MPI_SIZE = 10 RANK = 2
THIRD: MPI_SIZE = 10 RANK = 3
THIRD: MPI_SIZE = 10 RANK = 4
THIRD: MPI_SIZE = 10 RANK = 5
THIRD: MPI_SIZE = 10 RANK = 6
THIRD: MPI_SIZE = 10 RANK = 7
THIRD: MPI_SIZE = 10 RANK = 8
THIRD: MPI_SIZE = 10 RANK = 9