
I have a master application that spawns a worker, which in turn spawns two slaves. The slave application writes its output to stdout. My idea was to bind stdout to a different stream in the worker application so that I can store the slaves' output in a variable and send it to the master, which handles the output. However, the slaves' stdout is not redirected and still appears on the console, while the buffer in the worker application stays empty. Am I missing something, or is this not possible the way I am doing it? If so, any recommendations on how to handle this differently are greatly appreciated. I'm using Open MPI 1.6.5 on Gentoo, and here is the source code of my applications:

master.cpp

#include <mpi.h>
#include <iostream>

using namespace std;

int main(int argc, char *argv[])
{
    char appExe[] = "worker";
    char *appArg[] = {NULL};
    int maxProcs = 1;
    int myRank; 
    MPI_Comm childComm;
    int spawnError;

    // Initialize
    MPI_Init(&argc, &argv);

    // Rank 
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    // Spawn application (the root must be a valid rank in MPI_COMM_SELF, i.e. 0)
    MPI_Comm_spawn(appExe, appArg, maxProcs, MPI_INFO_NULL, 0, MPI_COMM_SELF, &childComm, &spawnError);

    // Receive length of message from worker
    int len;
    MPI_Recv(&len, 1, MPI_INT, 0, MPI_ANY_TAG, childComm, MPI_STATUS_IGNORE);
    // Receive actual message from worker
    char *buf = new char[len];
    MPI_Recv(buf, len, MPI_CHAR, 0, MPI_ANY_TAG, childComm, MPI_STATUS_IGNORE);
    cout << "master: Got the following from worker: " << buf << endl;
    // Release the receive buffer
    delete[] buf;

    // Finalize
    MPI_Finalize();

    return 0;
}

worker.cpp

#include "mpi.h"
#include <iostream>
#include <string>
#include <sstream>

using namespace std;

int main(int argc, char *argv[])
{
    char appExe[] = "slave";
    char *appArg[] = {NULL};
    const int maxProcs = 2;  // const, so that spawnError[maxProcs] is a standard C++ array
    int myRank, parentRank; 
    MPI_Comm childComm, parentComm;
    int spawnError[maxProcs];

    // Initialize
    MPI_Init(&argc, &argv);

    // Rank
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    // Get parent
    MPI_Comm_get_parent(&parentComm);

    // Bind stdout to new_buffer
    stringstream new_buffer;
    streambuf *old_buffer = cout.rdbuf(new_buffer.rdbuf());  

    // Spawn application (the root must be a valid rank in MPI_COMM_SELF, i.e. 0)
    MPI_Comm_spawn(appExe, appArg, maxProcs, MPI_INFO_NULL, 0, MPI_COMM_SELF, &childComm, spawnError);

    // Enter barrier
    MPI_Barrier(childComm);

    // Reset stdout to old_buffer
    cout.rdbuf(old_buffer);

    // Make a string
    string tmp = new_buffer.str();
    // Make a character array from string
    const char* cstr = tmp.c_str();
    cout << "worker: Got the following from slaves: " << cstr << endl;

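    // Send length of message to master (string length plus terminating '\0';
    // sizeof(cstr) would only give the size of the pointer)
    int len = static_cast<int>(tmp.size()) + 1;
    MPI_Send(&len, 1, MPI_INT, 0, 0, parentComm);
    // Send actual message: pass the character data itself, not the address of the pointer.
    // The const_cast keeps this valid for MPI-2 bindings, where the buffer is a non-const void*.
    MPI_Send(const_cast<char *>(cstr), len, MPI_CHAR, 0, 0, parentComm);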

    // Finalize
    MPI_Finalize();

    return 0;
}

slave.cpp

#include <mpi.h>
#include <iostream>

using namespace std;

int main(int argc, char *argv[])
{
    MPI_Comm parent;

    // Initialize
    MPI_Init(&argc, &argv);

    // Get parent
    MPI_Comm_get_parent(&parent);

    // Say hello
    cout << "slave: Hi there!" << endl;

    // Enter barrier
    if (parent != MPI_COMM_NULL)
        MPI_Barrier(parent);

    // Finalize
    MPI_Finalize();

    return 0;
}
Marcel

2 Answers


In MPI, spawned jobs live in the same "universe" as their parent and are usually started by the same application launcher that launched the initial MPI job. In Open MPI that launcher is orterun (mpiexec and mpirun are both symlinks to orterun). I/O redirection is performed by ORTE (the Open MPI run-time environment, part of the MPI library): it sends the standard output of each MPI process to orterun, which then merges everything and displays it on its console or saves it to a file if output redirection is in place. Unless a spawned job specifically writes its output to a file, the parent has no way to intercept that output.
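Independently of ORTE, swapping the buffer with cout.rdbuf() only re-targets the std::cout object inside the process that makes the call. A spawned process has its own std::cout, and even within one process anything written through C stdio bypasses the swapped buffer. The following is a minimal sketch of that behaviour; CoutRedirect, captureCout, viaCout and viaStdio are hypothetical names used only for illustration.

```cpp
#include <cstdio>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical RAII guard: points std::cout at `target` and restores the
// previous buffer when it goes out of scope.
struct CoutRedirect {
    explicit CoutRedirect(std::streambuf *target)
        : old_(std::cout.rdbuf(target)) {}
    ~CoutRedirect() { std::cout.rdbuf(old_); }
    std::streambuf *old_;
};

// Runs `fn` and returns whatever it wrote through std::cout.
template <typename Fn>
std::string captureCout(Fn fn) {
    std::ostringstream captured;
    {
        CoutRedirect guard(captured.rdbuf());
        fn();
    }
    return captured.str();
}

// Goes through the std::cout object, so the redirect sees it.
void viaCout() { std::cout << "written through std::cout\n"; }

// Goes through C stdio, so it bypasses the swapped stream buffer.
void viaStdio() { std::fputs("written through C stdio\n", stdout); }
```

Calling captureCout(viaCout) returns the text, while captureCout(viaStdio) returns an empty string and the text still reaches the real standard output. Output from a separate process is even further out of reach of such a redirect.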

Apart from files, the only MPI-compliant way to communicate between the parent job and the spawned jobs is MPI message passing. You can implement your own C++ input and output stream classes that use MPI messages to transmit data over the intercommunicator.
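A rough sketch of the output half of such a stream class is shown below. It is not tied to MPI: LineSinkBuf is a hypothetical stream buffer that collects characters and hands each completed line to a callback. In an MPI program that callback would be the place to call MPI_Send on the intercommunicator, as indicated in the comment. This only helps for code you can change, since the child process has to write through this stream.

```cpp
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical helper, not part of MPI or the standard library.
class LineSinkBuf : public std::streambuf {
public:
    typedef std::function<void(const std::string &)> Sink;

    explicit LineSinkBuf(Sink sink) : sink_(sink) {}

protected:
    // No put area is set up, so every character written arrives here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        const char c = traits_type::to_char_type(ch);
        if (c == '\n')
            emitLine();          // a line is complete: forward it
        else
            line_.push_back(c);  // keep collecting
        return ch;
    }

    // Called on flush: forward a pending partial line, if any.
    int sync() override {
        if (!line_.empty())
            emitLine();
        return 0;
    }

private:
    void emitLine() {
        // In an MPI program the sink could wrap something like
        //   MPI_Send(line.data(), line.size(), MPI_CHAR, 0, tag, parentComm);
        // (with a const_cast on the buffer for MPI-2 bindings).
        sink_(line_);
        line_.clear();
    }

    Sink sink_;
    std::string line_;
};
```

To use it, construct a std::ostream on top of a LineSinkBuf instance and write to that stream instead of std::cout. The receiving side then needs a matching loop of MPI_Probe or MPI_Recv calls to collect the lines.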

Hristo Iliev
  • Ok, I just wanted to make sure that there is no way for the parent to intercept the output of the spawned children. – Marcel Aug 13 '13 at 12:08

In general, using stdout/stderr for important output in a distributed application isn't the right way to go. It's difficult to enforce a useful ordering, so lines from different processes sometimes get jumbled together. It's usually far more effective to write the data to files that can be shared via NFS or moved around by a script, and to read it back from there. Then you know that the ordering is correct.
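When the output does end up in a file that is read repeatedly, the reader only needs the part appended since the last read. A minimal sketch, assuming the file only ever grows and the caller keeps one offset per file; readNewOutput is a hypothetical helper name:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns everything appended to `path` since `offset`
// and advances `offset` past the returned text.
std::string readNewOutput(const std::string &path, std::streamoff &offset) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        return std::string();  // file not there (yet): nothing new

    // Skip what earlier calls have already consumed.
    in.seekg(offset);

    // Read the remainder of the file in one go.
    std::string chunk((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    offset += static_cast<std::streamoff>(chunk.size());
    return chunk;
}
```

The caller starts with an offset of 0 and calls the function once per iteration. Each call reopens the file, which keeps the sketch simple at the cost of one open and close per read.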

Wesley Bland
  • Actually, my slave application is a huge third-party code and I do not have any influence on changing the output commands there. Is it possible to pass an argument to MPI_Comm_spawn such that the output of an application is redirected to a file? – Marcel Aug 09 '13 at 12:46
  • My master application implements a coupling procedure that communicates with different worker instances, which then call their slaves. The worker instances are called multiple times, but in a fixed order, from the master, so I thought binding stdout to a different stream would be the more natural solution. If I handle all output via files, I will have to keep track of the line numbers where the slaves stopped writing their last output and extract the output from the current iteration, i.e., open, read and close a file each time a worker is called. Isn't there a more elegant solution? – Marcel Aug 09 '13 at 13:03