Summary:
Augment your parallel program so that you can pass it either the terminal size, as say rows= and cols= command-line parameters, or the path to a terminal, as say a tty= command-line parameter.
example.c:
// SPDX-License-Identifier: CC0-1.0
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

/* Assumed terminal size; zero means unknown. */
static int rows = 0;
static int cols = 0;

int main(int argc, char *argv[])
{
    /*
     * Beginning of added code
     */

    /* Start from whatever standard output reports, if it is a terminal. */
    if (isatty(STDOUT_FILENO)) {
        struct winsize w;
        if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &w) == 0 && w.ws_row > 0 && w.ws_col > 0) {
            rows = w.ws_row;
            cols = w.ws_col;
        }
    }

    for (int arg = 1; arg < argc; arg++) {
        if (!strncmp(argv[arg], "tty=", 4)) {
            /* tty=PATH: copy the size of the named terminal. */
            const int fd = open(argv[arg] + 4, O_RDWR | O_NOCTTY);
            if (fd != -1) {
                struct winsize w = { .ws_row = 0, .ws_col = 0 };
                if (ioctl(fd, TIOCGWINSZ, &w) == 0 && w.ws_row > 0 && w.ws_col > 0) {
                    rows = w.ws_row;
                    cols = w.ws_col;
                    /* Propagate the size to our own pseudoterminal. */
                    if (isatty(STDOUT_FILENO)) {
                        ioctl(STDOUT_FILENO, TIOCSWINSZ, &w);
                    }
                }
                close(fd);
            }
        } else
        if (!strncmp(argv[arg], "rows=", 5)) {
            int val;
            char dummy;
            /* dummy catches trailing garbage: exactly one conversion
               means the value was a clean integer. */
            if (sscanf(argv[arg] + 5, "%d %c", &val, &dummy) == 1 && val > 0) {
                rows = val;
                if (isatty(STDOUT_FILENO) && rows > 0 && cols > 0) {
                    struct winsize w = { .ws_row = rows, .ws_col = cols };
                    ioctl(STDOUT_FILENO, TIOCSWINSZ, &w);
                }
            }
        } else
        if (!strncmp(argv[arg], "cols=", 5)) {
            int val;
            char dummy;
            if (sscanf(argv[arg] + 5, "%d %c", &val, &dummy) == 1 && val > 0) {
                cols = val;
                if (isatty(STDOUT_FILENO) && rows > 0 && cols > 0) {
                    struct winsize w = { .ws_row = rows, .ws_col = cols };
                    ioctl(STDOUT_FILENO, TIOCSWINSZ, &w);
                }
            }
        }
    }

    /*
     * End of added code
     */

    if (cols > 0 && rows > 0) {
        printf("Assuming terminal has %d columns and %d rows.\n", cols, rows);
    } else {
        printf("No assumptions about terminal size.\n");
    }

    if (isatty(STDOUT_FILENO)) {
        printf("Standard output is terminal %s", ttyname(STDOUT_FILENO));
        struct winsize w;
        if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &w) == 0) {
            printf(" reporting %d columns and %d rows.\n", w.ws_col, w.ws_row);
        } else {
            printf(".\n");
        }
    }

    sleep(5);
    return EXIT_SUCCESS;
}
Compile and run using for example
mpicc -Wall -O2 example.c -o example
mpirun -n 1 ./example tty=$(tty)
mpirun -n 1 ./example rows=$(tput lines) cols=$(tput cols)
The above program has each parallel process (only one in the commands above, although -n 2 and more work fine) report what it thinks it is writing its output to.
The tty=$(tty) form is nice shorthand when all processes run on this node (this same computer, this current OS). The rows=$(tput lines) cols=$(tput cols) form records the current terminal dimensions into parameters without any reference to a particular tty device, and will work even if the parallel processes are scattered over numerous machines.
Description of the situation:
mpirun is one of those binaries that can be provided by many different packages. Run realpath $(which mpirun) to see the actual binary it currently refers to on your system. On Debian-based systems, you can use dpkg-query -S $(realpath $(which mpirun)) to find which package currently provides it. I will assume it is either openmpi, or compatible with openmpi.
openmpi-bin version 2.1.1-8 creates a pseudoterminal for the standard output stream for every parallel process, but it sets their row and column counts to zero. Although it forwards the contents of these pseudoterminals to the output of the parent process, it does not handle or forward window size change notifications (SIGWINCH signals). It also does not respond to queries like Device Status Report ("\033[6n").
In essence, openmpi's mpirun (orterun) provides a pseudoterminal you can use, but it doesn't "maintain" it properly; just treats it as a fancy pipe. Not a problem, really, but does cause the side effects OP and others are observing.
Because of a number of practical and historical reasons, and this being MPI, it is safe to assume ANSI escape codes (CSI control sequences and extensions, for example cursor movement, color, clearing lines, and so on) work. No, there are no guarantees they do; I just haven't ever run MPI programs in parallel in an environment where they didn't. If the output goes to a file, the escape codes can be either "played back" by emitting the file to an xterm-like terminal emulator, or ripped out using a few carefully crafted sed expressions.
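As an alternative to sed, the "ripping out" can be sketched in C. This is only a minimal illustration: the function name strip_csi is made up for this example, and it handles only CSI sequences (ESC followed by '[', parameter bytes, intermediate bytes, and one final byte), not other escape sequences such as OSC window-title strings.

```c
#include <stddef.h>

/* Copy src to dst with CSI sequences removed; returns the new length.
 * dst must have room for at least as many bytes as src, including the
 * terminating NUL. strip_csi is a hypothetical helper name. */
size_t strip_csi(const char *src, char *dst)
{
    size_t n = 0;

    while (*src) {
        if (src[0] == '\033' && src[1] == '[') {
            src += 2;
            /* Parameter bytes: 0x30 to 0x3F (digits, ';', '?', ...). */
            while (*src >= 0x30 && *src <= 0x3F)
                src++;
            /* Intermediate bytes: 0x20 to 0x2F. */
            while (*src >= 0x20 && *src <= 0x2F)
                src++;
            /* The final byte, 0x40 to 0x7E, ends the sequence. */
            if (*src >= 0x40 && *src <= 0x7E)
                src++;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Reading the captured output line by line and passing each line through a function like this would leave only the plain text.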
Suggested solutions:
Run each parallel process in a separate xterm window using
mpirun -n 1 -xterm -1 ./example
Because each parallel process is connected to its own xterm instance, they will correctly respond to window size changes. And you can assume that xterm escape sequences (including most ANSI escape sequences, like colors, cursor movement, scrolling etc.) will work, even if you don't use curses or terminfo at all.
Run each parallel process using your own pseudoterminal or shim, with
PATH=$PWD:$PATH mpirun -n 1 -xterm -1 ./example
mpirun uses the first xterm executable in your PATH, executing it as xterm -T WindowTitle -e command args. In the above case, if there is an executable called xterm in the current directory, it will be used.
If you write your own shim or launcher script, you can ignore all command line arguments up to and including -e. Anything left is the command to be executed (your parallel process on that instance).
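The argument handling described above can be sketched in C. The function names command_after_e and run_wrapped are invented for this illustration, and the sketch simply executes the wrapped command directly; a real shim would more likely build a new argument list for whichever terminal emulator or multiplexer it prefers, which is not shown here.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Return the part of argv that follows the first "-e", or NULL if
 * there is no "-e" or nothing follows it. Hypothetical helper. */
char **command_after_e(int argc, char **argv)
{
    for (int i = 1; i < argc - 1; i++)
        if (!strcmp(argv[i], "-e"))
            return argv + i + 1;
    return NULL;
}

/* Replace the current process with the wrapped command.
 * Only returns (with -1) if there is no command or the exec fails. */
int run_wrapped(int argc, char **argv)
{
    char **cmd = command_after_e(argc, argv);

    if (!cmd)
        return -1;
    execvp(cmd[0], cmd);
    return -1;
}
```

A shim's main() would just call run_wrapped(argc, argv) and exit with a failure status if it returns. Compiled to an executable named xterm and placed first in PATH, it would receive the xterm -T WindowTitle -e command args invocation and run only the command part.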
You can use this to run a terminal emulator you prefer over xterm, probably even purely text-based ones like screen, or you can connect the standard output (and even standard input) to your own pseudoterminal multiplexer. Then again, you could just integrate such multiplexer magic into your own MPI program directly.
The xterm shim runs exactly the same way as your parallel processes: rank zero with standard input connected to a pipe that connects to the standard input of the mpirun command, and all other ranks with standard input connected to /dev/null; standard output connected to a pseudoterminal (one per parallel process) maintained by the mpirun parent process; and standard error connected via another pipe that connects to the standard error of the mpirun command.
(If you really dislike xterm, but have a favourite terminal emulator, let me know in a comment and I'll show a really simple Bash scriptlet, named xterm and placed in the same directory as your own program, that you can use to run your program in parallel with each process having its own terminal window.)
Pass information about the terminal expected to display the output to each parallel process via command-line arguments.
example.c above supports two methods: either via rows= and cols= command-line arguments specifying the size directly, or by specifying the terminal that will be used to display the result, tty=. The form tty=$(tty) uses the tty utility to emit the path to the terminal connected to its standard input (normally the terminal you are typing in), so it is the closest thing to "this terminal" you can pass, but it only works if the parallel processes run on this same machine and same OS instance.
The former (rows= and cols=) is useful if you have the processes distributed across several nodes (computers), because it literally only passes the terminal size and nothing else.
Because mpirun does not pass window size change events (SIGWINCH signals) to the parallel processes, they are stuck with the original terminal size, even if that terminal is later resized.
That doesn't mean your parallel processes cannot use, say, an MPI broadcast to update their common understanding of the output terminal dimensions. You could even wire up a SIGWINCH signal handler, so that a user-originating SIGWINCH signal carrying a single long as payload (sent via sigqueue()) makes the receiving process extract the new window size from the payload and broadcast it to all parallel processes. Then, you can run a simple background program on the same terminal, which reflects any SIGWINCH signals to all local children of mpirun processes running as the same user as you are (my-signal-reflector ./program & mpirun -n 1 ./program ; wait).
Most of the other schemes that immediately come to my mind can be folded into one of these three, more or less.
If you are running these over an SSH connection, consider running the mpirun command in a screen session, setting the virtual screen terminal size (using the screen width <cols> <rows> command), and passing those dimensions as command-line parameters similar to cols= and rows= in example.c. That way you can leave the parallel processes running, but detach from the mpirun sessions; even close the SSH connection, and come back to it later. When you connect to that screen session, if your actual window has a different size, the output doesn't get garbled; you only see part of the screen session. Overhead is minimal, and if users are allowed to run MPI sessions interactively, doing so exactly like this under screen is often warmly recommended. (Do ask your sysadmin; do not just assume that it is okay to do so on the cluster front-end. Usually there is a separate "not-that-front-end" for interactive sessions, so that the front-end can be reserved for compilation and data/result transfers, and not bogged down by superfluous parallel processes being run interactively.)