
Getting the terminal width from a C/C++ program has already been explained extensively on SO, e.g. here.

However, when launching the same code with MPI it returns 0 for the number of lines and columns (let foo.c be the source file containing the code from the answer above):

# gcc foo.c
# mpirun -n 1 ./a.out
lines 0
columns 0

Is there a way to get the width of the terminal that is connected to the stdout of MPI?

I would like to avoid dragging in an extra library like ncurses for this. Furthermore, I'm primarily interested in answers for Linux.

Seriously
  • Did you read the comments below the linked answer? Specifically [troglobit](https://stackoverflow.com/users/1708249/troglobit)'s [comment](https://stackoverflow.com/questions/1022957/getting-terminal-width-in-c/1022961#comment115662358_1022961) mentions a few fallback mechanisms. Is `stdout` a TTY in your use case? I suggest some error handling: `if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &w) < 0) perror("ioctl");` – Bodo Jul 02 '21 at 11:56
  • Generally speaking, you cannot assume (rank 0 of) a MPI application is connected to a tty. – Gilles Gouaillardet Jul 02 '21 at 13:05
  • @Bodo Yes I did. I tested `stdout`, `stderr`, and `stdin`. None of them worked from the MPI context which is why I posted this question. – Seriously Jul 02 '21 at 13:53
  • @GillesGouaillardet True, however, I am not interested in a general case, just the case I presented in the question where the output obviously ends up on a `tty`. However, it seems not to be directly connected to the rank otherwise the linked method would work. – Seriously Jul 02 '21 at 13:56
  • @Seriously The comment does not suggest to try other file descriptors. Citation: "... `getenv("COLUMNS")` works perfectly when running under watch(1). So now I have a set of fallbacks, all from the `TIOCWINSZ` ioctl, to *getenv if not a tty*, down to the classic ANSI escape move cursor to 9999,9999 and then query cursor pos..." Please add the error handling for `ioctl` and either show the resulting output in your question or state that you don't get an error message. – Bodo Jul 02 '21 at 13:59
  • I just wanted to edit my comment that I also tested `getEnv("COLUMNS")` and it did not work :D . As you surely saw in the [comment above the one you linked](https://stackoverflow.com/questions/1022957/getting-terminal-width-in-c/1022961#comment100591763_1022961) Alexis correctly mentions that these environment variables are not exported. – Seriously Jul 02 '21 at 14:03
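The fallback chain described in the comments (the ioctl first, then the environment) can be sketched roughly as follows. This is only an illustration: `terminal_columns` and its `fallback` parameter are hypothetical names, not part of the linked answer. The `COLUMNS` step only helps if that variable is actually exported into the process environment, which shells do not do by default.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: best-effort column count for file descriptor fd.
 * Tries the TIOCGWINSZ ioctl first, then the COLUMNS environment
 * variable, and finally returns the caller-supplied fallback. */
static int terminal_columns(int fd, int fallback)
{
    struct winsize w;

    if (ioctl(fd, TIOCGWINSZ, &w) == 0) {
        if (w.ws_col > 0)
            return w.ws_col;
        /* The ioctl succeeded but reported zero columns, as under mpirun. */
    } else {
        perror("ioctl(TIOCGWINSZ)");
    }

    const char *env = getenv("COLUMNS");
    if (env != NULL) {
        char *end;
        long val = strtol(env, &end, 10);
        /* Accept only a clean positive integer that fits ws_col. */
        if (end != env && *end == '\0' && val > 0 && val <= 65535)
            return (int)val;
    }

    return fallback;
}
```

Calling `terminal_columns(STDOUT_FILENO, 80)` would then give a usable width even when standard output is not a terminal or reports a zero size.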

2 Answers


Summary:

Augment your parallel program so that you can pass it either the terminal size, as say rows= and cols= command-line parameters, or the path to a terminal, as say a tty= command-line parameter.

example.c:

// SPDX-License-Identifier: CC0-1.0
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

static int  rows = 0;
static int  cols = 0;

int main(int argc, char *argv[])
{
    /*
     * Beginning of added code
    */
    if (isatty(STDOUT_FILENO)) {
        struct winsize  w;
        if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &w) == 0 && w.ws_row > 0 && w.ws_col > 0) {
            rows = w.ws_row;
            cols = w.ws_col;
        }
    }
    for (int arg = 1; arg < argc; arg++) {
        if (!strncmp(argv[arg], "tty=", 4)) {
            const int  fd = open(argv[arg] + 4, O_RDWR | O_NOCTTY);
            if (fd != -1) {
                struct winsize  w = { .ws_row = 0, .ws_col = 0 };
                if (ioctl(fd, TIOCGWINSZ, &w) == 0 && w.ws_row > 0 && w.ws_col > 0) {
                    rows = w.ws_row;
                    cols = w.ws_col;
                    if (isatty(STDOUT_FILENO)) {
                        ioctl(STDOUT_FILENO, TIOCSWINSZ, &w);
                    }
                }
                close(fd);
            }
        } else
        if (!strncmp(argv[arg], "rows=", 5)) {
            int  val;
            char dummy;
            /* "%d %c" converts exactly 1 item only when nothing but whitespace follows the number */
            if (sscanf(argv[arg] + 5, "%d %c", &val, &dummy) == 1 && val > 0) {
                rows = val;
                if (isatty(STDOUT_FILENO) && rows > 0 && cols > 0) {
                    struct winsize  w = { .ws_row = rows, .ws_col = cols };
                    ioctl(STDOUT_FILENO, TIOCSWINSZ, &w);
                }
            }
        } else
        if (!strncmp(argv[arg], "cols=", 5)) {
            int  val;
            char dummy;
            if (sscanf(argv[arg] + 5, "%d %c", &val, &dummy) == 1 && val > 0) {
                cols = val;
                if (isatty(STDOUT_FILENO) && rows > 0 && cols > 0) {
                    struct winsize  w = { .ws_row = rows, .ws_col = cols };
                    ioctl(STDOUT_FILENO, TIOCSWINSZ, &w);
                }
            }
        }
    }
    /*
     * End of added code
    */

    if (cols > 0 && rows > 0) {
        printf("Assuming terminal has %d columns and %d rows.\n", cols, rows);
    } else {
        printf("No assumptions about terminal size.\n");
    }

    if (isatty(STDOUT_FILENO)) {
        printf("Standard output is terminal %s", ttyname(STDOUT_FILENO));

        struct winsize  w;
        if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &w) == 0) {
            printf(" reporting %d columns and %d rows.\n", w.ws_col, w.ws_row);
        } else {
            printf(".\n");
        }
    }

    sleep(5);

    return EXIT_SUCCESS;
}

Compile and run using for example

mpicc -Wall -O2 example.c -o example
mpirun -n 1 ./example tty=$(tty)
mpirun -n 1 ./example rows=$(tput lines) cols=$(tput cols)

The above program has each parallel process report what it thinks it is outputting to. The commands above start only one process, but -n 2 or more works just as well.

The tty=$(tty) form is nice shorthand when all processes run on this node (this same computer, this current OS). The rows=$(tput lines) cols=$(tput cols) form records the current terminal dimensions into parameters without any reference to a particular tty device, and will work even if the parallel processes are scattered over numerous machines.


Description of the situation:

mpirun is one of those binaries that can be provided by many different packages. Run realpath $(which mpirun) to see the actual binary it currently refers to on your system. On Debian-based systems, you can use dpkg-query -S $(realpath $(which mpirun)) to find which package currently provides it. I will assume it is either openmpi, or compatible with openmpi.

openmpi-bin version 2.1.1-8 creates a pseudoterminal for the standard output stream for every parallel process, but it sets their row and column counts to zero. Although it forwards the contents of these pseudoterminals to the output of the parent process, it does not handle or forward window size change notification (SIGWINCH signal). It also does not respond to queries like Device Status Report ("\033[6n").

In essence, openmpi's mpirun (orterun) provides a pseudoterminal you can use, but it doesn't "maintain" it properly; just treats it as a fancy pipe. Not a problem, really, but does cause the side effects OP and others are observing.
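To make the Device Status Report query mentioned above concrete, here is a rough sketch of what such a query looks like from the program's side. The function names and the timeout parameter are illustrative assumptions. On a real terminal the reply has the form ESC [ row ; col R; on the pseudoterminal provided by mpirun no reply arrives, so the query simply times out.

```c
#include <poll.h>
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

/* Parse a cursor position report of the form ESC [ <row> ; <col> R.
 * Returns 0 and stores the values on success, -1 on malformed input. */
static int parse_cursor_report(const char *reply, int *row, int *col)
{
    int r, c;
    char final;

    if (sscanf(reply, "\033[%d;%d%c", &r, &c, &final) != 3 || final != 'R')
        return -1;
    if (r <= 0 || c <= 0)
        return -1;
    *row = r;
    *col = c;
    return 0;
}

/* Send "\033[6n" to the terminal open read/write on fd and wait up to
 * timeout_ms per byte for the reply. Returns 0 on success, -1 if fd is
 * not a terminal, nothing answers in time, or the reply is malformed. */
static int query_cursor_position(int fd, int timeout_ms, int *row, int *col)
{
    struct termios saved, raw;
    char buf[32];
    size_t len = 0;
    int result = -1;

    if (tcgetattr(fd, &saved) != 0)
        return -1;                      /* not a terminal */
    raw = saved;
    raw.c_lflag &= ~(ICANON | ECHO);    /* deliver the reply unbuffered, unechoed */
    if (tcsetattr(fd, TCSANOW, &raw) != 0)
        return -1;

    if (write(fd, "\033[6n", 4) == 4) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        while (len < sizeof buf - 1 && poll(&pfd, 1, timeout_ms) > 0) {
            if (read(fd, buf + len, 1) != 1)
                break;
            if (buf[len++] == 'R')      /* final byte of the report */
                break;
        }
        buf[len] = '\0';
        result = parse_cursor_report(buf, row, col);
    }

    tcsetattr(fd, TCSANOW, &saved);     /* restore the original settings */
    return result;
}
```

Moving the cursor far to the bottom right before sending the query turns the reported position into the terminal size, which is the classic trick quoted in the comments under the question.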

Because of a number of practical and historical reasons, and this being MPI, it is safe to assume ANSI escape codes (CSI control sequences and extensions, for example cursor movement, color, clearing lines, and so on) work. No, there are no guarantees they do; I just haven't ever run MPI programs in parallel in an environment where they didn't. If the output goes to a file, they can be either "played back" by emitting the file to an xterm-like terminal emulator, or ripped out using a few carefully crafted sed expressions.
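For illustration, the "ripping out" step can also be done in C rather than sed. The sketch below handles only CSI sequences (ESC [ followed by parameter, intermediate, and one final byte); other escape types, such as OSC title sequences, are left untouched. `strip_csi` is a hypothetical helper name.

```c
#include <stddef.h>

/* Remove CSI sequences from the NUL-terminated string s, in place.
 * Returns the length of the resulting string. */
static size_t strip_csi(char *s)
{
    char *src = s, *dst = s;

    while (*src != '\0') {
        if (src[0] == '\033' && src[1] == '[') {
            src += 2;
            /* parameter bytes 0x30-0x3F (digits, ';', '?', ...) */
            while (*src >= 0x30 && *src <= 0x3F)
                src++;
            /* intermediate bytes 0x20-0x2F */
            while (*src >= 0x20 && *src <= 0x2F)
                src++;
            /* one final byte 0x40-0x7E ends the sequence */
            if (*src >= 0x40 && *src <= 0x7E)
                src++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

Applied line by line to a captured log, this leaves just the plain text that the colors and cursor movements were decorating.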


Suggested solutions:

  1. Run each parallel process in a separate xterm window using mpirun -n 1 -xterm -1 ./example

    Because each parallel process is connected to its own xterm instance, they will correctly respond to window size changes. And you can assume that xterm escape sequences (including most ANSI escape sequences, like colors, cursor movement, scrolling etc.) will work, even if you don't use curses or terminfo at all.

  2. Run each parallel process using your own pseudoterminal or shim using PATH=$PWD:$PATH mpirun -n 1 -xterm -1 ./example

    mpirun uses the first xterm executable in your PATH, executing it as xterm -T WindowTitle -e command args. In the above case, if there is an executable called xterm in the current directory, it will be used.

    If you write your own shim or launcher script, you can ignore all command line arguments up to and including -e. Anything left is the command to be executed (your parallel process on that instance.)

    You can use this to run a terminal emulator you prefer over xterm, probably even purely text-based ones like screen, or you can connect the standard output (and even standard input) to your own pseudoterminal multiplexer. Then again, you could just integrate such multiplexer magic into your own MPI program directly.

    The xterm shim runs exactly the same way as your parallel processes: rank zero with standard input connected to a pipe that connects to the standard input of the mpirun command, and all other ranks with standard input connected to /dev/null; standard output connected to a pseudoterminal (one per parallel process) maintained by the mpirun parent process; and standard error connected via another pipe that connects to the standard error of the mpirun command.

    (If you really dislike xterm, but have a favourite terminal emulator, let me know in a comment and I'll show a really simple Bash scriptlet, named xterm and placed in the same directory as your own program, that you can use to run your program in parallel with each process having its own terminal window.)

  3. Pass information about the terminal expected to display the output to each parallel process via command-line arguments.

    example.c above supports two methods: either via rows= and cols= command-line arguments specifying it directly, or by specifying the terminal that will be used to display the result, tty=. The form tty=$(tty) uses the tty utility to emit the path to the controlling terminal of that session, so is the closest thing to "this terminal" you can pass, but only works if the parallel processes run on this same machine and same OS instance.

    The former is useful if you have the processes distributed across several nodes (computers), because it literally only passes the terminal size and nothing else.

    Because mpirun does not pass window size change events (SIGWINCH signals) to the parallel processes, they are stuck with the original terminal size: if the terminal is resized later, nothing tells them.

    That doesn't mean your parallel processes cannot use, say, an MPI broadcast to update their common understanding of the output terminal dimensions. You could even wire up a SIGWINCH signal handler so that, for user-originated SIGWINCH signals carrying the new window size as their payload (sent via sigqueue()), it extracts that size so it can be broadcast to all parallel processes. Then, you can run a simple background program on the same terminal, which reflects any SIGWINCH signals to all local children of mpirun processes running as the same user as you are (my-signal-reflector ./program & mpirun -n 1 ./program ; wait).

Most of the other schemes that immediately come to my mind can be folded into one of these three, more or less.
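For the shim in suggestion 2, the argument handling is the only fixed part: skip everything up to and including -e, then run what is left. Here it is sketched in C rather than as a shell script; the function names are hypothetical, and a real shim would set up its own terminal or multiplexer before the exec.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Return the index of the first argument after "-e", or -1 if there is
 * no "-e" or nothing follows it. */
static int command_index(int argc, char *argv[])
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-e") == 0)
            return (i + 1 < argc) ? i + 1 : -1;
    }
    return -1;
}

/* Replace the current process with the command following "-e".
 * Returns only on failure, with -1. Expects argv[argc] to be NULL,
 * as it is for the arguments of main(). */
static int run_wrapped_command(int argc, char *argv[])
{
    int first = command_index(argc, argv);

    if (first < 0) {
        fprintf(stderr, "usage: %s [options] -e command [args...]\n", argv[0]);
        return -1;
    }
    execvp(argv[first], &argv[first]);
    perror(argv[first]);                /* reached only if exec failed */
    return -1;
}
```

A `main` that simply returns the result of `run_wrapped_command(argc, argv)`, compiled to an executable named xterm in the current directory, would be picked up by the PATH=$PWD:$PATH invocation shown in suggestion 2.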

If you are running these over an SSH connection, consider running the mpirun command in a screen session, setting the virtual screen terminal size (using the screen width <cols> <rows> command), and passing those dimensions as command-line parameters similar to cols= and rows= in example.c. That way you can leave the parallel processes running and detach from the mpirun session, even close the SSH connection, and come back to it later. When you reconnect to that screen session from a window of a different size, the output doesn't get garbled; you only see part of the screen session. Overhead is minimal. If users are allowed to run MPI sessions interactively, doing so exactly like this under screen is often warmly recommended. (Do ask your sysadmin first; do not just assume that it is okay to do so on the cluster front-end. Usually there is a separate "not-that-front-end" for interactive sessions, so that the front-end can be reserved for compilation and data/result transfers and is not bogged down by superfluous parallel processes being run interactively.)

  • Ok so in short: It's not possible to access the directly connected terminal. Workarounds: `-xterm` flag or manually passing the info. The bit about OpenMPI using a pseudo-terminal is interesting. Any idea how other MPI implementations handle this? E.g. Intel or MPICH? – Seriously Jul 04 '21 at 22:43
  • @Seriously: I don't have access to those right now; I've been using OpenMPI exclusively for a few years now (not HPC anymore, just home-hobby stuff). – Blabbo the Verbose Jul 05 '21 at 23:31

If you are only interested in one specific case, you should clearly describe it.

The MPI vendor and version are the minimum.

Anyway, if you are using Open MPI v4.0.x, you can apply the patch below and rebuild (tested on the latest v4.0.x branch from the Open MPI repository):

diff --git a/orte/mca/iof/base/iof_base_setup.c b/orte/mca/iof/base/iof_base_setup.c
index 01fda21..29786d0 100644
--- a/orte/mca/iof/base/iof_base_setup.c
+++ b/orte/mca/iof/base/iof_base_setup.c
@@ -93,8 +93,14 @@ orte_iof_base_setup_prefork(orte_iof_base_io_conf_t *opts)
          * pty exactly as we use the pipes.
          * This comment is here as a reminder.
          */
+        struct winsize w, *wp;
+        if (0 > ioctl(STDOUT_FILENO, TIOCGWINSZ, &w)) {
+            wp = NULL;
+        } else {
+            wp = &w;
+        }
         ret = opal_openpty(&(opts->p_stdout[0]), &(opts->p_stdout[1]),
-                           (char*)NULL, (struct termios*)NULL, (struct winsize*)NULL);
+                           (char*)NULL, (struct termios*)NULL, wp);
     }
 #else
     opts->usepty = 0;
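The effect of the patch can be reproduced in isolation: a pseudoterminal keeps whatever window size its creator gives it, and the process on the other end reads that size back with TIOCGWINSZ. The sketch below is an illustration, not Open MPI code. It uses posix_openpt() and friends instead of openpty() so that it depends on nothing beyond libc, and `open_pty_with_size` is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 600   /* for posix_openpt(), grantpt(), unlockpt(), ptsname() */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open a pseudoterminal pair and, if size is non-NULL, give it that
 * window size. Returns 0 on success, -1 on failure. */
static int open_pty_with_size(int *master, int *slave, const struct winsize *size)
{
    int m = posix_openpt(O_RDWR | O_NOCTTY);
    if (m == -1)
        return -1;
    if (grantpt(m) != 0 || unlockpt(m) != 0) {
        close(m);
        return -1;
    }

    const char *name = ptsname(m);
    int s = (name != NULL) ? open(name, O_RDWR | O_NOCTTY) : -1;
    if (s == -1) {
        close(m);
        return -1;
    }

    /* Without this step the new pseudoterminal keeps its initial size,
     * which is what the question observed as 0 lines and 0 columns. */
    if (size != NULL && ioctl(s, TIOCSWINSZ, size) != 0) {
        close(s);
        close(m);
        return -1;
    }

    *master = m;
    *slave = s;
    return 0;
}
```

Passing the size obtained from the launcher's own standard output, as the patch does, means a child attached to the slave side sees the same dimensions as the terminal the launcher was started in, at least until that terminal is resized.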
Gilles Gouaillardet
  • Hey, thanks for fixing this [upstream](https://github.com/openpmix/prrte/blob/master/src/mca/iof/base/iof_base_setup.c)! I was about to comment here that `ioctl()` returns zero on success, as opposed to negative on error; but I noticed the change you applied to upstream not only uses that, but also includes all the necessary checks for ioctl support, and implements very sensible, useful logic wrt. `usepty` on top. This warmed my heart: someone still cares about us end-users! Good work; thank you. – Blabbo the Verbose Jul 05 '21 at 23:49
  • Sure, by the way, there is (many) more than one person caring about end-users. Also note that if the terminal in which `mpirun` is invoked is resized, MPI ranks won't get notified (I have no plan to implement `SIGWINCH` handling, but Pull Requests are always welcome). – Gilles Gouaillardet Jul 06 '21 at 00:24
  • (Many) more in the OpenMPI project, for sure! (I had a purely positive experience during my active years, too.) I was really referring to how it seems so many projects important to their end users nowadays do not reciprocate in the sentiment; which is why seeing this happen was so surprising to me. :-) – Blabbo the Verbose Jul 09 '21 at 00:37