5

I'm trying to capture the output of a command run using Posix.Process.execp. I ported some C code I found on stackoverflow and can capture output for one execution, but I can't get the output for a second execution.

Here's my function:

(* Runs a command c (command and argument list) using Posix.Process.execp. *)
(* If we successfully run the program, we return the lines output to stdout *)
(* in a list, along with SOME of the exit code. *)
(* If we fail to run the program, we return the error message in the list *)
(* and NONE. *)
fun execpOutput (c : string * string list) : (string list * Posix.Process.exit_status option) =
  let fun readAll () = case TextIO.inputLine TextIO.stdIn
                    of SOME s => s :: (readAll ())
                     | NONE => []
      (* Create a new pipe *)
      val { infd = infd, outfd = outfd } = Posix.IO.pipe ()
  in case Posix.Process.fork ()
      of NONE => (
      (* We are the child. First copy outfd to stdout; they will *)
      (* point to the same file descriptor and can be used interchangeably. *)
      (* See dup(2) for details. Then close infd: we don't need it and don't *)
      (* want to block because we have open file descriptors laying around *)
      (* when we want to exit. *)
      ( Posix.IO.dup2 { old = outfd, new = Posix.FileSys.stdout }
      ; Posix.IO.close infd
      ; Posix.Process.execp c )
      handle OS.SysErr (err, _) => ([err], NONE) )
       | SOME pid =>
     (* We are the parent. This time, copy infd to stdin, and get rid of the *)
     (* outfd we don't need. *)
     let val _ = ( Posix.IO.dup2 { old = infd, new = Posix.FileSys.stdin }
                 ; Posix.IO.close outfd )
         val (_, status) = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
     in (readAll (), SOME status) end
  end

val lsls = (#1 (execpOutput ("ls", ["ls"]))) @ (#1 (execpOutput ("ls", ["ls"])))
val _ = app print lsls

and here's the corresponding output:

rak@zeta:/tmp/test$ ls
a  b  c
rak@zeta:/tmp/test$ echo 'use "/tmp/mwe.sml";' | sml
Standard ML of New Jersey v110.79 [built: Tue Aug  8 16:57:33 2017]
- [opening /tmp/mwe.sml]
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
a
b
c
val execpOutput = fn
  : string * string list -> string list * ?.POSIX_Process.exit_status option
val lsls = ["a\n","b\n","c\n"] : string list
val it = () : unit
-

Any suggestions on what I'm doing wrong?

Ryan Kavanagh
  • 251
  • 1
  • 7

2 Answers2

3

This is not a complete answer.

If you run sml interactively, you'll notice that the interpreter exits after the first call:

$ sml
Standard ML of New Jersey v110.81 [built: Wed May 10 21:25:32 2017]
- use "mwe.sml";
[opening mwe.sml]
...
- execpOutput ("ls", ["ls"]);
val it = (["mwe.sml\n"],SOME W_EXITED)
  : string list * ?.POSIX_Process.exit_status option
- 
$ # I didn't type exit ();

Your program appears to be an adaptation of this pipe/fork/exec example in C that does work. The only apparent difference is the C line fdopen(pipefd[0], "r") where you instead write Posix.IO.dup2 { old = infd, new = Posix.FileSys.stdin }.

I might investigate if those are really intended to give the same result. You could run strace on each of the programs and see when their system calls deviate. I haven't made further sense of it, though.

I tried to run strace sml nwe.sml 2>&1 | grep -v getrusage on:

fun readAll () = case TextIO.inputLine TextIO.stdIn
                   of SOME s => s :: readAll ()
                    | NONE => []

fun execpOutput (c : string * string list) : (string list * Posix.Process.exit_status option) =
  let val { infd = infd, outfd = outfd } = Posix.IO.pipe ()
  in case Posix.Process.fork ()
          (* Child *)
       of NONE => (( Posix.IO.close infd
                   ; Posix.IO.dup2 { old = outfd, new = Posix.FileSys.stdout }
                   ; Posix.IO.dup2 { old = outfd, new = Posix.FileSys.stderr }
                   ; Posix.Process.execp c )
                  handle OS.SysErr (err, _) => ([err], NONE) )
         (* Parent *)
        | SOME pid =>
          let val _ = Posix.IO.close outfd
              val _ = Posix.IO.dup2 { old = infd, new = Posix.FileSys.stdin }
              val _ = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
          in readAll () end
  end

val _ = app print (execpOutput ("ls", ["ls"]));
val _ = app print (execpOutput ("ls", ["ls"]));

just as I tried to run strace ./mwe on the compiled output of

#include <stdio.h>
#include <stdlib.h>
#include <spawn.h>
#include <sys/wait.h>
#include <unistd.h>

void execpOutput(char *cmd[])
{
    pid_t pid;
    int pipefd[2];
    FILE* output;
    char line[256];
    int status;

    pipe(pipefd);
    pid = fork();
    if (pid == 0) {
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        dup2(pipefd[1], STDERR_FILENO);
        execvp(cmd[0], cmd);
    } else {
        close(pipefd[1]);
        output = fdopen(pipefd[0], "r");
        while (fgets(line, sizeof(line), output)) {
            printf("%s", line);
        }
        waitpid(pid, &status, 0);
    }
}

int main(int argc, char *argv[])
{
    char *cmd[] = {"ls", NULL};

    execpOutput(cmd);
    execpOutput(cmd);
    return 2;
}
sshine
  • 15,635
  • 1
  • 41
  • 66
  • 1
    Thanks Simon! I ran it through strace and found that though the second child was writing to some file descriptor N and the parent's stdin was set to N by dup2, the parent never actually read any messages the second time around. I'll post a more detailed analysis and how I fixed it tomorrow. – Ryan Kavanagh Sep 14 '17 at 00:43
3

My initial attempt consisted of

  1. Creating a pipe
  2. Setting stdout of the child to be the write end of the pipe
  3. Setting stdin of the parent to be the read end of the pipe

This didn't work the second time around, perhaps because of some race condition (running it under strace -f meant that we could see the second child writing to the write end of the second pipe, but the parent never managed to read from the read end of the second pipe). I realised this approach is also suboptimal because it involves clobbering stdin.

My officemate pointed out I was effectively trying to implement a variant of popen(3). A better approach is to, in fact, implement popen and return a file descriptor for the desired end of the pipe, rather than clobbering the parent's stdin/stdout. It also is symmetric, in that the user can specify if they want the read or the write end of the pipe. Here's what I came up with (feedback welcome).

structure Popen :>
      sig
          (* Parent wants to write to stdin, read stdout, or read stdout + stderr *)
          datatype pipe_type = PIPE_W | PIPE_R | PIPE_RE
          val popen : string * pipe_type -> Posix.IO.file_desc
          val pclose : Posix.IO.file_desc -> Posix.Process.exit_status option
      end =
struct

datatype pipe_type = PIPE_W | PIPE_R | PIPE_RE

type pinfo = { fd : Posix.ProcEnv.file_desc, pid : Posix.Process.pid }

val pids : pinfo list ref = ref []

(* Implements popen(3) *)
fun popen (cmd, t) =
  let val { infd = readfd, outfd = writefd } = Posix.IO.pipe ()
  in case (Posix.Process.fork (), t)
      of (NONE, t) => (* Child *)
     (( case t
         of PIPE_W => Posix.IO.dup2 { old = readfd, new = Posix.FileSys.stdin }
          | PIPE_R => Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stdout }
          | PIPE_RE => ( Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stdout }
                       ; Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stderr })
      ; Posix.IO.close writefd
      ; Posix.IO.close readfd
      ; Posix.Process.execp ("/bin/sh", ["sh", "-c", cmd]))
      handle OS.SysErr (err, _) =>
             ( print ("Fatal error in child: " ^ err ^ "\n")
             ; OS.Process.exit OS.Process.failure ))
       | (SOME pid, t) => (* Parent *)
     let val fd = case t of PIPE_W => (Posix.IO.close readfd; writefd)
                          | PIPE_R => (Posix.IO.close writefd; readfd)
                          | PIPE_RE => (Posix.IO.close writefd; readfd)
         val _ = pids := ({ fd = fd, pid = pid } :: !pids)
     in fd end
  end

(* Implements pclose(3) *)
fun pclose fd =
  case List.partition (fn { fd = f, pid = _ } => f = fd) (!pids)
   of ([], _) => NONE
    | ([{ fd = _, pid = pid }], pids') =>
      let val _ = pids := pids'
      val (_, status) = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
      val _ = Posix.IO.close fd
      in SOME status end
    | _ => raise Bind (* This should be impossible. *)
end

val f = Popen.popen("ls", Popen.PIPE_R);
val g = Popen.popen("read line; echo $line>/tmp/foo", Popen.PIPE_W);
val _ = Posix.IO.writeVec (g, Word8VectorSlice.full (Byte.stringToBytes "Hello World! I was written by g\n"));
val h = Popen.popen("cat /tmp/foo", Popen.PIPE_R);
val i = Popen.popen("echo 'to stderr i' 1>&2", Popen.PIPE_R);
val j = Popen.popen("echo 'to stderr j' 1>&2", Popen.PIPE_RE);
val _ = app (fn fd => print (Byte.bytesToString (Posix.IO.readVec (fd, 1000)))) [f, h, i, j];
val _ = map Popen.pclose [f, g, h, i, j];
val _ = OS.Process.exit OS.Process.success;

And then the output is:

rak@zeta:~/popen$ rm /tmp/foo && ls && sml popen.sml
popen.sml
Standard ML of New Jersey v110.79 [built: Tue Aug  8 16:57:33 2017]
[opening popen.sml]
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
popen.sml:42.52 Warning: calling polyEqual
structure Popen :
  sig
    datatype pipe_type = PIPE_R | PIPE_RE | PIPE_W
    val popen : string * pipe_type -> ?.POSIX_IO.file_desc
    val pclose : ?.POSIX_IO.file_desc -> ?.POSIX_Process.exit_status option
  end
val f = FD {fd=4} : ?.POSIX_IO.file_desc
val g = FD {fd=6} : ?.POSIX_IO.file_desc
[autoloading]
[autoloading done]
val h = FD {fd=5} : ?.POSIX_IO.file_desc
to stderr i
val i = FD {fd=7} : ?.POSIX_IO.file_desc
val j = FD {fd=8} : ?.POSIX_IO.file_desc
popen.sml
Hello World! I was written by g
to stderr j

Thanks to Simon Shine for the hint to run strace. I'm still not sure why what I had doesn't work, but at least we know what's happening.

Ryan Kavanagh
  • 251
  • 1
  • 7