Capture output from console program with overlapping and events

Question

I know lots of similar questions on this topic have been asked before, but so far I have been unable to find a solution that actually works. I want to start a console program from my program and capture its output. My implementation should be compatible with WaitForMultipleObjects(), i.e. I want to get notified whenever there is new data to read in the pipe.

My implementation is based on this example from MSDN. However, I had to modify it a little because I need overlapped I/O in order to be able to wait for ReadFile() to finish. So I'm using named pipes created using Dave Hart's MyCreatePipeEx() function from here.

This is my actual code. I have removed error checks for readability reasons.

HANDLE hReadEvent;
HANDLE hStdIn_Rd, hStdIn_Wr;
HANDLE hStdOut_Rd, hStdOut_Wr;
SECURITY_ATTRIBUTES saAttr; 
PROCESS_INFORMATION piProcInfo; 
STARTUPINFO siStartInfo;
OVERLAPPED ovl;
HANDLE hEvt[2];
DWORD mask, gotbytes;
BYTE buf[4097];

saAttr.nLength = sizeof(SECURITY_ATTRIBUTES); 
saAttr.bInheritHandle = TRUE; 
saAttr.lpSecurityDescriptor = NULL; 

MyCreatePipeEx(&hStdOut_Rd, &hStdOut_Wr, &saAttr, 0, FILE_FLAG_OVERLAPPED, FILE_FLAG_OVERLAPPED);        
MyCreatePipeEx(&hStdIn_Rd, &hStdIn_Wr, &saAttr, 0, FILE_FLAG_OVERLAPPED, FILE_FLAG_OVERLAPPED); 

SetHandleInformation(hStdOut_Rd, HANDLE_FLAG_INHERIT, 0);
SetHandleInformation(hStdIn_Wr, HANDLE_FLAG_INHERIT, 0);

memset(&piProcInfo, 0, sizeof(PROCESS_INFORMATION));
memset(&siStartInfo, 0, sizeof(STARTUPINFO));

siStartInfo.cb = sizeof(STARTUPINFO); 
siStartInfo.hStdError = hStdOut_Wr;
siStartInfo.hStdOutput = hStdOut_Wr;
siStartInfo.hStdInput = hStdIn_Rd;
siStartInfo.dwFlags |= STARTF_USESTDHANDLES;

CreateProcess(NULL, "test.exe", NULL, NULL, TRUE, 0, NULL, NULL, &siStartInfo, &piProcInfo);

hReadEvent = CreateEvent(NULL, TRUE, FALSE, NULL);      

for(;;) {

    int i = 0;

    hEvt[i++] = piProcInfo.hProcess;

    memset(&ovl, 0, sizeof(OVERLAPPED));
    ovl.hEvent = hReadEvent;

    if(!ReadFile(hStdOut_Rd, buf, 4096, &gotbytes, &ovl)) {     
        if(GetLastError() == ERROR_IO_PENDING) hEvt[i++] = hReadEvent;          
    } else {
        buf[gotbytes] = 0;
        printf("%s", buf);
    }               

    mask = WaitForMultipleObjects(i, hEvt, FALSE, INFINITE);

    if(mask == WAIT_OBJECT_0 + 1) {

        if(GetOverlappedResult(hStdOut_Rd, &ovl, &gotbytes, FALSE)) {                   
            buf[gotbytes] = 0;
            printf("%s", buf);
        }

    } else if(mask == WAIT_OBJECT_0) {

        break;
    }   
}

The problem with this code is the following: As you can see, I'm reading in chunks of 4kb using ReadFile() because I obviously don't know how much data the external program test.exe will output. Doing it this way was suggested here:

To read a variable amount of data from the client process just issue read requests of whatever size you find convenient and be prepared to handle read events that are shorter than you requested. Don't interpret a short but non-zero length as EOF. Keep issuing read requests until you get a zero length read or an error.

However, this doesn't work. The event object passed to ReadFile() as part of the OVERLAPPED structure will only trigger once there are 4kb in the buffer. If the external program just prints "Hello", the event won't trigger at all. There need to be 4kb in the buffer for hReadEvent to actually trigger.

So I thought I should read byte by byte instead and modified my program to use ReadFile() like this:

if(!ReadFile(hStdOut_Rd, buf, 1, &gotbytes, &ovl)) {

However, this doesn't work either. If I do it like this, the read event is not triggered at all which is really confusing me. When using 4096 bytes, the event does indeed trigger as soon as there are 4096 bytes in the pipe, but when using 1 byte it doesn't work at all.

So how am I supposed to solve this? I'm pretty much out of ideas here. Is there no way to have the ReadFile() event trigger whenever there is some new data in the pipe? Can't be that difficult, can it?
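The advice quoted in the question (issue reads of a convenient size, accept short reads, stop only on a zero-length read or an error) can be sketched independently of the Win32 calls. This is only an illustration under stated assumptions: `read_fn`, `drain`, `mem_source` and `mem_read` are hypothetical names, and the in-memory source merely stands in for a pipe that delivers data in small pieces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read callback: returns bytes read (> 0), 0 at end of stream, < 0 on error. */
typedef long (*read_fn)(void *ctx, char *buf, size_t cap);

/* Stand-in for a pipe that hands out at most `piece` bytes per read. */
struct mem_source { const char *data; size_t len, pos, piece; };

long mem_read(void *ctx, char *buf, size_t cap)
{
    struct mem_source *s = ctx;
    size_t n = s->len - s->pos;
    if (n > s->piece) n = s->piece;
    if (n > cap) n = cap;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (long)n;
}

/* Keeps issuing 4096-byte read requests and appends whatever arrives to `out`.
   A short but non-zero read is NOT treated as end of stream.
   Returns the total number of bytes collected, or -1 on error or overflow. */
long drain(read_fn rd, void *ctx, char *out, size_t out_cap)
{
    char chunk[4096];
    size_t total = 0;

    assert(rd != NULL && out != NULL);
    for (;;) {
        long n = rd(ctx, chunk, sizeof chunk);
        if (n < 0) return -1;                       /* read error */
        if (n == 0) break;                          /* zero-length read: end of stream */
        if ((size_t)n > out_cap - total) return -1; /* caller's buffer is too small */
        memcpy(out + total, chunk, (size_t)n);
        total += (size_t)n;
    }
    return (long)total;
}
```

With a source that delivers "Hello, pipe!" five bytes at a time, `drain` sees reads of 5, 5 and 2 bytes before the terminating zero-length read. In the Win32 setting the callback would presumably wrap ReadFile() and GetOverlappedResult(), with ERROR_BROKEN_PIPE playing the role of the end-of-stream signal.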

*The event object passed to ReadFile() as part of the OVERLAPPED structure will only trigger once there are 4kb in the buffer* - this is of course not true. What the quote above says is correct. The code in your loop is wrong (at a minimum, `hEvt[i++] = hReadEvent;` is definitely wrong). The MSDN example is bad: you really need to create only a single pipe pair (2 handles) instead of 2 pairs (4 handles). — RbMm, Jun 07 '19 at 18:26
@RbMm: I'm sorry but I don't understand. What exactly is wrong in the code above? Can you post a corrected version or point me at what is wrong? — Andreas, Jun 07 '19 at 18:36
For example, you need to initialize the wait array only once before the loop, not repeatedly inside it. Creating 2 pipe pairs makes no sense; 1 is enough. Using events for completion is the worst of the 3 possible choices, but it can work. — RbMm, Jun 07 '19 at 18:45
Of course I can, but you need to understand your own mistakes yourself. Begin by initializing the wait array only once before the loop. — RbMm, Jun 07 '19 at 19:06
@RbMm: But I don't understand why this should be a bug. AFAICS your suggestion of initializing the wait array only once is an optimization but not a bug in my code. If you think it's a bug, then please explain why you think so or post a fixed version. — Andreas, Jun 07 '19 at 19:12
Also, the result can depend on your *test.exe* and how it actually writes data to the pipe. Did you test with *cmd.exe* first? — RbMm, Jun 07 '19 at 19:22
@RbMm: Sorry, I don't see it. The `i` counter is reset to 0 for each loop iteration. So I don't see how `hEvt[i++]` could **ever** be called multiple times per loop iteration. `i` is always set to 0 at the loop top so `hReadEvent` can only ever be assigned to `hEvt[1]`. I think you are clearly wrong here with your assumption that this is a bug. Otherwise prove it please. — Andreas, Jun 07 '19 at 19:24
If you want to test correct code, see https://pastebin.com/vueDewgz. Because you write nothing to the pipe, enter the cmd input yourself. Test your own code with cmd.exe instead of *test.exe* too. printf buffers its output, so nothing is actually written to the pipe. It is not good to use the CRT here. — RbMm, Jun 07 '19 at 19:48
Also, in your loop, if `ReadFile` does not fail with `ERROR_IO_PENDING`, you will wait only for the process to exit, forever. And anyway, event completion is the worst choice. In most cases you do not need to wait for process exit at all: if your task is only to read from the pipe, do not wait for process exit. — RbMm, Jun 07 '19 at 20:05
@RbMm: Ok, after some research I'm now convinced that what I want to do here won't work at all because of `stdio` buffering. I was of the opinion that `printf()` would automatically flush on a linefeed but apparently that is only the case when using a terminal. When redirecting output to a file or a pipe (as in my case), it doesn't flush on a linefeed but uses block buffering (probably 4k). On Linux you could use pseudoterminals to work around this issue but on Windows there's apparently no way to disable buffering when redirecting the `stdout` output of arbitrary programs to my program. — Andreas, Jun 08 '19 at 17:22
The CRT buffering effect is only one side of it. The other is that your code (the loop) is wrong anyway, and many of your choices are not the best. — RbMm, Jun 08 '19 at 17:34
@RbMm: Yes, you're right that in case `ReadFile()` doesn't return `ERROR_IO_PENDING` the loop will only wait on `hProcess` which is clearly a bug. But since there's no way to solve the CRT buffering issue I won't fix it because what I want to do seems generally impossible because of CRT buffering. — Andreas, Jun 08 '19 at 18:44
But simply do not use CRT functions for output; use `WriteFile` directly, for example. *cmd.exe* works fine here because it does not use the CRT. And as I have said many times, there is no need to create 2 pipe pairs; a single one is enough. — RbMm, Jun 08 '19 at 19:00
@RbMm: My aim was to write a program that can capture the output of arbitrary console programs in real time. I don't have the source code of those programs so I can neither make them use `WriteFile()` nor can I `fflush()` buffers after every `printf()` because those are programs made by 3rd parties. It's impossible to catch their output in real time. — Andreas, Jun 08 '19 at 19:12

score 0 · Answer 1 · answered Jun 08 '19 at 19:30

Just for the record: while there are some problems with my code (see the discussion in the comments below the question), the general problem is that real-time capture of the output of arbitrary external programs is not really possible. Such programs typically use block buffering when their output is redirected to a pipe, so the output only arrives at the capturing program once that buffer is flushed.
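The block buffering described above can be reproduced without any pipe or child process. The following is a minimal sketch using only standard C stdio; `visible_bytes`, `buffering_demo` and the scratch file are hypothetical names chosen for illustration, and a regular file stands in for the redirected stdout.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the bytes another reader can currently see in `path`. */
static long visible_bytes(const char *path)
{
    FILE *in = fopen(path, "rb");
    long n = 0;
    if (in == NULL) return -1;
    while (fgetc(in) != EOF) n++;
    fclose(in);
    return n;
}

/* Writes `msg` through a fully buffered stream (as stdio typically does for
   redirected output) and reports how many bytes a second reader sees before
   and after fflush(). Returns 0 on success, -1 on failure. */
int buffering_demo(const char *path, const char *msg, long *before, long *after)
{
    static char iobuf[4096];   /* must stay valid until the stream is closed */
    FILE *out;

    assert(before != NULL && after != NULL);
    out = fopen(path, "wb");
    if (out == NULL) return -1;
    if (setvbuf(out, iobuf, _IOFBF, sizeof iobuf) != 0) {
        fclose(out);
        return -1;
    }

    fputs(msg, out);                 /* lands in iobuf, not yet in the file */
    *before = visible_bytes(path);
    fflush(out);                     /* only now does the data leave the process */
    *after = visible_bytes(path);

    fclose(out);
    remove(path);
    return 0;
}
```

A short message such as "Hello\n" stays invisible to the reader until the flush. A program whose source is available can avoid this with `setvbuf(stdout, NULL, _IONBF, 0)` or an `fflush(stdout)` after each write, but that does not help with third-party binaries.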

Some workarounds have been suggested though:

1) (Windows) Here is a workaround that uses GetConsoleScreenBuffer() to capture the output from arbitrary console programs but it currently only supports one console page of output.

2) (Linux) On Linux, it's apparently possible to use pseudo-terminals to force the external program to use unbuffered output. Here is an example.
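The pseudo-terminal workaround for Linux could look roughly like the sketch below. It is not the example originally linked; `run_on_pty` is a hypothetical helper, error handling is minimal, and it assumes a system with `/bin/sh` and a usable `/dev/ptmx`. The child's stdout is a terminal, so a stdio-based program would normally fall back to line buffering.

```c
#define _XOPEN_SOURCE 600   /* for posix_openpt(), grantpt(), unlockpt(), ptsname() */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `cmd` via /bin/sh with stdout and stderr attached to a pseudo-terminal
   and copies what it prints into `out` (NUL-terminated, at most cap - 1 bytes).
   Returns the number of bytes captured, or -1 if the pty or child setup fails. */
long run_on_pty(const char *cmd, char *out, size_t cap)
{
    assert(cmd != NULL && out != NULL && cap > 1);

    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0) return -1;
    if (grantpt(master) != 0 || unlockpt(master) != 0) { close(master); return -1; }

    /* Open the slave side before fork() so the parent's first read cannot
       race with the child opening it. */
    const char *name = ptsname(master);
    int slave = (name != NULL) ? open(name, O_RDWR | O_NOCTTY) : -1;
    if (slave < 0) { close(master); return -1; }

    pid_t pid = fork();
    if (pid < 0) { close(slave); close(master); return -1; }
    if (pid == 0) {
        /* Child: make the pty slave its stdout and stderr, then run the command. */
        close(master);
        dup2(slave, STDOUT_FILENO);
        dup2(slave, STDERR_FILENO);
        if (slave > STDERR_FILENO) close(slave);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }
    close(slave);   /* the parent keeps only the master side */

    size_t total = 0;
    while (total < cap - 1) {
        ssize_t n = read(master, out + total, cap - 1 - total);
        if (n <= 0) break;   /* Linux reports EIO once the last slave handle is closed */
        total += (size_t)n;
    }
    out[total] = '\0';
    close(master);
    waitpid(pid, NULL, 0);
    return (long)total;
}
```

Note that the terminal line discipline usually translates "\n" into "\r\n" in the captured data, and that a real-time consumer would process each chunk as soon as `read()` returns instead of collecting everything first.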
