I've done enough experimentation with some command line tools of my own devising to convince me that the problem is the way Python works in this piped context. It doesn't process the data until it gets EOF (because the data fed isn't big enough to fill buffers). So, despite the best efforts of raw_input
and attempts to set line buffering before Python is executed, it isn't reading data as soon as lines are available. My test pipeline has 6 processes in a row:
timecmd -m -- cat -u | tstamp -f3 | tee /dev/tty | timecmd -m -- pycalc.py |
tstamp -f3 | tee $(isodate -c).log
The timecmd
command reports the start time of the command, runs the command, and reports the exit time (-m
for millisecond times). The tstamp -f3
prefixes each line of output with a timestamp (-f3
for millisecond times). Using cat -u
does unbuffered output; it reports the lines as they're available; the tee /dev/tty
shows the timestamped output on the terminal and feeds it to pycalc.py
; the output is again timestamped, and then piped to tee
for logging (the combination with isodate -c
generates a log file name such as 20180924.072626.log
). When I use this, I see clearly that pycalc.py
doesn't do a thing until I indicate EOF to the cat -u
.
$ timecmd -m -- cat -u | tstamp -f3 | tee /dev/tty | timecmd -m -- pycalc.py |
> tstamp -f3 | tee $(isodate -c).log
2018-09-24 07:29:06.419 [PID 44960] cat -u
2018-09-24 07:29:06.419 [PID 44961] pycalc.py
dude
2018-09-24 07:29:09.274: dude
how
2018-09-24 07:29:10.819: how
are
2018-09-24 07:29:12.349: are
you
2018-09-24 07:29:16.540: you
2018-09-24 07:29:26.205 [PID 44960; status 0x0000] - 19.786s
2018-09-24 07:29:26.209 [PID 44961; status 0x0000] - 19.789s
2018-09-24 07:29:26.208: MjAxOC0wOS0yNCAwNzoyOTowOS4yNzQ6IGR1ZGU=
2018-09-24 07:29:26.209:
2018-09-24 07:29:26.209: MjAxOC0wOS0yNCAwNzoyOToxMC44MTk6IGhvdw==
2018-09-24 07:29:26.209:
2018-09-24 07:29:26.209: MjAxOC0wOS0yNCAwNzoyOToxMi4zNDk6IGFyZQ==
2018-09-24 07:29:26.209:
2018-09-24 07:29:26.209: MjAxOC0wOS0yNCAwNzoyOToxNi41NDA6IHlvdQ==
2018-09-24 07:29:26.209:
2018-09-24 07:29:26.209: Bye - End of Calc!
So, AFAICT, you need to find a way to make Python read lines from standard input when standard input comes from a pipe — and to read them before the buffer is full. Until then, you won't be successful. You may have similar problems with ensuring the output is flushed properly (timely) too. You could further refine the testing I did by timestamping the input Python sees, maybe reporting the timestamped data to standard error (so you know what Python is doing), but I'm moderately convinced that my setup sends the data to Python timely but Python is not recognizing that.
Searching with Google on 'python line buffered standard input' comes up with Setting smaller buffer size for sys.stdin which suggests using python -u
. When I use timecmd -m -- python -u pycalc.py
in my long command line, then I get the expected output — Python responds as the data is generated.
$ timecmd -m -- cat -u | tstamp -f3 | tee /dev/tty | timecmd -m -- python -u pycalc.py |
> tstamp -f3 | tee $(isodate -c).log
2018-09-24 07:52:41.485 [PID 45180] cat -u
2018-09-24 07:52:41.485 [PID 45181] python -u pycalc.py
dude
2018-09-24 07:52:43.213: dude
2018-09-24 07:52:43.214: MjAxOC0wOS0yNCAwNzo1Mjo0My4yMTM6IGR1ZGU=
2018-09-24 07:52:43.215:
how
2018-09-24 07:52:48.852: how
2018-09-24 07:52:48.852: MjAxOC0wOS0yNCAwNzo1Mjo0OC44NTI6IGhvdw==
2018-09-24 07:52:48.852:
are
2018-09-24 07:52:50.720: are
2018-09-24 07:52:50.720: MjAxOC0wOS0yNCAwNzo1Mjo1MC43MjA6IGFyZQ==
2018-09-24 07:52:50.720:
you
2018-09-24 07:52:52.479: you
2018-09-24 07:52:52.479: MjAxOC0wOS0yNCAwNzo1Mjo1Mi40Nzk6IHlvdQ==
2018-09-24 07:52:52.479:
2018-09-24 07:52:53.646 [PID 45180; status 0x0000] - 12.161s
2018-09-24 07:52:53.647: Bye - End of Calc!
2018-09-24 07:52:53.650 [PID 45181; status 0x0000] - 12.165s
You can see from the timestamps how this is reacting as the typing (deliberately slow typing) occurs. Translating this into what your program needs to do, I come up with:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define PARENT_READ read_pipe[0]
#define PARENT_WRITE write_pipe[1]
#define CHILD_WRITE read_pipe[1]
#define CHILD_READ write_pipe[0]
int main(void)
{
int pid;
int read_pipe[2];
int write_pipe[2];
if (pipe(read_pipe) < 0)
exit(-1);
if (pipe(write_pipe) < 0)
exit(-1);
if ((pid = fork()) < 0)
exit(-1);
else if (pid == 0)
{
// Child process go here
close(PARENT_READ);
close(PARENT_WRITE);
dup2(CHILD_READ, STDIN_FILENO);
dup2(CHILD_WRITE, STDOUT_FILENO);
close(CHILD_READ);
close(CHILD_WRITE);
char *argv[] = {"python", "-u", "pycalc.py", NULL};
execvp(argv[0], argv);
fprintf(stderr, "failed to execute %s (%d: %s)\n", argv[0], errno, strerror(errno));
exit(errno);
}
// Parent process go here
close(CHILD_READ);
close(CHILD_WRITE);
FILE *wrp = fdopen(PARENT_WRITE, "w");
FILE *rdp = fdopen(PARENT_READ, "r");
if (wrp == NULL || rdp == NULL)
exit(-1);
setlinebuf(wrp);
setlinebuf(rdp);
char line[1024 + 1];
while (fgets(line, sizeof(line), stdin) != NULL)
{
printf("Got this to do calc with: %s", line);
fprintf(wrp, "%s", line);
fflush(wrp);
fgets(line, sizeof(line), rdp);
printf("Got this calc from child: %s", line);
}
return(0);
}
And when run (the program is pipe43
), I get:
$ pipe43
dude
Got this to do calc with: dude
Got this calc from child: ZHVkZQ==
how
Got this to do calc with: how
Got this calc from child:
are
Got this to do calc with: are
Got this calc from child: aG93
you
Got this to do calc with: you
Got this calc from child:
Traceback (most recent call last):
File "pycalc.py", line 15, in <module>
print('Bye - End of Calc!')
IOError: [Errno 32] Broken pipe
The error from Python is because it tries to write after the parent process has closed its pipe — it doesn't wait for stray output from the Python.
I'm not sure of the efficiency implications of the -u
option to Python; I suspect they're not great.