I'm not even sure what the subject is. Initially I was struggling with grep, assuming it was the root cause, yet it seems to be something related to pipe buffering.

On Alpine Linux v3.18, using inotifyd (a tool for tracking file system changes) in a pipeline produces strange behavior. As a data source I am using the command inotifyd - /tmp:ymndceDM, together with repeated calls like echo 1 > /tmp/2.log.

Running a simple pipeline such as

# inotifyd - /tmp:ymndceDM | xxd
00000000: 6309 2f74 6d70 0932 2e6c 6f67 0a63 092f  c./tmp.2.log.c./
00000010: 746d 7009 322e 6c6f 670a 6309 2f74 6d70  tmp.2.log.c./tmp
00000020: 0932 2e6c 6f67 0a63 092f 746d 7009 322e  .2.log.c./tmp.2.
00000030: 6c6f 670a 6309 2f74 6d70 0932 2e6c 6f67  log.c./tmp.2.log
00000040: 0a63 092f 746d 7009 322e 6c6f 670a 6309  .c./tmp.2.log.c.
00000050: 2f74 6d70 0932 2e6c 6f67 0a63 092f 746d  /tmp.2.log.c./tm

# inotifyd - /tmp:ymndceDM | grep ''
c   /tmp    2.log
c   /tmp    2.log
c   /tmp    2.log
c   /tmp    2.log

produces the expected output.

Yet if the pipeline is extended with an extra command, the output gets stuck after a single line:

# inotifyd - /tmp:ymndceDM | xxd | cat
00000000: 6309 2f74 6d70 0932 2e6c 6f67 0a63 092f  c./tmp.2.log.c./

# inotifyd - /tmp:ymndceDM | grep '' | cat
c   /tmp    2.log

regardless of further changes to the file in question.

Using only cat commands instead of xxd or grep works fine as well.

Saving the output of inotifyd to a log file first and then feeding it with cat changes.log into the same (initially problematic) pipeline doesn't show the described problem.

So how can this be explained and resolved?

Thx

Trying to turn buffering off (my second suspicion), I found a suggestion to run the commands in separate groups:

{ inotifyd - /tmp:ymndceDM; } | { grep ''; } | { cat; }

but that didn't really help

Edited: Thanks to @programmerq's answer below, it turned out that the buffering is the behavior of xxd itself when its output goes to something other than a tty.

I confirmed the same for grep. In fact, trying

inotifyd - /tmp:ymndceDM \
  | xargs -n 1 -I {} \
   echo {} somenoicesomenoicesomenoicesomenoicesomenoicesomenoicesomenoicesomenoicesomenoicesomenoicesomenoicesomenoi \
  | grep '' | cat

overflows the buffer on each line and leads to "realtime" output.

So the question now is how to disable buffering in a cleaner way?

--line-buffered is not supported by grep on Alpine, and none of the other available options seem to help.

In my particular case I can see another workaround: multiply the incoming lines by some factor, since there is a uniq further down the pipeline anyway. But this doesn't seem like a pretty solution (e.g. can the buffer size be assumed to be constant?).

Alternatively, is it possible to fool grep somehow (with minimal code overhead) into thinking it is writing to a tty?

Or what would be the way to get a realtime stream of updates regardless of chunk size?

Thx.

0xffaaec
  • The `--line-buffered` argument is available on the grep included in the alpine `grep` package. The busybox `grep` applet that is available if you don't explicitly install the `grep` package doesn't support `--line-buffered`. Your trick to pack each line with extra input is probably your best bet for binaries that don't have a way to disable this behavior. – programmerq Jul 14 '23 at 15:33

1 Answer

I was able to recreate this behavior. Both the xxd busybox applet and the xxd binary from the xxd package had the same behavior.

I was able to show that it is indeed xxd that appears to be buffering output.

inotifyd - /tmp/tmp.lKFfOk | tee foo | xxd | cat

I can tail -f foo and see that the output from inotifyd seems to appear immediately. The xxd program seems to buffer things when its stdout isn't a tty.
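
Another way to see which stage holds lines back, without watching a second terminal, is to timestamp each line as it arrives at the end of the pipeline. The following is only a sketch: stamp and slow_source are hypothetical helper names made up for illustration, and slow_source is a stand-in for inotifyd so the test does not depend on file system events.

```shell
# Hypothetical helper: prefix each incoming line with the epoch second
# at which this loop received it.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date +%s)" "$line"
    done
}

# Hypothetical stand-in for inotifyd: emits one line per second, twice.
slow_source() {
    for i in 1 2; do
        printf 'event %s\n' "$i"
        sleep 1
    done
}

# Direct path: the two timestamps should differ by about a second.
slow_source | stamp

# Suspect stage in the middle: if both lines carry the same timestamp,
# that stage most likely held them back until its input closed.
slow_source | grep '' | stamp
```

Swapping grep '' for xxd, cat, or any other stage under suspicion gives a quick comparison of which ones pass lines through as they arrive.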

If I force more activity in the directory that inotifyd is watching, the |xxd|cat pipe will output the new activity in chunks. If I kill the inotifyd process, the |xxd|cat pipe will flush all the remaining output.

Since other programs than xxd don't appear to buffer, this isn't a shell level pipe buffer that is happening. xxd does need to output things in whole lines, and it looks like it wants to have several lines of its output before it'll flush the buffer. (in my case 15 lines of xxd output appeared at a time)

I dug into the xxd source code a bit, and found that xxd uses fflush() when it writes its output.

According to this answer to a question specifically about this fflush behavior, stdout is line buffered by default when it is a tty. When xxd is piped to another program, its stdout isn't a tty, so stdio switches to block buffering, which causes the chunking behavior after the first line, as seen here.
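
The tty/non-tty distinction can be observed from the shell itself. This is a small sketch, with stdout_kind being a hypothetical name: it asks the same kind of question (is file descriptor 1 a terminal?) that the C library asks when it picks a default buffering mode for stdout.

```shell
# Hypothetical helper: report whether this function's stdout is a terminal.
stdout_kind() {
    if [ -t 1 ]; then
        echo "tty"
    else
        echo "not a tty"
    fi
}

# Called directly, the answer depends on where the shell's output goes
# (an interactive terminal, a file, a pipe, ...).
stdout_kind

# Inside a pipeline, stdout is a pipe, so this reports "not a tty".
stdout_kind | cat
```

A program such as xxd or grep sitting in the middle of a pipeline is in the second situation, which is why its output arrives in chunks there but line by line on a terminal.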

When xxd receives an EOF (like when I killed the inotifyd to make it exit), the whole buffer will be flushed before xxd exits.
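
For the grep case, one possible workaround is sketched below, under a few assumptions. It probes whether the installed grep accepts --line-buffered (the grep from the Alpine grep package does, the busybox applet does not) and otherwise falls back to a plain shell loop. The shell's printf builtin writes its output as each call finishes, so no block buffer sits in between. live_match and live_grep are hypothetical names, and the fallback only does fixed substring matching on its first argument, not regular expressions.

```shell
# Hypothetical fallback: print every input line containing the fixed
# string $1. An empty $1 matches every line, like grep ''.
live_match() {
    while IFS= read -r line; do
        case $line in
            *"$1"*) printf '%s\n' "$line" ;;
        esac
    done
}

# Probe: succeeds only if this grep understands --line-buffered.
if printf 'probe\n' | grep --line-buffered probe >/dev/null 2>&1; then
    # Real grep, flushing after every output line.
    live_grep() { grep --line-buffered "$@"; }
else
    # No such option: substring match in the shell (first argument only).
    live_grep() { live_match "$1"; }
fi
```

With either branch, a pipeline of the form inotifyd - /tmp:ymndceDM | live_grep '' | cat should pass each event along as it arrives. This does not help for xxd, which has no comparable shell-only replacement here.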

programmerq
  • Oh @programmerq, that's a great idea to locate the source of the buffering this way. Thank you!! I confirmed the same for the `grep` command. So is there any solution to change the buffering behavior? Please see my updated post – 0xffaaec Jul 03 '23 at 10:06