I am new to Linux and I am trying to grasp how pipes and buffers work. I read that if we type in the following in the terminal:

command1 | command2 

the buffer will be flushed to stdout once it contains, say, 4K of data. From another Stack Overflow post, "How to make output of any shell command unbuffered?", I found out that one can "turn off" the buffer, or switch the buffering strategy to line buffering, by using a pseudo terminal. How does a pseudo terminal actually work in this case, and why does it make it possible to change the buffering strategy?

Thank you!

newbie
    What "buffers" are you talking about? The standard C `stdout` buffer, which is used by e.g. `printf` and is line buffered by default when writing to a terminal? The lower-level buffer used by the pipe itself? Some other kernel-level buffer? – Some programmer dude Feb 14 '17 at 10:34

3 Answers

1

The reason a pseudo terminal "works" is that the stdio library checks whether its output is going to a terminal in order to decide which buffering strategy to use. The pseudo terminal makes the program think it is talking to a terminal, so stdio picks the terminal strategy (line buffering) instead of the "pipe" strategy (full buffering).

John Hascall
  • Does that mean that when my output goes to the terminal it is always line buffered? – newbie Feb 14 '17 at 10:46
  • Line-buffered is the default in that case, but the program can choose other options -- on UNIXy systems see "man setvbuf" for details. – John Hascall Feb 14 '17 at 11:50
1

There are several "buffers" involved in your simple command.

  1. There can be buffering inside the code of the command itself. For example, if it uses C stdio (`printf`, `fwrite`), output goes through a user-space buffer by default. If it uses system calls such as `write()` directly, there is no such user-space buffering on output.
  2. Pipes provide some buffering, as they implement producer/consumer semantics. Bytes are stored in the pipe until a reader reads them.
  3. The first command of the pipeline may read from the tty and the last may write to the tty, and ttys have a line discipline that may use a buffer.

From your point of view as a user, the only one you can play with is the terminal's line discipline: input can be delivered as soon as possible, or with some kind of cooking or buffering. The `stty` command can be used to control all of this.

Jean-Baptiste Yunès
0

The buffer will be flushed to stdout when it contains let's say 4K of data.

You are probably referring to the notorious PIPE_BUF POSIX requirement. It is not about flushing the pipe buffer once it reaches a certain size.

The PIPE_BUF requirement guarantees that when multiple processes each write at most PIPE_BUF bytes in a single write() to the same pipe, each write is atomic: the reader will not see data from different processes interleaved within it.

Suppose that PIPE_BUF is 4 (in reality POSIX requires it to be at least 512, and it is 4096 on Linux), and two processes write to the same pipe:

ProcessA: write(pipe, "abcd", 4)
ProcessB: write(pipe, "EFG", 3)
ProcessC: read(pipe, buf, 7)

Since each process wrote at most 4 bytes, the receiver gets either abcdEFG or EFGabcd, but never abEFGcd or EFabcdG.

This is often wrongly interpreted as "a single write() of at most PIPE_BUF bytes will be retrieved by a single read() on the other side".

Blagovest Buyukliev