OS X
For OS X, you can look at the source code for head and the source code for tail to figure out some of what's going on. In the case of tail, you'll want to look at forward.c.
So, it turns out that head doesn't do anything special. It just reads its input using the stdio library, so it reads a buffer at a time and might read too much. This means cat file | (head; tail) won't work for small files where head's buffering makes it read some (or all) of the last 10 lines.
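Here is a minimal sketch of the pipe case, assuming a shell where seq, head, and tail are available. The function name pipe_demo is made up for illustration. With only 30 short lines, head will typically swallow the whole stream in its first read, leaving nothing for tail, though the exact result can vary with buffer size and timing.

```shell
# Hypothetical helper: feed a short stream through a pipe to (head; tail).
pipe_demo() {
    # 30 short lines fit easily inside one stdio buffer, so head is likely
    # to consume the lines that tail was supposed to see.
    seq 1 30 | (head; tail)
}

pipe_demo
```

If only the first 10 lines come out, tail found the pipe already drained.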
On the other hand, tail checks the type of its input file. If it's a regular file, tail seeks to the end and reads backwards until it finds enough lines to emit. This is why (head; tail) < file works on any regular file, regardless of size.
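A quick way to check the redirect case is to use a file much larger than any plausible stdio buffer. This is only a sketch: the helper name redirect_demo, the temporary file template, and the 100000-line size are arbitrary choices, not anything taken from the head or tail sources.

```shell
# Hypothetical helper: run (head; tail) against a large regular file.
redirect_demo() {
    tmp=$(mktemp /tmp/headtail.XXXXXX) || return 1
    # 100000 lines is far more than head would buffer in a single read.
    seq 1 100000 > "$tmp"
    # Both commands inherit the same stdin, redirected from the regular file.
    (head; tail) < "$tmp"
    rm -f "$tmp"
}

redirect_demo
```

The output should be the first 10 lines followed by the last 10, because head's read-ahead cannot reach the end of a file this large.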
Linux
You could look at the source for head and tail on Linux too, but it's easier to just use strace, like this:
(strace -o /tmp/head.trace head; strace -o /tmp/tail.trace tail) < file
Take a look at /tmp/head.trace. You'll see that the head command tries to fill a buffer (of 8192 bytes in my test) by reading from standard input (file descriptor 0). Depending on the size of file, it may or may not fill the buffer. Anyway, let's assume that it reads at least 10 lines in that first read. Then, it uses lseek to move the file offset back to the end of the 10th line, essentially “unreading” any extra bytes it read. This works because the file descriptor is open on a normal, seekable file. So (head; tail) < file will work for any seekable file, but the lseek trick can't make cat file | (head; tail) work, because a pipe is not seekable.
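The lseek matters because the commands inside the parentheses inherit the same open file, and so share a single file offset. The sketch below illustrates that sharing with dd standing in for head, since dd with bs=1 reads exactly the bytes requested and no more. The helper name offset_demo and the sample text are made up for illustration.

```shell
# Hypothetical helper: show that commands in a subshell share one file offset.
offset_demo() {
    tmp=$(mktemp /tmp/headtail.XXXXXX) || return 1
    printf 'abcdefghij\n' > "$tmp"
    # dd consumes exactly 4 bytes from stdin; cat then continues from the
    # fifth byte, because both inherit the same open file description.
    (dd bs=1 count=4 2>/dev/null; echo; cat) < "$tmp"
    rm -f "$tmp"
}

offset_demo
```

This prints abcd on one line and efghij on the next. Whatever offset the first command leaves behind is where the second one starts, which is why head rewinding to the end of line 10 is enough for the next reader to see the rest of the file.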
On the other hand, tail does not (in my testing) seek to the end and read backwards, like it does on OS X. At least, it doesn't read all the way back to the beginning of the file.
Here's my test. Create a small, 12-line input file:
yes | head -12 | cat -n > /tmp/file
Then, try (head; tail) < /tmp/file on Linux. I get this with GNU coreutils 5.97:
1 y
2 y
3 y
4 y
5 y
6 y
7 y
8 y
9 y
10 y
11 y
12 y
But on OS X, I get this:
1 y
2 y
3 y
4 y
5 y
6 y
7 y
8 y
9 y
10 y
3 y
4 y
5 y
6 y
7 y
8 y
9 y
10 y
11 y
12 y
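To compare systems quickly, the same test can be wrapped so that it reports only how many lines came out. This is a sketch under a few assumptions: count_demo is a made-up name, seq stands in for the yes | head -12 | cat -n pipeline, and other head or tail implementations may give counts other than the two shown above.

```shell
# Hypothetical helper: repeat the 12-line test and count the output lines.
count_demo() {
    tmp=$(mktemp /tmp/headtail.XXXXXX) || return 1
    seq 1 12 > "$tmp"    # 12 numbered lines, similar to the test file above
    (head; tail) < "$tmp" | wc -l
    rm -f "$tmp"
}

count_demo
```

A count of 12 matches the Linux output above (tail printed only what was left after head's position), and a count of 20 matches the OS X output (tail read back past it).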