I know there are many questions here that talk about write, read system calls and file descriptors in C. I tried searching a bit but I really couldn't find what I'm looking for so...

EOF on new line when reading from STDIN_FILENO?

Let's say I have this code in test.c:

#include <assert.h>
#include <unistd.h>

int main(void){
    char buff[8];
    int res;
    res=read(STDIN_FILENO, buff, 8);
    assert(res!=-1);
    res=write(STDOUT_FILENO, buff, res);
    assert(res!=-1);
    return 0;
}

with header files included and all. Why do I have this behavior when executing the file test?

$ ./test
1
1
$ ./test
11$

In the first execution I use a new line (enter) while in the second execution I use ctrl+d to put an eof. Why does a new line character act like an eof in this case?

I mean read should try to read 8 characters. Shouldn't it have waited in the first execution? What if I wanted to type '1\n2\n3' into the buffer? How can I make my program work?

I've got two ideas but I don't know how to verify them:

  1. Is this behavior related in any way to how processes read from a pipe? The idea of reading only what's in there even if more could be coming? If so: how to make the read call a blocking system call? (I hope I'm not misusing the term).
  2. Could the terminal be putting an EOF character when a new line character is typed? I don't think that's true, but if I type 123456789+enter while executing test, the program stops taking input after the enter key and uses characters 1 to 8, and the shell then executes 9 as a command (which means it gets the 9 and the newline character). I just don't get it.

Why is the program still working if STDIN_FILENO and STDOUT_FILENO are switched by mistake?

So now I have another code, test2.c:

#include <assert.h>
#include <unistd.h>

int main(void){
    char buff[8];
    int res;
    res=read(STDOUT_FILENO, buff, 8);
    assert(res!=-1);
    res=write(STDIN_FILENO, buff, res);
    assert(res!=-1);
    return 0;
}

And I get the same behavior in both cases. Why don't read and write fail with EBADF? Could this be some subtle tty-shell interaction thing I'm not aware of?

Thanks!!

N.B: If you have any suggestions on how to improve this question or the title please let me know.

  • The newline acts like end of line and sends the data to the program(s) reading from the terminal. You get EOF when `read()` returns 0. – Jonathan Leffler Dec 19 '17 at 03:35
  • So why does the read system call return? Why doesn't it wait for an EOF? If I use read to read from a file it doesn't return if it encounters an end of line right? Why is the stdin treated differently? – glamorous_noob Dec 19 '17 at 03:40
  • The terminal driver makes input available when you type newline, or when you type control-D. In neither case are you getting EOF. In the first, there are two characters to read; in the second, one. Read returns what is available up to the limit. It does not wait for the buffer to be filled. – Jonathan Leffler Dec 19 '17 at 03:42
  • As to swapped channels, the classic mechanism for opening terminals is to open it for read-write and make each of the standard I/O descriptors refer to it. This means you can often read either standard output or standard error, and write to standard input. It ain’t guaranteed, but it is commonly an option. – Jonathan Leffler Dec 19 '17 at 03:48
  • And, fundamentally, terminals are treated differently from disk files and pipes because sensible behaviour requires that. Think about it. One part of the terminal is the keyboard; another part is the screen. It’s very different from a disk file. – Jonathan Leffler Dec 19 '17 at 03:50
  • Alright now I get it! Thanks a lot! – glamorous_noob Dec 19 '17 at 03:57

1 Answer


Spinning a bunch of comments typed on an iPhone into the semblance of an answer.

Program 1

The newline acts like end of line and sends the data to the program(s) reading from the terminal. You get EOF when read() returns 0. It only returns -1 if there is an error.

So why does the read system call return? Why doesn't it wait for an EOF? If I use read to read from a file it doesn't return if it encounters an end of line right? Why is the stdin treated differently?

It isn't standard input that's treated differently; it is terminals that are treated differently. (See also Canonical vs non-canonical terminal input.) Normally (in canonical mode), the terminal driver makes input available when you type newline, or when you type Control-D. In neither case are you getting EOF. In the first run, there are two characters to read (the 1 and the newline); in the second, one. read() returns what is available, up to the requested limit. It does not wait for the buffer to be filled.

If you type newline (and the program reads it) and then Control-D, there are no characters waiting to be read, so read() returns 0, indicating EOF. A program can ignore that and try reading again; depending on the file type, it might or might not get another EOF indication immediately. When you type 1 and Control-D, one character is available, so read() returns 1. To get EOF (in a looping program), you'd have to type Control-D a second time, which would make read() return 0, meaning EOF.

Note that even a read from a disk file doesn't wait to fill the buffer if there aren't that many bytes left in the file; short reads are completely normal, but even more common with terminals than with disk files. Pipes, FIFOs, and sockets all have slightly different rules; special devices (like /dev/null, /dev/zero, /dev/random, and so on) have different rules too.

And, fundamentally, terminals are treated differently from disk files and pipes because sensible behaviour requires that. Think about it. One part of the terminal is the keyboard; another part is the screen. It’s very different from a disk file. Also, it would be a nuisance to have to guess how many (and which) characters to type because the terminal input was not available until the buffer could be filled.

Program 2

As to swapped channels, the classic mechanism for opening a terminal is to open it for read-write and make each of the standard I/O descriptors refer to it. If there are no file descriptors open (and there aren't when you get a getty/login process), then the lowest unopened file descriptor will be returned by open() or dup(). Standard code runs along the lines of:

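char *tty_name = …;

int fd = open(tty_name, O_RDWR);  /* no descriptors open yet, so fd is 0 */
dup(fd);                          /* lowest free descriptor: 1 */
dup(fd);                          /* lowest free descriptor: 2 */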

So now file descriptors 0, 1, 2 all refer to the terminal named in tty_name. This means you can often read either standard output or standard error, and write to standard input. It ain’t guaranteed (it won't work if standard input is piped from one program and standard output is piped to another), but it is commonly an option for programs run with the terminal as standard input and standard output (and standard error).

Jonathan Leffler
  • Thanks a lot! I already knew about short reads but I wasn't at all aware of how the terminal treats input and relays it to the program. For the second program, I had found out the bug because of piping, and then I wondered why it did work in the first place! It's all much clearer now thanks to your answer :) – glamorous_noob Dec 19 '17 at 06:36