Why does getchar() recognize EOF only in the beginning of a line?

Question

This example is from the K&R book

#include <stdio.h>

/* count characters in input */
int main(void)
{
    long nc;

    nc = 0;
    while (getchar() != EOF)
        ++nc;
    printf("%ld\n", nc);
    return 0;
}

Could you explain why it works that way? Thanks.

^Z^Z doesn't work either (unless it's at the beginning of a line).

That's a "feature" of the Windows shell. On Unix, you can type EOF at the end of a line by typing Ctrl+D twice; try typing Ctrl+Z twice. (Or redirect input from a file.) — Fred Foo, Jan 21 '13 at 10:27
I know this is completely irrelevant, and sorry for nitpicking, but it would be better not to waste space on images when plain text would suffice; in this situation, SO would nicely syntax-highlight it too :) — legends2k, Jan 21 '13 at 13:21

score 1 · Accepted Answer · answered Jan 21 '13 at 11:00

The traditional UNIX interpretation of the tty EOF character is to make a blocking read return immediately with whatever is buffered in the cooked-mode tty line buffer. At the start of a new line the buffer is empty, so read returns 0 (zero bytes read), and a 0-sized read is exactly how the end-of-file condition is detected on ordinary files.

That's why the first EOF in the middle of a line just forces the beginning of the line to be delivered to the application; the C runtime library does not see an end of file. Two EOF characters in a row produce a 0-sized read, because the second one forces an empty buffer to be returned to the application.

$ cat
foo[press ^D]foo <=== after ^D, input printed back before EOL, despite cooked mode. No EOF detected
foo[press ^D]foo[press ^D] <=== after first ^D, input printed back, and on second ^D, cat detects EOF

$ cat
Some first line<CR> <=== input
Some first line <=== the line is read and printed
[press ^D] <=== at line start, ^D forces 0-sized read to happen, cat detects EOF
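
The zero-length read described above can be sketched in a few lines. This is only an illustration, assuming a POSIX system (read() and <unistd.h> are not part of standard C), and drain_fd is a hypothetical helper name, not something from the question. It loops on read() and treats a return value of 0 as end of file, which is the same condition a ^D on an empty line produces on a cooked tty.

```c
#include <stdio.h>
#include <unistd.h>   /* read(), ssize_t: POSIX, not standard C */

/* Hypothetical helper: reads fd until read() returns 0 (end of file).
   Stores the number of non-empty reads in *chunks.
   Returns the total number of bytes read, or -1 on a read error. */
long drain_fd(int fd, int *chunks)
{
    char buf[256];
    long total = 0;
    ssize_t n;

    *chunks = 0;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        total += n;
        ++*chunks;
        /* Report each chunk so partial-line reads are visible. */
        fprintf(stderr, "read() returned %ld byte(s)\n", (long)n);
    }
    return n < 0 ? -1 : total;
}
```

Called as drain_fd(STDIN_FILENO, &chunks) from a terminal, typing foo followed by ^D should report a 3-byte chunk without ending the loop, while a ^D on an empty line should end it.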

I assume that your C runtime library imitates the semantics described above (there is no special handling of ^Z at the level of kernel32 calls, let alone system calls, on Windows). That's why it would probably detect EOF after ^Z^Z even in the middle of an input line.

No, it wouldn't. ^Z^Z or more doesn't work if there are other characters on the line before it. — Vorgin, Jan 21 '13 at 13:18

score 0 · Answer 2 · answered Jan 21 '13 at 13:25

The program will read EOF only at the actual end of the input. If your terminal/OS/whatever only permits files to end at the start of a line, then that's where you'll find them. I believe this is a throwback to old-fashioned terminals where data was only transmitted a line at a time (for all I know it goes back to punched card readers).

Try reading your data from a file that you've prepared in advance so that it ends mid-line (that is, with no trailing newline). You may even find that some editors make this difficult! Your program should work fine with that as input.
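
One way to try this without a terminal is to move the counting loop into a function that takes a stream. The sketch below is a variation on the K&R program, not the book's own code, and count_chars is a hypothetical name. It uses getc() on an arbitrary FILE * so the same loop can be fed from a file or from stdin.

```c
#include <stdio.h>

/* Hypothetical helper: counts characters on `in` until getc() reports EOF.
   A newline is counted like any other character. */
long count_chars(FILE *in)
{
    long nc = 0;
    int c;   /* int, not char, so EOF stays distinct from every byte value */

    while ((c = getc(in)) != EOF)
        ++nc;
    return nc;
}
```

With input redirected from a file whose last line has no trailing newline, count_chars(stdin) counts every byte and stops at the real end of the file, whether or not that falls mid-line.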

sr01853 · Answer 3 · 2013-01-21T14:05:25.287

EOF indicates "end of file". A newline (which is what happens when you press enter) isn't the end of a file, it's the end of a line, so a newline doesn't terminate this loop.

Depending on the operating system, the EOF character may only work if it's the first character on a line, i.e. the first character after an Enter. Since console input is often line-oriented, the system may also not recognize the EOF character until after you've followed it up with an Enter.

score 0 · Answer 4 · edited Jan 25 '13 at 09:33

I happened to have the same question as you. When I want to end the getchar() loop, I have to enter EOF twice, or press <ENTER> and then enter EOF once.

Here is a simpler explanation I found for this question:

If characters have already been typed on the current line, EOF only ends that round of input: the pending characters are handed to the program and a new round of input begins. If nothing is pending, in other words when getchar() is waiting for fresh input (for example, right after you pressed <ENTER> or entered an EOF), the EOF you enter really means "end of file", and the program stops calling getchar().

PS: the question happens when you are using getchar(). I think this answer is easier to understand, but maybe not for you since it is translated from Chinese...
