1

I don't understand how Ctrl-Z works with getchar(). Please explain the following outputs and the reason for them.

#include <stdio.h>

int main (void) {
    int ch, i = 0;

    while ((ch = getchar()) != EOF)
        i++;

    printf("\n%d", i);
    return 0;
}

Input 1:

my  
^Z 

Output 1:

3  

Input 2:

my^Zmy  
my  
^Z  

Output 2:

6  

Input 3:

my^Zmy  
my^Z  
^Z  

Output 3:

6  
chqrlie

3 Answers

2

Why it doesn't stop at the first Ctrl-Z is explained in one of the answers to "Why doesn't getchar() recognise return as EOF on the console?":

With Windows, the CTRLz can be entered anywhere on the line, but still needs to be followed by a newline.

That accounts for case 1: my plus a linefeed gives 3 characters.

For the other inputs, the Ctrl-Z that ends the input is clearly the last one, which is followed by a newline. It seems that a Ctrl-Z that is not alone on its line discards the characters after it up to the end of the line, which would account for the result of 6 in both cases.

Jean-François Fabre
1

The end of line (EOL) character is also read and counted by getchar(), so it is included in your counts.
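To check which characters actually reach the program, a small tally helper can separate newlines and Ctrl-Z bytes from everything else. This is only a minimal sketch: the names `struct tally` and `tally_stream` are made up for illustration, and it assumes ^Z arrives as the ASCII byte 0x1A.

```c
#include <stdio.h>

/* Hypothetical helper type: what getc() handed back before EOF. */
struct tally {
    long total;     /* every character read, newlines included */
    long newlines;  /* '\n' characters */
    long ctrl_z;    /* 0x1A bytes (Ctrl-Z), assuming ASCII */
};

/* Read `in` until EOF and classify each character. */
struct tally tally_stream(FILE *in) {
    struct tally t = {0, 0, 0};
    int ch;

    while ((ch = getc(in)) != EOF) {
        t.total++;
        if (ch == '\n')
            t.newlines++;
        else if (ch == 0x1A)
            t.ctrl_z++;
    }
    return t;
}
```

Calling `tally_stream(stdin)` from `main` and printing the three fields shows how much of the original count comes from newlines rather than from typed letters.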

Ctrl-Z

Console input is often line buffered (and on Windows it is), meaning that your program will not see anything until the user presses Enter.

Hence, you can type ^Z anywhere, but until you press Enter the text is not sent to your program’s input buffer to be read.

OS-issues

On Linux (and other *nixen) the EOL character is LF ('\n'). But on Windows it is a character sequence: CR LF ("\r\n").

To make the same code work on both *nix and Windows, C opens the console file stream in text mode. For this purpose, text mode differs from binary mode in that CR LF is reported to you as just LF. Hence, your first experiment above reports three characters ('m', 'y', and '\n') instead of four.
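The effect of that translation on a character count can be sketched with a small function that works on a raw byte buffer. This is an illustration only, not how the C runtime is implemented; the name `count_after_crlf_translation` is hypothetical, and leaving a lone CR untouched is an assumption of the sketch.

```c
#include <stddef.h>

/* Count the characters a program would see if every CR LF pair in
   `raw` were reported as a single LF. A lone CR is counted as-is
   (an assumption of this sketch). */
size_t count_after_crlf_translation(const char *raw, size_t len) {
    size_t seen = 0;

    for (size_t i = 0; i < len; i++) {
        if (raw[i] == '\r' && i + 1 < len && raw[i + 1] == '\n')
            continue;   /* skip the CR; the LF is counted on the next pass */
        seen++;
    }
    return seen;
}
```

For the four raw bytes `m`, `y`, CR, LF, the function returns 3, matching the count the program reports for the line `my`.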

Community
Dúthomhas
0

On the Windows command line, the default behavior is that:

  • ^Z only behaves as EOF when it is at the start of a line
  • if the ^Z is elsewhere in the line, then the characters after the ^Z are dropped from the stream (including the newline that causes the buffer to be sent to the application), but no EOF condition is raised. However, the ^Z character itself is still returned by the stream.

Also, the ^Z character isn't acted upon immediately by the console - the line is still buffered and does not get sent to the application until the return key is pressed.

Keep in mind that your program counts newlines, so:

  • in example 1 there's a newline that is part of the 3 result
  • in example 2 the first ^Z is counted, the last half of the first line is not counted, and the newline from (only) the second line is counted
  • in example 3 the first two ^Z characters are counted, no newlines are counted, and the last half of the first line is not counted
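The rules above can be checked against the three examples with a small model. This is a hypothetical simulation of the behavior as described in this answer, not the actual console code; `simulate_console_count` and `CTRL_Z` are names invented for the sketch, and each typed line is passed without its trailing newline.

```c
#include <stddef.h>

#define CTRL_Z 0x1A   /* assumed ASCII value of ^Z */

/* Model of the rules listed above. `lines` holds the typed lines
   without their trailing newline. Returns how many characters
   getchar() would hand back before reporting EOF. */
size_t simulate_console_count(const char *const lines[], size_t nlines) {
    size_t count = 0;

    for (size_t i = 0; i < nlines; i++) {
        const char *p = lines[i];

        if (*p == CTRL_Z)
            return count;   /* ^Z at the start of a line: EOF */

        /* Ordinary characters up to a ^Z or the end of the line. */
        while (*p != '\0' && *p != CTRL_Z) {
            count++;
            p++;
        }

        /* One more character is delivered either way: the ^Z itself
           (the rest of the line and its newline are dropped), or,
           when there is no ^Z, the newline that ended the line. */
        count++;
    }
    return count;   /* ran out of lines without seeing EOF */
}
```

Fed the lines from the question, the model gives 3 for example 1 and 6 for examples 2 and 3, which agrees with the observed output.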

On Unix systems ^D is used as the EOF character on the console, and its behavior is standardized as follows (slightly different from Windows' behavior):

[EOF is a] Special character on input, which is recognized if the ICANON flag is set. When received, all the bytes waiting to be read are immediately passed to the process without waiting for a newline, and the EOF is discarded. Thus, if there are no bytes waiting (that is, the EOF occurred at the beginning of a line), a byte count of zero shall be returned from the read(), representing an end-of-file indication.

The main difference is that on Unix the EOF key doesn't wait for return to be pressed. The other difference is that on Windows if the EOF isn't at the start of the line the ^Z character shows up in the stream.
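On the Unix side, the "byte count of zero" from the quoted text is what a program reading with read() sees as end-of-file. A minimal POSIX-only sketch, with `count_bytes_until_eof` as a made-up name and no retry on interrupted reads:

```c
#include <unistd.h>

/* Count bytes from `fd` until read() returns 0, which is the
   end-of-file indication. Returns -1 if read() reports an error. */
long count_bytes_until_eof(int fd) {
    char buf[256];
    long total = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    return n == 0 ? total : -1;
}
```

Run on a terminal as `count_bytes_until_eof(STDIN_FILENO)`, the loop ends only when ^D is pressed with no bytes waiting, that is, at the start of a line or right after a previous ^D has flushed the pending bytes.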

Michael Burr