
Test

To find out how getline() behaves when confronted with EOF, I wrote the following test:

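#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[]) {
    size_t max = 100;
    char *buf = malloc(sizeof(char) * max);
    ssize_t len = getline(&buf, &max, stdin);  /* ssize_t: -1 on EOF or error */
    if (len != -1)
        printf("length %zd: %s", len, buf);
    free(buf);
    return 0;
}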

And input1 is:

abc Ctrl-D Enter

Result:

 length 4: abc  //notice that '\n' is also taken into consideration and printed

Input2:

abc Enter

Exactly the same output:

 length 4: abc

It seems that the EOF is left out by getline().

Source code

So I looked up the source code of getline(). The relevant snippet follows (I left out some comments and irrelevant code for conciseness):

while ((c = getc (stream)) != EOF)
{
  /* Push the result in the line.  */
  (*lineptr)[indx++] = c;

  /* Bail out.  */
  if (c == delim)             //delim here is '\n'
    break;
}

/* Make room for the null character.  */
if (indx >= *n)
{
  *lineptr = realloc (*lineptr, *n + line_size);
  if (*lineptr == NULL)
   return -1;
  *n += line_size;
}

/* Null terminate the buffer.  */
(*lineptr)[indx++] = 0;

return (c == EOF && (indx - 1) == 0) ? -1 : indx - 1;

Question

So my question is:

  • Why is the length here 4? As far as I can see it should be 5 (as the wiki says, it won't be an EOF if it is not at the beginning of a line).

A similar question: EOF behavior when accompanied by other values. Note, however, that the getline() in that question is different from GNU getline.

I use GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2

Tony
  • As a test, use your program's binary as its own input: `./a.out – wildplasser Aug 23 '14 at 12:29
  • See my answer here: http://stackoverflow.com/a/10766343/905902 for a demonstration of how the terminal handles "special" characters (It is about ^C, not ^D, but the mechanism is similar) – wildplasser Aug 23 '14 at 12:39
  • `len` is 4 because (`a` + `b` + `c` + `\n` = `4`). There is no need to use `ctrl+d`. Simply `a b c [enter]`. – David C. Rankin Aug 23 '14 at 19:55
  • @wildplasser Sorry for being so late. Thanks. Could you explain in detail how you reached that conclusion? I didn't get it. – Tony Sep 01 '14 at 11:51
  • Most of it is covered by @mafso. My key point is: **there is no EOF character**. There (sometimes) is an EOF *condition*, which *sometimes* is translated into an int with value EOF. mafso's reaction covers the cases where this reaction is delayed (by the terminal / the terminal driver / the buffering between the raw input and your program) – wildplasser Sep 01 '14 at 22:08

1 Answer


Ctrl-D causes your terminal to flush the input buffer if it is not already empty. If the buffer is already empty, the end-of-file indicator for the input stream is set instead. A newline also flushes the buffer.

So you didn't close the stream, but only flushed the input buffer, which is why getline doesn't see an end-of-file indicator.

In neither of these cases does getline receive a literal EOT character (ASCII 0x04, ^D). To send one, you can type Ctrl-V Ctrl-D.

Type

abc Ctrl-D Ctrl-D

or

abc Enter Ctrl-D

to actually set the end-of-file indicator.

From POSIX:

Special characters

  • EOF

Special character on input, which is recognized if the ICANON flag is set. When received, all the bytes waiting to be read are immediately passed to the process without waiting for a <newline>, and the EOF is discarded. Thus, if there are no bytes waiting (that is, the EOF occurred at the beginning of a line), a byte count of zero shall be returned from the read(), representing an end-of-file indication. If ICANON is set, the EOF character shall be discarded when processed.

FYI, the ICANON flag is specified here.

mafso
  • Any reference about 'Ctrl-D causes your terminal to flush the input buffer'? Besides, the wikipedia says 'the driver converts a Control-D character at the start of a line into an end-of-file indicator.' Is it wrong? – Tony Aug 23 '14 at 01:52
  • Seems to be accurate. If you run `cat`, type some text without hitting enter, then press ^D, `cat` will immediately echo what you've typed so far. – nobody Aug 23 '14 at 03:46
  • Unfortunately, no. I found [this on Wikipedia](https://en.wikipedia.org/wiki/End-of-transmission_character), although without a reference. Your quote from Wikipedia isn't wrong (and doesn't contradict me), it's just incomplete. It doesn't say what happens if `^D` is pressed not at the beginning of a line. – mafso Aug 23 '14 at 09:54
  • [POSIX](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html): _If a utility using the escaped <newline> convention detects an end-of-file condition immediately after an escaped <newline>, the results are unspecified._ Not that helpful, but `^D` seems to be POSIX-specified. Couldn't find it, though. This question comes up here regularly, and all duplicates I've found don't cite a source either (and most don't mention at all what happens when `^D` is pressed not at the beginning of a line). Would be nice if we found something where this comes from/where this is specified. – mafso Aug 23 '14 at 09:58
  • You cannot `send an EOF`. Your terminal sends an EOD and the terminal-driver interprets that specially and closes the input stream. – wildplasser Aug 23 '14 at 12:33
  • As I said: it is similar, except normally ^C will generate a signal, while ^D will close the stream. In both cases, the control character is intercepted and will not be seen by the program. – wildplasser Aug 23 '14 at 13:10
  • @wildplasser: EOF doesn't necessarily close the stream (see POSIX quote in the post). OP knows that EOF on the beginning of a line closes the stream, the question is what happens when it doesn't (and I think OP knows that not an EOF character is sent to `getline`, but `getline` checks for the end-of-file condition). But I agree that my answer could be improved. I'll edit soon. – mafso Aug 23 '14 at 13:19
  • I think we agree. See my `./a.out – wildplasser Aug 23 '14 at 13:28
  • Sorry for being so late. Good answer. One more question: what is the difference of EOF, EOT, EOD? – Tony Sep 01 '14 at 11:53
  • @Tony: I don't know of EOD ([Wikipedia](https://en.wikipedia.org/wiki/ASCII#ASCII_control_code_chart) doesn't list it as an ASCII character), maybe a typo or a different name for EOT. EOT is part of ASCII (0x04), EOF isn't an ASCII character, but a character in the sense POSIX uses the term (that's what Wildplasser called confusing). A POSIX terminal converts EOT to EOF (if not preceded by ^V). – mafso Sep 01 '14 at 12:13