
Why does the following C code allow an input stream that is greater than the size of the buffer before it finally results in a segmentation fault? With a character array of size 1 as very_bad's argument, wouldn't this only allow 1 character of input?

Additionally, what are the implications of initializing c as an int and not a char? Would the statement while ((c = getchar()) != '\n' && c != EOF) be reading from getchar() in 4 character increments? Or would the binary representation of the characters be in 4 bytes?

Note: this is text from class, but not HW.

#include <stdio.h>

void very_bad(char* buffer) {
  int c;  /* NB: int rather than char because the system I/O function
         getchar() returns EOF, a negative int (typically -1, i.e.
         0xffffffff as a 32-bit value), on end-of-file */
  char* dest = buffer;

  /* read until newline or end-of-file (Control-Z for the standard input) */
  while ((c = getchar()) != '\n' && c != EOF)
    *dest++ = c; /* store next character in buffer and move pointer */

  printf("%s\n", buffer);
}

int main() {
  char buff[1];  /* big trouble waiting to happen */
  very_bad(buff);

  return 0;
}
user1185790
  • A gap area (padding) exists for alignment. – BLUEPIXY Oct 10 '14 at 12:58
  • in one word: UB - http://en.wikipedia.org/wiki/Undefined_behavior – Karoly Horvath Oct 10 '14 at 12:59
  • The rationale for `int` is explained in the comment. `getchar` reads a single byte and returns its unsigned representation as int in the range 0 to 255. It returns the special value `EOF`, which is -1, to indicate that the stream has run out. A char can't store all valid chars plus one extra value, hence `getchar` uses `int`. (In fact, using `char` might make you miss the end of the file.) – M Oehm Oct 10 '14 at 13:55

1 Answer

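For the first part of your question: it has to do with the page size used by the OS. Writing past the end of buff is "undefined behavior", so nothing guarantees a crash at the first out-of-bounds byte. In practice the extra characters land in adjacent stack memory that is still mapped, and a segmentation fault typically shows up only once a write reaches an unmapped page or the corrupted stack is used later (for example, when a function returns).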

This answer explains it really well: Why doesn't my program crash when I write past the end of an array?
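For contrast, here is a minimal sketch of a bounded counterpart to very_bad(). The name read_line_bounded, the size parameter, and the FILE* argument are assumptions made for illustration; none of them come from the class code. It stops storing once the buffer is full and always writes a terminating '\0', which very_bad() never does even though printf("%s") relies on one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded counterpart to very_bad(): reads one line from `in`
   into `buffer`, never writing more than `size` bytes (terminator included).
   Returns the number of characters actually stored. */
size_t read_line_bounded(FILE *in, char *buffer, size_t size) {
  int c;         /* int, so that EOF stays distinct from every character */
  size_t n = 0;

  assert(buffer != NULL && size > 0);  /* need room for at least the '\0' */

  while ((c = getc(in)) != '\n' && c != EOF) {
    if (n + 1 < size)
      buffer[n++] = (char)c;  /* store only while there is still room */
    /* characters beyond the capacity are read and discarded */
  }

  buffer[n] = '\0';  /* always terminate, so printf("%s") is safe */
  return n;
}
```

Called as read_line_bounded(stdin, buff, sizeof buff) with char buff[1], it stores nothing except the terminator, which is the only thing a one-byte buffer can safely hold.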

Additionally, what are the implications of initializing c as an int and not a char? Would the statement while ((c = getchar()) != '\n' && c != EOF) be reading from getchar() in 4 character increments? Or would the binary representation of the characters be in 4 bytes?

The prototype of getchar() is:

int getchar ( void );

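So, when you use c = getchar(), each call to getchar() reads exactly one character from stdin, converts that value from unsigned char to int, and returns it to be assigned to c. It does not read in 4-character increments. The value merely occupies an int (commonly 4 bytes) so that EOF, which is negative, can be told apart from every valid character.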
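A small sketch can make both points concrete. The helper names count_chars and mistaken_for_eof_as_char are hypothetical, and getc(in) is used in place of getchar() (which is equivalent to getc(stdin)) only so the stream can be passed in.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts how many getc() calls succeed before EOF.
   Each successful call consumes exactly one character from the stream,
   regardless of how many bytes an int occupies. */
long count_chars(FILE *in) {
  int c;       /* wide enough for every unsigned char value plus EOF */
  long n = 0;

  assert(in != NULL);
  while ((c = getc(in)) != EOF)
    n++;
  return n;
}

/* Hypothetical helper: would this return value be mistaken for EOF if it
   were first stored in a plain char? The answer is platform-dependent:
   where char is signed, the byte 0xFF typically becomes -1 and matches EOF;
   where char is unsigned, nothing matches, not even EOF itself. */
int mistaken_for_eof_as_char(int value) {
  char narrowed = (char)value;  /* the narrowing that `char c` would cause */
  return narrowed == EOF;
}
```

On a stream holding three bytes, count_chars returns 3, one per call. The second helper shows why the comparison against EOF has to be done on the int: after narrowing to char, either a legitimate byte can look like end-of-file or end-of-file can never be detected.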

askmish