I was trying to understand some basic code and got slightly confused by the following code:

#include <stdio.h>

int main(void)
{
   FILE *fp;
   int c;
   int n = 0;

   fp = fopen("file.txt","r");
   if (fp == NULL)
   {
      perror("Error in opening file");
      return(-1);
   }
   do
   {
      c = fgetc(fp);
      if ( feof(fp) )
      {
         break ;
      }
      printf("%c", c);
   } while(1);

   fclose(fp);
   return(0);
}

Can someone explain to me why c is of type integer, even though it is assigned from fgetc(fp), which, as far as I know, just gets the next character?

Paul Roub
Tony.H
  • 2
    Look again at [`fgetc`](http://en.cppreference.com/w/c/io/fgetc), it returns an `int` (with good reason), so assigning that result to an `int` is also a good idea. – Kninnug Sep 17 '15 at 22:22
  • `c` is of type `int`, not "integer". The word "integer" covers all the integer types, from `char` up to `long long` (and their unsigned variants). – Keith Thompson Sep 17 '15 at 22:46
  • Have a look: http://stackoverflow.com/q/5431941/3185968 – EOF Sep 17 '15 at 22:49

2 Answers

4

Given the precise way this particular code has been written, c wouldn't need to be of type int--it would still work if c were of type char, because the loop detects end of file by calling feof rather than by comparing c to EOF.

Most code that reads characters from a file should (at least initially) read those characters into an int, though. Part of the basic design of C-style I/O is that functions like getc and fgetc can return EOF in addition to any value that could have come from the file. That is to say, any value of type char could be read from the file, so getc, fgetc, etc., signal the end of file by returning a value that can't have come from the file. That value is usually -1, but that's not guaranteed. When char is signed (which it usually is, nowadays), a byte read from the file (typically 0xFF) can show up as -1 once it's been assigned to a char, so if you're depending on the return value to detect EOF, you can mis-detect end of file in the middle of the data.
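To make the mis-detection concrete, here is a minimal sketch. The helper names (make_sample, count_with_int, count_with_signed_char) are made up for illustration, and the buggy variant assumes a typical platform where EOF is -1 and converting 255 to signed char yields -1; neither is guaranteed by the standard.

```c
#include <stdio.h>

/* Hypothetical helper: writes the bytes 'A', 0xFF, 'B' to a temporary
   file and rewinds it. Returns NULL if the file cannot be created. */
static FILE *make_sample(void)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return NULL;
    fputc('A', fp);
    fputc(0xFF, fp);
    fputc('B', fp);
    rewind(fp);
    return fp;
}

/* Correct: the result stays in an int, so EOF remains distinct
   from the byte 0xFF (which fgetc returns as 255). */
static int count_with_int(FILE *fp)
{
    int c;
    int n = 0;
    while ((c = fgetc(fp)) != EOF)
        n++;
    return n;
}

/* Buggy on purpose: the result is narrowed to signed char before the
   comparison, so the byte 0xFF can compare equal to EOF. */
static int count_with_signed_char(FILE *fp)
{
    signed char c;
    int n = 0;
    while ((c = (signed char)fgetc(fp)) != EOF)
        n++;
    return n;
}
```

Called on the sample file, count_with_int sees all three bytes, while count_with_signed_char would be expected to stop at the 0xFF byte under the assumptions above. With an unsigned char variable the opposite failure appears: the comparison with EOF never succeeds and the loop never ends.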

The code you've included in the question simply copies the contents of the file to standard output, one character at a time. Making use of the EOF return value, we can simplify the code a little bit, so the main loop looks like this:

int c;

while (EOF != (c = fgetc(fp)))
    printf("%c", c); // or: putchar(c);

I suppose some might find this excessively terse. Others may object to the assignment inside the condition of the loop. I think those are reasonable and valid arguments, but this version fits the design of the library well enough that I still prefer it.
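As a rough sketch of how that idiom might be packaged, the function below copies one stream to another and then uses ferror to tell a read error apart from a normal end of file. The name copy_stream and the -1 error convention are assumptions made for this example, not part of the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: copies `in` to `out` one character at a time.
   Returns the number of characters copied, or -1 on a read or write error. */
static long copy_stream(FILE *in, FILE *out)
{
    int c;          /* int, so that EOF is distinct from every byte value */
    long n = 0;

    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF)
            return -1;          /* write failed */
        n++;
    }

    /* fgetc returns EOF both at end of file and on error; ferror
       distinguishes the two after the loop has finished. */
    return ferror(in) ? -1 : n;
}
```

For the loop in the question, the call would be copy_stream(fp, stdout).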

Jerry Coffin
  • OP's `c = fgetc(fp); if ( feof(fp) )` well handles the rare situation of `unsigned char` and `int` having the same number of unique values. `while (EOF != (c = fgetc(fp)))` is a common idiom, but not superior to OP's – chux - Reinstate Monica Sep 18 '15 at 01:24
  • @chux: That situation is not merely rare--it's completely unheard of (i.e., a purely theoretical possibility). At the same time, the code in question is significantly more difficult to read (especially for somebody accustomed to C, for whom the idiomatic code is immediately recognizable). In other words, it exchanges a purely theoretical advantage for a completely real disadvantage. – Jerry Coffin Sep 18 '15 at 04:01
  • 1
    Analog Devices 32-bit SHARC DSP. As somebody well accustomed to C, I do not find `c = fgetc(fp); if ( feof(fp) )` significantly more difficult to read. I doubt any recent/new machine will have `INT_MAX-INT_MIN == UCHAR_MAX-UCHAR_MIN`. I also doubt any machine will not use 2's complement integers, yet C still maintains UB on `int` overflow. So I code defensively and watch for overflow. I do not denigrate code like `c = fgetc(fp); if ( feof(fp) )` as needing simplification when it is in fact a reasonable alternative, nor do I object to terse Yoda-like code such as `EOF != (c = fgetc(fp))` – chux - Reinstate Monica Sep 18 '15 at 04:51
  • Out of curiosity, why is it possible to use %c with an `int` in printf? You have to specify a type when pulling arguments out of variadic functions, and I thought that printf expects a `char` upon encountering %c. Wouldn't passing an `int` where it expects a `char` lead to undefined behaviour? Or does it just cast it? – Joshua Perrett May 05 '19 at 15:15
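On the %c question in the last comment: arguments passed through the variable part of a call undergo the default argument promotions, so a char is promoted to int before printf ever sees it, and %c is specified to take an int and convert it to unsigned char. Passing an int is therefore what %c expects. A small sketch, using snprintf so the result can be inspected (format_char is a made-up helper name):

```c
#include <stdio.h>

/* Hypothetical helper: formats one character code with %c into buf.
   Returns what snprintf returns (the number of characters produced). */
static int format_char(char *buf, size_t size, int c)
{
    /* %c takes an int argument; a char passed here would have been
       promoted to int by the default argument promotions anyway. */
    return snprintf(buf, size, "%c", c);
}
```

Calling format_char with a char variable or with an int holding the same character code produces the same one-character string.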
1

The signature of fgetc is:

int fgetc ( FILE * stream );

And about the return value of fgetc:

On success, the character read is returned (promoted to an int value). The return type is int to accommodate for the special value EOF, which indicates failure.

So we declare the variable as an int. A character may be unsigned, in which case it can't take a negative value, but EOF is always negative (very commonly -1) and is used to signal failure or end of file. So it is safe to use an int, like this:

int c;
while ((c = fgetc(fp))!=EOF){
   //Code goes here
}
ashiquzzaman33
  • 1
    Whether `char` is signed or unsigned is implementation-defined (C11§6.2.5/15). `EOF` is always negative (C11§7.21.1/3). – Kninnug Sep 17 '15 at 22:38
  • Yes, so it is safe to use integer. – ashiquzzaman33 Sep 17 '15 at 22:39
  • "As character may be unsigned which can't take negative value." is true, but an `unsigned char` converted to an `int` _can_ have a negative value. Consider `unsigned char` and `int` both using 32-bit. Legal in C, but rare. OP's `c = fgetc(fp); if ( feof(fp) )` is more robust than `int c; while ((c = fgetc(fp))!=EOF){` – chux - Reinstate Monica Sep 18 '15 at 01:21
  • FWIW, I've got a different quote. My manpage says: "`fgetc()`, `getc()` and `getchar()` return the character read as an _unsigned char_ cast to an _int_ or `EOF` on end of file or error" and I use `getchar` in that way. Values from 0 to 255 indicate valid input, the special value `EOF` indicates the end of the file. `EOF` is negative, so it is distinct from valid input. – M Oehm Sep 18 '15 at 07:56
  • 1
    @chux: Your comment is being discussed here: http://stackoverflow.com/q/32646556/908515 – undur_gongor Sep 18 '15 at 08:00
  • 1
    @M Oehm `unsigned char` is not limited to 0 to 255, but to `UCHAR_MAX`, although that is very commonly 255. It is not even limited to 0 to `INT_MAX`. In that rare case, a conversion of `unsigned char` to `int` may result in a negative value equal to `EOF`. Man pages reflect a large segment of C, but are not the C spec. – chux - Reinstate Monica Sep 18 '15 at 14:35
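One way to reconcile the two positions in these comments is to keep the int variable but confirm an EOF return with feof or ferror before stopping. The sketch below is only an illustration of that idea (count_chars_robust is a made-up name). Checking ferror as well matters because a loop that tests only feof would keep running if fgetc failed with a read error.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters in a stream. It stops only
   when fgetc returns EOF and the stream reports end of file or an error,
   so a data value that happens to equal EOF on an unusual platform
   (where unsigned char is as wide as int) is still counted. */
static long count_chars_robust(FILE *fp)
{
    long n = 0;

    for (;;) {
        int c = fgetc(fp);
        if (c == EOF && (feof(fp) || ferror(fp)))
            break;      /* genuine end of file or read error */
        n++;
    }
    return n;
}
```

On common platforms, where `UCHAR_MAX` is no larger than `INT_MAX`, this behaves the same as the plain `!= EOF` loop.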