3

I wrote a program that will read from and output the contents of its own source file. My purpose is primarily just learning how to use I/O streams and the "FILE" type. I wrote the program in a plain text document on Linux Ubuntu 14.04 and used the terminal to compile and run the program. This is the content of the terminal from compilation to finish:

joseph@ubuntu:~/Desktop$ gcc test.c
joseph@ubuntu:~/Desktop$ ./a.out

File Opened

#include<stdio.h>
#define fileLocation ("/home/joseph/Desktop/test.c")
#define MAXREAD 1000

int main(void)
{
    char fileContents[MAXREAD];
    int i;

    FILE *tf;
    tf = fopen(fileLocation, "r");

    printf("File Opened\n");
    for(i=0;fileContents[i] != EOF; i++)
    {
        fileContents[i] = fgetc(tf);
        printf("%c", fileContents[i]);
    }
    fclose(tf);
    printf("\nFile Closed\n");
    return 0;
}
************************************************************

File Closed

The * symbols are actually the Unicode (0+FFFD: Replacement Character) but I can't seem to type that.

My question is, why doesn't it end the program at the final closing curly brace and instead, print a bunch of replacement characters?

  • Is the file size < 1000 bytes? I estimate about 450 bytes unless some interesting encoding was used. – chux - Reinstate Monica Jun 17 '14 at 20:05
  • Close, the file size is 381 bytes. – Magister Ludi Jun 17 '14 at 20:25
  • The `fileContents` buffer is unnecessary here - you get a character, you output a character - no buffer required, and it will be overrun if the file were > 1000 bytes. `char ch = fgetc(tf); printf( "%c", ch ) ;` will suffice. If you do use a buffer, you could read all at once with `fread()`. – Clifford Jun 17 '14 at 20:45
  • @Clifford `fgetc(tf)` returns 257 different values. 256 of them imply a `char` was read and needs printing. One of them, `EOF` implies end-of-file/IO-error. Hence the need to save the results in an `int` and not `char`. – chux - Reinstate Monica Jun 17 '14 at 21:47
  • "why doesn't it end the program at the final closing curly brace and instead, print a bunch of replacement characters?" is only answered by @WhozCraig. – chux - Reinstate Monica Jun 17 '14 at 21:49
  • This has been very helpful, @chux You are right about the best answer, I made the correction. This topic can be closed. – Magister Ludi Jun 17 '14 at 22:03
  • I noticed that you are checking the content of the fileContents[i] byte before it is read into. And the array fileContents[i] is not cleared to some known value, so the check at the top of the for loop is checking the garbage in the fileContents. I.E. the loop will not stop until it picks up a random EOF character in memory. – user3629249 Jun 18 '14 at 01:43
  • @chux : Agreed - my point was not intended to be a solution to the problem described (it is a comment not an answer), but was specifically about the buffer being unnecessary, and overrun being a danger. The code has multiple issues besides. – Clifford Jun 18 '14 at 06:30
  • @MagisterLudi : No, it won't be closed unless it is in appropriate for SO. That is not how SO works. Besides you don't know that some even better answer might not be posted. – Clifford Jun 18 '14 at 06:32

1 Answers1

3

The order in your loop is incorrect. You should be checking for EOF before storing and printing your character value. You should also ensure you're not overstepping the array boundaries.

int main(void)
{
    char fileContents[MAXREAD];
    int i, c;

    FILE *tf = fopen(fileLocation, "r");
    if (tf == NULL)
    {
        perror(fileLocation);
        return EXIT_FAILURE;
    }

    printf("File Opened\n");
    for (i=0; i < MAXREAD && (c = fgetc(tf)) != EOF; ++i)
    {
        fileContents[i] = c;
        fputc(fileContents[i], stdout);
    }
    fclose(tf);
    printf("\nFile Closed\n");
    return 0;
}

Your version of the code includes printing an incorrectly-stored EOF in a char (which is itself another issue, but avoided by not storing it in the first place). But that is far form the end of your woes. Your conditional logic for continuing your for-loop is wrong. In fact, since you never initialize fileContents[], it actually invokes undefined behavior. With each iteration you're checking an array slot you haven't yet written, nor initialized. Read on for how/why.

Why do you keep printing?

The control expression, fileContents[i] != EOF, is evaluated before each loop iteration. The increment expression, i++, executes after each iteration, but before the next evaluation of the control conditional. From the standard:

The statement

for ( clause-1 ; expression-2 ; expression-3 ) statement

behaves as follows: The expression expression-2 is the controlling expression that is evaluated before each execution of the loop body. The expression expression-3 is evaluated as a void expression after each execution of the loop body. If clause-1 is a declaration, the scope of any identifiers it declares is the remainder of the declaration and the entire loop, including the other two expressions; it is reached in the order of execution before the first evaluation of the controlling expression. If clause-1 is an expression, it is evaluated as a void expression before the first evaluation of the controlling expression.

Putting it bluntly, the EOF you just saved in fileContents[i] is never checked, because i is incremented before the next evaluation. That makes sense from the above description. It is the very reason the simple loop:

for (i=0; i<N; ++i)
    dostuff;

exits with i < N being false. Barring unforeseen modification in dostuff, the loop will terminate with i = N.

Again, the eval is done after the increment step, and as such in your case:

for(i=0; fileContents[i] != EOF; i++)

The control expression fileContents[i] != EOF is evaluated before each entry into the loop body. The increment expression happens after the loop body, but before the next evaluation of the control-expression. Within your loop body you store EOF in the slot indexed with the current value of i. Then the body finishes, i is incremented, and only then do you check a slot that you didn't write anything upon (yet). This continues until some point, if you're (un)lucky, you discover anEOF equivalent value at your newly-updated i index. And thus you terminate (but most likely, you've crashed long before then).

WhozCraig
  • 65,258
  • 11
  • 75
  • 141
  • I agree with these issues, but unless the array is overrun (UB), why multiple '*'? – chux - Reinstate Monica Jun 17 '14 at 20:22
  • @chux - the buffer was never the issue - it was always unnecessary, a single char is all that's needed for get-a-char ... print-a-char. – Clifford Jun 17 '14 at 20:49
  • @Clifford "a single char is all that's needed for get-a-char ... ". Say `EOF` is -1. How should code distinguish between reading a `char` from `fgets()` (which returns 257 different values) with all bits set like `(char) 255` and `EOF`? – chux - Reinstate Monica Jun 17 '14 at 21:41
  • @WhozCraig I now see about the 'i index is incremented before fileContents[i] != EOF", leading to trouble. So for OP it was simply random junk in the buffer, or past it, that finally stopped the loop. – chux - Reinstate Monica Jun 17 '14 at 21:44
  • Apparently someone disagrees. A reasonable discussion to refute the above would be welcome, but I won't hold my breath. – WhozCraig Jun 17 '14 at 22:27
  • @chux : We don't disagree, my point was about the buffer being unnecessary, and the danger of overrun being trivial to avoid. In the loop, you can print `c` directly, the assignment to `fileContents[i]` serves no purpose. I appreciate that that was from the original question code and you were addressing only the question raised. The comment was not a criticism of the code, but a point aimed at the OP. – Clifford Jun 18 '14 at 06:34
  • My friend told me that one should always use some kind of buffer just because it is a good habit to get into, even if it is only one character. That is why I used it instead of a direct read/print. – Magister Ludi Jun 18 '14 at 17:55
  • @MagisterLudi buffer reads/writes, barring any other reason than just to buffer, is generally double-implementation. You're using the C standard I/O library, which already buffers for you by-default. There *is* a significant function *call* overhead that can be reduced by N for using [`fread()`](http://en.cppreference.com/w/c/io/fread), for example. with an N-sized buffer. You *may* (but by no means are you *guaranteed*) also reduce the function-call overhead by using `getc()`, which *can* be a macro if implemented as-such by your implementation library, but I caution you against side-effects. – WhozCraig Jun 18 '14 at 18:25