
I was trying to read a file into a string with the following code. I allocated 5 bytes for the char *a but then read a file with more than 5 chars into it. However, the output still prints the correct file contents, without any garbage or missing values.

#include <stdio.h>
#include <stdlib.h>

#define INPUT_SIZE 5

int main() {

        char *a = malloc(INPUT_SIZE);
        FILE *fp = fopen("text", "r");
        if (fp == NULL) {
                perror("Unable to open the file");
        }

        char *b = a;
        char c;
        int i = 0;
        while ((c = fgetc(fp)) != EOF) {
                *b++ = c;
        }

        printf("%s", a);
        free(a);
        fclose(fp);
        return 0;
}

The input file is

abc
def
g

And the output is exactly the same as the input file.
Normally there should be a '\0' at the end of a char * string to show where the string ends, but here I never write an explicit '\0' into a. So I wonder: is there a '\0' at the end of the file that was read as the last char?

Zcy
  • No, you are triggering Undefined Behaviour by accessing out of bounds memory. In C, UB means that the result is unpredictable: the program can appear to "work", crash, give wrong results, or do anything else. – kaylum Apr 28 '20 at 04:44
  • Does this answer your question? [Array index out of bound in C](https://stackoverflow.com/questions/671703/array-index-out-of-bound-in-c) – kaylum Apr 28 '20 at 04:46
  • @kaylum Thank you for your answer, but that post does not really answer my question. I tried changing the size of the `char *` buffer to less than needed, the same as needed, and more than needed. In every case the output was the same as the file, with no garbage and no missing values. I am just curious why the output is always correct. – Zcy Apr 28 '20 at 04:54
  • There are many causes of undefined behaviour. Telling `printf` that `a` is a string when it is not is one of them. So it may appear to "work" today but may not tomorrow, or on a different OS, or with a different compiler, etc. Bottom line - if your code has such bugs you need not try to explain it further as it is undefined. – kaylum Apr 28 '20 at 04:55
  • For your further reading: [Undefined, unspecified and implementation-defined behavior](https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior) – kaylum Apr 28 '20 at 04:58

1 Answer


This is a situation where the results may look correct, but you are simply getting "lucky" with the output of your program.

First, when you call malloc(INPUT_SIZE), your implementation of libc will generally not allocate just 5 bytes, but some multiple of 8 bytes (like 16 or 32, depending on the platform; see "unexpected output of size allocated by malloc in C"). The extra space consists of possible padding bytes after your data, plus metadata before and after your requested block. This is done for alignment and bookkeeping purposes, but the takeaway is that you usually get more than you ask for when you call malloc.
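To see this rounding on your own machine, here is a minimal sketch that assumes glibc, whose <malloc.h> provides the non-standard malloc_usable_size. The helper name usable_size_for is made up for illustration, and other platforms expose this differently or not at all.

```c
#include <malloc.h>   /* malloc_usable_size: glibc extension, not standard C */
#include <stdlib.h>

/* Hypothetical helper: returns how many bytes the allocator actually set
 * aside for a request of `requested` bytes, or 0 if the allocation failed. */
size_t usable_size_for(size_t requested) {
    void *p = malloc(requested);
    if (p == NULL) {
        return 0;
    }
    size_t usable = malloc_usable_size(p);  /* at least `requested` */
    free(p);
    return usable;
}
```

Calling usable_size_for(5) will often report more than 5, but the exact number depends on the allocator and should not be relied on.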

You should not take advantage of this implementation detail to fit more data into a malloced region than you requested, as that space is not really yours for the taking. By writing past the end of your buffer, you risk scribbling on important data that your allocator needs to stay consistent.

Second, the null terminator you are seeing is simply luck: malloc happened to hand you a zeroed-out section of memory. This is not guaranteed, and the next time you run the program your buffer could come back from malloc filled with arbitrary values instead of 0. If you want pre-zeroed memory, use calloc instead.
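As a small sketch of the calloc approach (the helper name copy_terminated is hypothetical): allocating one byte more than the data and letting calloc zero it means the final byte is already a '\0' after the copy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the first `len` bytes of `src` into a new
 * buffer of len + 1 bytes. calloc zero-fills the buffer, so the extra byte
 * at the end acts as the null terminator. Returns NULL on failure;
 * the caller frees the result. */
char *copy_terminated(const char *src, size_t len) {
    char *buf = calloc(len + 1, 1);
    if (buf == NULL) {
        return NULL;
    }
    memcpy(buf, src, len);
    return buf;
}
```

With plain malloc you would instead have to write buf[len] = '\0' yourself after the copy.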

So to answer the question: no, there is no null terminator at the end of a file. Your program is relying on undefined behavior, which just happens to make it look like there is one.
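For comparison, here is one way to read the whole file without relying on luck. It is only a sketch: the function name read_all and the starting capacity of 16 are arbitrary choices, not part of the original program. It grows the buffer as needed, stores the result of fgetc in an int so that EOF can be told apart from real bytes, and writes the terminator explicitly.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from fp into a freshly allocated,
 * null-terminated string. Returns NULL on allocation failure; the caller
 * frees the result. */
char *read_all(FILE *fp) {
    size_t cap = 16;   /* arbitrary starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        return NULL;
    }

    int c;  /* int, not char, so EOF is distinguishable from every byte */
    while ((c = fgetc(fp)) != EOF) {
        if (len + 1 >= cap) {            /* keep one byte free for '\0' */
            size_t new_cap = cap * 2;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';  /* the file does not supply this; we add it */
    return buf;
}
```

In the original main, this would replace the malloc and the while loop: check that fopen succeeded, call read_all(fp), print the result with printf("%s", ...), then free it.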

Brian Tracy