2

I have a simple C file I/O program which demonstrates reading a text file line by line and outputting its contents to the console:

/**
* simple C program demonstrating how
* to read an entire text file
*/

#include <stdio.h>
#include <stdlib.h>

#define FILENAME "ohai.txt"

int main(void)
{
    // open a file for reading
    FILE* fp = fopen(FILENAME, "r");

    // check for successful open
    if(fp == NULL)
    {
        printf("couldn't open %s\n", FILENAME);
        return 1;
    }

    // size of each line
    char output[256];

    // read from the file
    while(fgets(output, sizeof(output), fp) != NULL)
        printf("%s", output);

    // report the error if we didn't reach the end of file
    if(!feof(fp))
    {
        printf("Couldn't read entire file\n");
        fclose(fp);
        return 1;
    }

    // close the file
    fclose(fp);
    return 0;
}

It looks like I've allocated an array with space for 256 characters per line (1024 bytes bits on a 32-bit machine). Even when I fill ohai.txt with more than 1000 characters of text on the first line, the program doesn't segfault, which I assumed it would, since it overflowed the allocated amount of space available to it designated by the output[] array.

My hypothesis is that the operating system will give extra memory to the program while it has extra memory available to give. This would mean the program would only crash when the memory consumed by a line of text in ohai.txt resulted in a stack overflow.

Could someone with more experience with C and memory management support or refute my hypothesis as to why this program doesn't crash, even when the amount of characters in one line of a text file is much larger than 256?

ironicaldiction
  • 2
    Your hypothesis is wrong. You should carefully re-read the documentation of `fgets`, and especially the significance of its second /*parameter*/ argument. This should shed some light ! – Nbr44 Aug 27 '13 at 16:18
  • Will do...and do you mean second _argument_, to be sure? – ironicaldiction Aug 27 '13 at 16:19
  • @ironicaldiction Yeah, second *argument,* definitely. –  Aug 27 '13 at 16:20
  • Yes, argument - I'm unsure which term is the proper one in english – Nbr44 Aug 27 '13 at 16:20
  • If you overwrite memory that you didn't intend, it *might* segfault depending upon which memory it is. It's not a certainty. But it *is* certainly always bad to overwrite the memory you did not intend. :) – lurker Aug 27 '13 at 16:23
  • Thanks, just making sure, and my confusion comes from this in the docs: "The `fgets()` function reads at most one less than the number of characters specified by size from the given stream and stores them in the string str." To me, this seems to imply an upper-bound on the amount it can read. I realize that `gets` has an infinite size, however. – ironicaldiction Aug 27 '13 at 16:24
  • @mbratch, I don't think it's possible to overwrite memory I didn't intend to touch in this case, because [arrays seem to be allocated on the stack in C](http://stackoverflow.com/questions/12874604/c-array-instantiation-stack-or-heap-allocation), which would cause a segfault before I touched any memory erroneously. – ironicaldiction Aug 27 '13 at 16:29
  • @ironicaldiction, it is possible if the memory you didn't intend to write belongs to your application. For example, you might have other variables on the stack which get overwritten. – lurker Aug 27 '13 at 16:30
  • Ah, I see your point. Thanks for the heads up @mbratch. – ironicaldiction Aug 27 '13 at 16:31
  • 1
    Also, sizeof(char) == 1 according to the C standard, and on most systems a char is one byte, not 4. Second, the stack doesn't work the way you suppose it does. In this case you'd have to overflow your buffer by quite a lot before getting a segfault. Basically, you get some number of pages for your stack and you won't get a segfault unless you go beyond that. The program doesn't allocate more pages on every function call, so the entirety of the possible stack size is writeable. Your array is going towards the nearer end, so you could write up to a couple of stack frames before getting a segfault – Spudd86 Aug 27 '13 at 17:16
  • +1 for the clear explanation @Spudd86 – ironicaldiction Aug 27 '13 at 17:19

4 Answers

6

You're not overflowing anything here: fgets won't write more than sizeof(output) characters to the buffer, and therefore will not overflow anything (see the documentation).

However, if you do overflow a buffer, you get undefined behaviour. According to the C spec, the program may do anything: crash, not crash, silently destroy important data, accidentally call rm -rf /, etc. So, don't expect a program to crash if you invoke UB.

nneonneo
  • Can you explain why I'm not overflowing anything, even if I have more than 254 characters on a line of text in the file from which the program reads? I apologize if I'm missing something obvious here. – ironicaldiction Aug 27 '13 at 16:53
  • 254 should be 256 above – ironicaldiction Aug 27 '13 at 17:01
  • 1
    You don't overflow anything **because you told `fgets` not to**. You should look at the `fgets` documentation linked in my answer to see exactly how it works; then you'll see that you cannot possibly overflow the buffer with the way you called it. – nneonneo Aug 27 '13 at 17:40
  • Ah, okay. If I'm not mistaken, `fgets` will read in 256 bytes as many times as it takes to traverse the whole file, because it is contained within a `while` loop. Thus, it continuously chews away at the problem, 256 bytes at a time, until all the char bytes in the file pointed to by `fp` are read (at which point `NULL` is encountered). Thanks a lot @nneonneo – ironicaldiction Aug 27 '13 at 18:51
2

OP's program did not crash because no buffer overflow occurred.

while(fgets(output, sizeof(output), fp) != NULL)
  printf("%s", output);

Each fgets() call nicely read a group of char, up to a count of 255 or a \n, whichever came first. Then printf("%s" ... nicely printed them out. This repeated until there was no more data.

No crash, no overflow, no runs, no hits, no errors.
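To see the chunking for yourself, here is a minimal sketch. The helper name `count_fgets_chunks` is made up for illustration, and it uses a temporary file instead of ohai.txt so it stands alone. It writes one long line, reads it back with the same 256-byte buffer, and counts how many `fgets` calls it took.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the question): writes one line made of
 * `line_len` 'x' characters plus '\n' to a temporary file, reads it back
 * with fgets() into a 256-byte buffer, and returns how many fgets() calls
 * delivered data, or -1 if the temporary file could not be created.
 * The length of the longest chunk seen is stored in *longest. */
int count_fgets_chunks(size_t line_len, size_t *longest)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    for (size_t i = 0; i < line_len; i++)
        fputc('x', fp);
    fputc('\n', fp);
    rewind(fp);

    char output[256];
    int chunks = 0;
    *longest = 0;

    /* Same loop shape as the question: fgets() never stores more than
     * sizeof(output) - 1 characters plus the terminating '\0'. */
    while (fgets(output, sizeof(output), fp) != NULL) {
        size_t len = strlen(output);
        if (len > *longest)
            *longest = len;
        chunks++;
    }

    fclose(fp);
    return chunks;
}
```

Calling `count_fgets_chunks(1000, &longest)` from a `main` should report four chunks with none longer than 255 characters, which is why the 256-byte array is never overrun.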

chux - Reinstate Monica
0

fgets(output, sizeof(output), fp) reads at most (sizeof(output) - 1) characters per call. It stops earlier if it reaches a newline or the end of the file.

  • My problem is with "otherwise it reads till newline or end of file". How does it "decide" to read until newline or end of file instead of only reading `sizeof(output) - 1` maximally? – ironicaldiction Aug 27 '13 at 16:33
  • 1
    @ironicaldiction It has its own internal buffering (there's a pointer to it inside the FILE * object), i.e. it reads data into its own buffer and might actually read past the end of the line, but as it copies into your array it looks for the newline and stops when it finds it, and the FILE * keeps track of how much data is in the buffer for subsequent reads. – Spudd86 Aug 27 '13 at 17:21
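From the caller's side, the only way to tell which stopping condition applied is to look at what ended up in the buffer. A minimal sketch, assuming a made-up helper name `chunk_ends_line`:

```c
#include <string.h>

/* Hypothetical helper (not from the thread): returns 1 if the chunk that
 * fgets() stored in buf ends with '\n', meaning a whole line (or the tail
 * of one) was delivered. Returns 0 if fgets() stopped early because the
 * buffer filled up or the file ended without a trailing newline. */
int chunk_ends_line(const char *buf)
{
    size_t len = strlen(buf);
    return len > 0 && buf[len - 1] == '\n';
}
```

Inside the question's `while` loop, a result of 0 from `chunk_ends_line(output)` would indicate that the rest of the line is still waiting in the stream and the next `fgets` call will pick it up.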
0

An explanation of stacks and why this might not segfault even if you actually did overflow (as others have pointed out, the code as written will not):

Your stack pointer starts at some address, say 0x8000000. Then the runtime calls main and it moves down a bit (there may be other stuff up there, so we don't know how much is on the stack at the start of main), and then main moves the stack pointer some more for all its local variables. So at this point your array has an address that is more than 256 bytes below 0x8000000, and you won't get a segfault unless you run all the way over main's stack frame and the stack frames of whatever other C runtime code called main.

So for the sake of simplicity, assume your array ends up with its base address at 0x7fffd00. That's 768 bytes below 0x8000000, meaning that at a minimum you'd have to write that many bytes (overflowing the 256-byte array by 512) to get a segfault. (Well, you'd probably get a segfault when main returns or when you call feof, because you filled your stack frame with random characters, but we're talking about segfaults inside fgets().) Even that's not guaranteed if something writable is mapped to the page above your stack, though that's unlikely: most OSs avoid doing that, so you'll get a segfault if you overflow far enough.

If the stack runs the other way (i.e., growing upward) you'd have to run over the entirety of the maximum-size stack, which in userspace is usually quite large (the default limit for the main thread on Linux is typically 8 MB; check `ulimit -s`), but x86 stacks grow downward, so that doesn't apply to your case.
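Since the maximum stack size differs between systems, it may be easier to ask the OS than to rely on a quoted default. A minimal sketch for POSIX systems, using `getrlimit` with `RLIMIT_STACK`. The helper name `query_stack_limit` is made up for illustration.

```c
#include <sys/resource.h>

/* Hypothetical helper (not from the thread): stores the current soft
 * stack size limit, in bytes, in *bytes.
 * Returns 0 for a finite limit, 1 if the limit is unlimited (in which
 * case *bytes is left untouched), and -1 if getrlimit() fails. */
int query_stack_limit(unsigned long long *bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;

    *bytes = (unsigned long long)rl.rlim_cur;
    return 0;
}
```

This reports the limit for the calling process only. Threads created later may get a different, usually smaller, stack.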

Spudd86