34

In a compiled program (let's say C or C++, but I guess this question could extend to any non-VM-ish language with a call stack) - very often when you overflow your stack, you get a segmentation fault:

Stack overflow is [a] cause, segmentation fault is the result.

Is this always the case, though? Can a stack overflow result in other kinds of program/OS behavior?

I'm also asking about non-Linux, non-Windows OSes and non-x86 hardware. Of course, if you don't have hardware memory protection or OS support for it (e.g. MS-DOS), then there's no such thing as a segmentation fault; I'm asking about cases where you could get a segmentation fault but something else happens.

Note: Assume that other than the stack overflow, the program is valid and does not try to access arrays beyond their bounds, dereference invalid pointers, etc.

ApproachingDarknessFish
einpoklum
  • 6
    You can jump over the guard page and hit another mapped region. –  Jun 06 '18 at 19:44
  • 4
    It could cause your program to branch to an invalid instruction – cleblanc Jun 06 '18 at 19:44
  • 8
    The program _could_ behave correctly and as expected. – Drew Dormann Jun 06 '18 at 19:45
  • 2
    Is highly OS related. – Stefan Jun 06 '18 at 19:45
  • 6
    https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt – melpomene Jun 06 '18 at 19:46
  • @Ivan: You mean, if your stack frame is larger than a single page? – einpoklum Jun 06 '18 at 19:46
  • 1
    @einpoklum Yes. You can even return back successfully after corrupting some other page. –  Jun 06 '18 at 19:47
  • 7
    The answer, essentially, is "anything." Sure, most (all?) common modern OS's have guard pages, but there's nothing that guarantees one. Once you're accessing random memory, almost anything can happen. You might, in the absence of read-only code pages or execute protection bits, create gibberish code or try to execute non-code memory. You might corrupt a function pointer stored by some other part of the program, causing it to jump somewhere else. And so on... – Linuxios Jun 06 '18 at 19:47
  • 2
    In our OpenMP program, it once caused silent overwriting of data of some other data structures. The program then finished correctly, just with incorrect results, and it was not easy to figure out the cause. (Setting `OMP_STACKSIZE` resolved the problem.) – Daniel Langr Jun 06 '18 at 19:47
  • @Stefan: I understand that could be the case; if it is, an answer would be an example of an OS where this _always_ happens and an OS with which something _else_ can happen. – einpoklum Jun 06 '18 at 19:47
  • 1
    Of course, *visiting* Stack Overflow often leads to *fixing* your segmentation fault... :) – Linuxios Jun 06 '18 at 19:48
  • 1
    @Linuxios: It may result in brain stack overflow though, causing another kind of segmentation fault... :-) – einpoklum Jun 06 '18 at 19:49
  • @einpoklum: yes, I understand. MS-DOS was the first to pop into my mind, but I see you excluded that one ;-) It's a bit tricky since there are a lot of os-es... so... there must be a faulty one somewhere.... Asking if something else *can* happen is like asking if *all* are working properly, which I think will not be the case. I know, it lacks an example. – Stefan Jun 06 '18 at 19:52
  • 3
    Undefined Behaviour is undefined. Anything could happen, all bets are off and you've left the land of guaranteed behaviour for *anything*. – Jesper Juhl Jun 06 '18 at 20:04
  • 2
    Has anyone mentioned [nasal demons](http://www.catb.org/jargon/html/N/nasal-demons.html) already? – Eugene Sh. Jun 06 '18 at 20:05
  • 1
    This classic comes to mind: [Smashing The Stack For Fun And Profit](http://insecure.org/stf/smashstack.html) – Jesper Juhl Jun 06 '18 at 20:08

4 Answers

31

Yes, even on a standard OS (Linux) and standard hardware (x86).

void f(void) {
    char arr[BIG_NUMBER];
    arr[0] = 0; // stack overflow
}

Note that on x86, the stack grows down, so we are assigning to the beginning of the array to trigger the overflow. The usual disclaimers apply... the exact behavior depends on more factors than are discussed in this answer, including the particulars of your C compiler.

If BIG_NUMBER is just barely large enough to overflow, you will run into the stack guard and get a segmentation fault. That's what the stack guard is there for. It can be as small as a single 4 KiB page (the minimum, and the size used prior to Linux 4.12) or it can be larger (1 MiB by default as of Linux 4.12, see mm: large stack guard gap), but it is always some particular, finite size.

If BIG_NUMBER is large enough, the overflow can skip over the stack guard and land on some other piece of memory, potentially memory that is valid. This may result in your program behaving incorrectly but not crashing, which is basically the worst-case scenario: we want our programs to crash when they are incorrect rather than do something unintended.
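The three cases above (fits, hits the guard, skips the guard) can be sketched as a small arithmetic model. This is only an illustration under simplifying assumptions: the stack grows down, the function touches only the lowest byte of its array, and the names `frame_outcome` and `classify_frame` are hypothetical, not part of any OS or compiler API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model: the stack grows down, `remaining` bytes of stack are
   still free, and a guard region of `guard` bytes sits directly below it. */
typedef enum { FITS, HITS_GUARD, SKIPS_GUARD } frame_outcome;

/* Classify where the lowest byte of a `frame`-byte local array would land. */
frame_outcome classify_frame(size_t remaining, size_t guard, size_t frame)
{
    assert(guard > 0);        /* the model assumes a guard region exists */
    if (frame <= remaining)
        return FITS;          /* the whole array is inside the stack */
    if (frame - remaining <= guard)
        return HITS_GUARD;    /* the write faults: segmentation fault */
    return SKIPS_GUARD;       /* the write lands below the guard; it may
                                 hit other mapped memory and not fault */
}
```

With 8 KiB of stack left and a 4 KiB guard, a 10 KiB array would be classified as `HITS_GUARD` and a 20 KiB array as `SKIPS_GUARD`; with a 1 MiB guard, the same 20 KiB array hits the guard instead.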

Dietrich Epp
  • 5
    Out of curiosity: do any of the standard compilers emit a warning on code like this? Some "Stack-allocated local variable ... likely to overflow..."? – Linuxios Jun 06 '18 at 19:52
  • 2
    Wouldn't (isn't) that (be) a big potential exploit? – Stefan Jun 06 '18 at 19:53
  • @melpomene: lol, I upvoted that one a minute ago and put it on my reading list. – Stefan Jun 06 '18 at 19:56
  • @Linuxios: I remember seeing warnings for large stack frame sizes back in the 1990s (maybe with MrC or Metrowerks) but for the life of me I can't get GCC or Clang to emit warnings for the obviously dangerous function written above. – Dietrich Epp Jun 06 '18 at 19:58
  • @DietrichEpp: Weird. It would seem reasonably trivial to sum the sizeofs of a function's stack variables and just emit a warning if it's more than a worst-case guard page size. Maybe there's a missing subtlety here? Or maybe GCC and Clang are just assuming no one would write code like this... – Linuxios Jun 06 '18 at 20:01
  • 7
    @Linuxios MSVC inserts `_chkstk` call in functions whose frame is larger than a single page. Its purpose is to avoid segfaults during stack expansion (Windows specific thing), but afair it also detects stack overflow and raises Structured Exception. This should guarantee that stack overflow always leads to an exception and avoids silent data corruption. –  Jun 06 '18 at 20:01
  • @Ivan: Huh, interesting. After reading bits of the link from melpomene, it looks like there is an option for GCC that does something similar, writing a byte to every page in a stack expansion, ensuring that the guard page is hit and a segfault occurs. – Linuxios Jun 06 '18 at 20:12
  • 3
    Since stack allocation/size is a linker/loader responsibility, a compiler cannot easily/reliably emit warnings/errors. – Martin James Jun 06 '18 at 20:21
  • @einpoklum: I am unhappy with the choice of emphasis in your edit. – Dietrich Epp Jun 06 '18 at 21:59
  • @DietrichEpp: Fair enough, it's your answer. – einpoklum Jun 06 '18 at 22:10
  • I don't understand, why is this a stack overflow? – BlueMoon93 Jun 14 '18 at 10:44
  • @BlueMoon93: Local variables are stored in a part of memory called the stack. If you put too much data there, you get a stack overflow. – Dietrich Epp Jun 14 '18 at 17:38
6

One thing is what happens at runtime when you overflow the stack, which can be many things, including but not limited to: a segmentation fault, overwriting the variables that follow whatever you overflow, executing an illegal instruction, or nothing at all. The "old" classic paper Smashing The Stack For Fun And Profit describes a lot of ways one can have "fun" with this stuff.

Another thing is what can happen at compile time. In both C and C++, writing beyond an array or exceeding the size of the stack is Undefined Behaviour, and when a program contains UB anywhere, the compiler is basically free to do whatever it wants to any part of your program. Modern compilers are becoming very aggressive in exploiting UB for optimization purposes, often by assuming that UB never happens. That leads them to simply remove the code containing UB, or to treat a branch as always or never taken because the alternative would cause UB. Sometimes the compiler will introduce "time travel" (UB affecting code that runs before the offending operation) or call a function that was never called in the source code, among many other things that can cause really confusing run-time behaviour.
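As a hedged illustration of "assuming that UB never happens", here is a commonly cited case that uses signed integer overflow rather than stack overflow. The function names are hypothetical, and whether a given compiler actually folds the first function to a constant depends on the compiler and its optimization level.

```c
#include <limits.h>

/* Relies on signed overflow, which is undefined behaviour. A compiler may
   assume that x + 1 never wraps and reduce this function to "return 0". */
int next_would_wrap_ub(int x)
{
    return x + 1 < x;
}

/* Well-defined alternative: compare against the limit before adding. */
int next_would_wrap(int x)
{
    return x == INT_MAX;
}
```

Only the second function has a defined result for every input; calling the first with `INT_MAX` is exactly the kind of code an optimizer is allowed to treat as unreachable.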

See also:

What Every C Programmer Should Know About Undefined Behavior #1/3

What Every C Programmer Should Know About Undefined Behavior #2/3

What Every C Programmer Should Know About Undefined Behavior #3/3

A Guide to Undefined Behavior in C and C++, Part 1

A Guide to Undefined Behavior in C and C++, Part 2

A Guide to Undefined Behavior in C and C++, Part 3

Jesper Juhl
  • I wasn't talking about overstepping array bounds, just perfectly valid program behavior except for exceeding the space allocated for the stack. Let me clarify that in the question. – einpoklum Jun 06 '18 at 20:47
  • @einpoklum exceeding the size of the stack is *not* valid program behaviour and would equally result in UB. – Jesper Juhl Jun 06 '18 at 20:50
  • 2
    One of the ways (and the most common one) to overflow a stack *is* to access a stack allocated array out of its bounds. – Eugene Sh. Jun 06 '18 at 20:52
  • @JesperJuhl: When you write your program, you don't know what the size of the stack is going to be. – einpoklum Jun 06 '18 at 20:56
  • @einpoklum Just because you don't know what the (available) size will be does not make it legal to exceed it. And what happens if you do is undefined. – Jesper Juhl Jun 06 '18 at 20:58
  • 2
    @einpoklum You mean the space allocated for stack, or the usage? The former one you can know for sure. The latter can be bounded or estimated. – Eugene Sh. Jun 06 '18 at 20:58
4

Other answers have covered the PC side fairly well. I'll touch on some of the issues in the embedded world.

Embedded code does have something similar to a segfault. Code is stored in some kind of non-volatile storage (usually flash these days, but some kind of ROM or PROM in the past). Writing to this needs special operations to set it up; normal memory accesses can read from it but not write to it. In addition, embedded processors usually have large gaps in their memory maps. If the processor gets a write request for memory which is read-only, or if it gets a read or write request for an address which does not physically exist, the processor will usually throw a hardware exception. If you have a debugger connected, you can check the state of the system to find what went wrong, as with a core dump.

There's no guarantee that this will happen for a stack overflow though. The stack can be placed anywhere in RAM, and this will usually be alongside other variables. The result of stack overflow will usually be to corrupt those variables.

If your application also uses heap (dynamic allocation), then it is common to assign a single section of memory that the two share: the stack begins at one end of that section and the heap begins at the other, and they grow toward each other. Clearly this means dynamically-allocated data will be the first casualty.

If you're unlucky, you may not even notice when it happens, and then you need to work out why your code is not behaving correctly. In the most ironic case, if the data being overwritten is a pointer, then you may still get a hardware exception when that pointer is later used to access invalid memory. But this will be some time after the stack overflow, and the natural assumption will usually be that it's a bug in the code that uses the pointer.

Embedded code has a common pattern to deal with this, which is to "watermark" the stack by initialising every byte to a known value. Sometimes the compiler can do this; sometimes you need to implement it yourself in the startup code before main(). You can look back from the end of the stack to find where it is no longer set to this value, at which point you know the high-water mark for stack usage; if none of the watermark remains, then you know you have an overflow. It is common (and good practice) for embedded applications to poll this continuously as a background operation, and to be able to report it for diagnostic purposes.
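A minimal sketch of the watermark technique follows, run against an ordinary byte array rather than a real stack. It assumes a stack that grows down from `region + size` toward `region`; the fill value `STACK_FILL` and the function names are hypothetical, and real startup code would get the region bounds from the linker script and would not paint the part of the stack already in use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xA5u  /* hypothetical watermark byte */

/* Paint the whole stack region with the watermark (done once at startup). */
void stack_paint(uint8_t *region, size_t size)
{
    memset(region, STACK_FILL, size);
}

/* Return the high-water mark in bytes for a stack that grows down from
   region + size toward region. A result equal to size means no watermark
   is left, so an overflow must be suspected. */
size_t stack_high_water(const uint8_t *region, size_t size)
{
    size_t untouched = 0;

    /* Scan from the far end of the stack until the watermark stops. */
    while (untouched < size && region[untouched] == STACK_FILL)
        untouched++;
    return size - untouched;
}
```

One limitation of this sketch: if the program happens to store the fill value itself at the deepest point it reaches, the scan slightly underestimates the usage, which is one more reason to keep some margin.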

Having made it possible to track stack usage, most companies will set an acceptable worst-case usage limit to avoid overflows. This is typically somewhere from 75% to 90% of the stack, so there will always be some spare. Not only does this allow for the possibility that there is a worse worst case which you have not yet seen, but it also makes life easier for future development when new code needs to be added which uses more stack.
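The limit check itself is simple arithmetic. The sketch below assumes the high-water mark and the stack size are both known in bytes; `stack_within_limit` is a hypothetical name, and the percentage is whatever figure the project has chosen.

```c
#include <stddef.h>

/* Return 1 if `used` bytes out of a `size`-byte stack stay within
   `limit_percent` percent, and 0 otherwise. */
int stack_within_limit(size_t used, size_t size, unsigned limit_percent)
{
    /* Compare used/size <= limit/100 without floating point; the wider
       type keeps the multiplications from overflowing on small targets. */
    return (unsigned long long)used * 100u
        <= (unsigned long long)size * limit_percent;
}
```

For a 1024-byte stack with a 75% limit, a high-water mark of 768 bytes passes and 769 bytes fails, which a background task could report as a diagnostic.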

Graham
2

A stack overflow is one of the many causes of undefined behavior in a program. In this case you can get the expected result, or a segmentation fault, or your hard disk could be erased, etc. Do not expect any defined behaviour, because it's undefined behaviour.

haccks
  • On Windows with MSVC it is very well defined. –  Jun 06 '18 at 22:56
  • MSVC does not follow c standard. – haccks Jun 06 '18 at 23:03
  • OP wasn't asking about standard C. Also standard C is meaningless because programs are not written in "standard C". –  Jun 06 '18 at 23:41
  • @Ivan; Can you provide some quote where it is well defined in MSVC? – haccks Jun 07 '18 at 11:34
  • @Ivan I don't have any link on me, but MSVC emits `_chkstk` call for functions with large frame. With guard page and emergency guard page this makes it possible to handle stack overflow. Unlike other platforms there is a guarantee that you will get structured exception, not random data corruption. –  Jun 07 '18 at 11:38