If you have an MMU, you can get well-defined, safe behaviour: a stack overflow causes an invalid page fault (a segmentation fault on POSIX systems). This takes some help from the OS, or manually mapping an inaccessible guard page (no read or write permission) just below the stack growth limit. Just make sure you touch each page of stack space as you grow the stack (or make one probe per 64kiB if you reserve more guard space). On a POSIX OS you can catch SIGSEGV if you want, as long as the handler runs on an alternate stack set up with sigaltstack, because the overflowed stack has no room left for it. Other OSes may have different mechanisms.
GCC's -fstack-check does this fairly cheaply, in combination with the OS having a "guard region" of unmapped pages below the stack mapping. (Or more specifically, below the max growth limit for the stack, so the stack can still grow, but not past that guard region.)
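As a rough illustration of the guard-page idea, here is a minimal sketch that maps an ordinary region with one inaccessible page below it and shows that a write into that page raises SIGSEGV. It assumes a Linux-like POSIX system (MAP_ANONYMOUS is available, and touching a PROT_NONE page delivers SIGSEGV). The names map_region_with_guard, write_faults and on_segv are made up for this example. The fault here happens in a separate mapping rather than on the real stack, which is why the handler can run without sigaltstack.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf guard_hit_env;   /* where to resume after a fault */

/* SIGSEGV handler: jump back to the sigsetjmp() in write_faults(). */
static void on_segv(int sig) {
    (void)sig;
    siglongjmp(guard_hit_env, 1);
}

/* Hypothetical helper: map `usable` bytes of read/write memory with one
   inaccessible guard page directly below it. Returns the lowest usable
   address, or NULL on failure. */
static char *map_region_with_guard(size_t usable) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    size_t total = usable + (size_t)page;
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The lowest page becomes the guard: any access to it faults. */
    if (mprotect(base, (size_t)page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return base + page;
}

/* Hypothetical helper: returns 1 if writing one byte to `addr` raises
   SIGSEGV, 0 if the write succeeds, -1 if the handler can't be installed. */
static int write_faults(volatile char *addr) {
    struct sigaction sa, old;
    volatile int faulted = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(guard_hit_env, 1) == 0)
        *addr = 0;      /* faults if addr lies inside the guard page */
    else
        faulted = 1;    /* reached via siglongjmp from on_segv */

    sigaction(SIGSEGV, &old, NULL);   /* restore the previous handler */
    return faulted;
}
```

With a region r returned by map_region_with_guard, write_faults(r) should report no fault, while write_faults(r - 1) lands in the guard page and should report one.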
A 1MiB guard region (current Linux default) is normally enough that you don't even need stack probes to prevent stack clash bugs, where the stack overlaps with a dynamic allocation below the stack. But a buggy / vulnerable program that uses unchecked user input as the size of an alloca or C99 VLA could skip all the way over the guard region.
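One common way to avoid that kind of jump is to never let an untrusted size reach the stack at all. The sketch below is only an illustration of that pattern, not something from the linked discussion: MAX_STACK_BUF is an arbitrary cap chosen for the example, and sum_via_scratch is a hypothetical function that uses a fixed-size stack buffer for small inputs and falls back to malloc for anything larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { MAX_STACK_BUF = 1024 };   /* arbitrary cap, far below any guard size */

/* Hypothetical example: copy `n` bytes into scratch space and sum them.
   Small inputs use a fixed stack buffer; large ones go to the heap, so the
   caller-controlled `n` can never move the stack pointer by more than
   MAX_STACK_BUF. Sets *used_heap to 1 if malloc was used. Returns -1 on
   allocation failure. */
static long sum_via_scratch(const unsigned char *src, size_t n, int *used_heap) {
    unsigned char small[MAX_STACK_BUF];
    unsigned char *scratch = small;

    assert(src != NULL && used_heap != NULL);
    *used_heap = 0;

    if (n > sizeof small) {
        scratch = malloc(n);
        if (scratch == NULL)
            return -1;
        *used_heap = 1;
    }

    memcpy(scratch, src, n);

    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += scratch[i];

    if (*used_heap)
        free(scratch);
    return sum;
}
```

The point of the design is that the stack frame has a compile-time-constant size, so it stays within what the guard region (and any compiler-inserted probes) can cover.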
And Windows always requires "stack probes" (touching memory in every 4kiB page for large or variable-size stack growth, just like gcc -fstack-check does). Windows requires this to even trigger stack growth at all; it won't grow your stack if you touch memory multiple pages below the last used stack page.
The Q&A "Linux process stack overrun by local variables (stack guarding)" has more details.
Stack probes are an essentially foolproof way to make sure your program segfaults by touching an unmapped page (which won't trigger stack growth) before it does anything dangerous. This can work on any OS and any ISA with an MMU.
The total runtime cost is just a loop on function entry (and on every alloca or scope that includes VLAs) that touches memory with a 4kiB stride until it covers the distance of stack growth. If that size is known at compile time, it can be fully unrolled / peeled to just one or a couple instructions.
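To make the shape of that loop concrete, here is a small model of it in C. It is a sketch, not what a compiler actually emits: real probes are generated as machine code relative to the stack pointer, whereas the hypothetical probe_down function below walks an ordinary buffer so it can be run safely. PROBE_STRIDE is assumed to be the 4kiB page size mentioned above.

```c
#include <assert.h>
#include <stddef.h>

enum { PROBE_STRIDE = 4096 };   /* assumed page size */

/* Model of a stack probe loop: the "stack" grows downward from `top` by
   `size` bytes. Touch one byte per PROBE_STRIDE on the way down, plus the
   final new lowest address, so no page in the range is skipped. Each touch
   is a read-modify-write that leaves the byte unchanged.
   Returns the number of probes performed. */
static size_t probe_down(volatile char *top, size_t size) {
    size_t probes = 0;

    assert(top != NULL);

    for (size_t offset = PROBE_STRIDE; offset < size; offset += PROBE_STRIDE) {
        *(top - offset) |= 0;   /* load and store, value unchanged */
        probes++;
    }
    if (size > 0) {
        *(top - size) |= 0;     /* final probe at the new lowest address */
        probes++;
    }
    return probes;
}
```

For a growth of 10000 bytes this makes three probes (at offsets 4096, 8192 and 10000). When the size is a compile-time constant, the loop has a known trip count, which is what lets a compiler unroll it into a handful of instructions.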
Or in most functions that only have a few locals, not including any huge or variable-size array, there's no overhead at all. Making another function call involves writing to stack memory to save a return address, either as part of x86 call, or on function entry for RISC ISAs that pass a return address in a link register. So even a whole chain of functions that allocate small to medium arrays and don't touch them can't sneak the stack pointer past the guard page. Saving/restoring the return address to/from the stack is effectively a probe.