As far as I can tell, there is no fundamental reason why StackOverflowException should be uncatchable. And yet it is.
A thread's stack has a maximum size which defaults to 1MB or 4MB depending on bitness, but this space is merely reserved. It's not committed (thus using only virtual address space) until something actually requires that much stack.
As far as I know, the basic reason it's uncatchable is that by then you've run out of stack and thus can't really execute anything other than code very carefully designed for the out-of-stack condition. But this suggests an obvious solution: throw StackOverflowException before we have run out of all the stack space!
Why isn't this done? The only remotely sensible reason I can think of is that it was decided that the extra virtual address space that this would consume (not real memory usage, mind you!) is not worth the benefit of making this exception catchable.
Are there other concerns I haven't considered?
I feel I have to ask, because most answers addressing this on SO seem to imply that it's impossible to get this to work reasonably well, and that this is why .NET makes the exception uncatchable.
The exact implementation details for the catchable variant seem unimportant, but here's just one idea. First, double the default reserved stack size, but set the exception to trigger at 1MB usage. So far this behaves exactly like the existing approach. But when the threshold is reached, instead of throwing an uncatchable exception, throw a catchable one while increasing the threshold to 1.5MB. If the exception is caught and we blow through the limit again, set it to 1.75MB. Then 1.875MB. Etc. Each nested and handled exception gets an ever-decreasing amount of extra stack space in which to be handled, until we get close enough to 2MB that an uncatchable variant has to be thrown.
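To make the arithmetic concrete, here is a minimal Python sketch of that threshold schedule. It is only a model of the idea, not anything the CLR does. The 2MB reservation and the 1MB first threshold come from the description above; the cutoff `MIN_HEADROOM_MB`, below which the exception would have to become uncatchable, is a hypothetical value picked purely for illustration.

```python
RESERVED_MB = 2.0            # doubled reservation, as proposed above
FIRST_THRESHOLD_MB = 1.0     # first catchable overflow fires here
MIN_HEADROOM_MB = 1 / 64     # hypothetical: less headroom than this means "uncatchable"


def threshold_schedule(reserved=RESERVED_MB,
                       first=FIRST_THRESHOLD_MB,
                       min_headroom=MIN_HEADROOM_MB):
    """Return the thresholds at which a catchable overflow would still be thrown.

    After each overflow the threshold moves halfway toward the end of the
    reservation, so each nested handler gets half the headroom of the last one.
    """
    thresholds = []
    threshold = first
    while reserved - threshold > min_headroom:
        thresholds.append(threshold)
        threshold += (reserved - threshold) / 2
    return thresholds


schedule = threshold_schedule()
for level, mb in enumerate(schedule, start=1):
    print(f"overflow {level}: threshold {mb} MB, headroom {RESERVED_MB - mb} MB")
```

With this particular cutoff the model allows six nested catchable overflows before the next one would have to be fatal; a different cutoff changes that count but not the halving pattern.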
To decrease the threshold back after a successful handling of the StackOverflowException, let's mark the memory page at half the stack size that we just blew through (so 0.5MB in the first instance) to fault on write. The fault handler will be triggered pretty much exclusively when we're back down to that level of stack usage, so it's not expensive. The handler will check the actual stack size and drop the threshold back if appropriate.
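The raise-and-lower bookkeeping could be modelled roughly like this in Python. This is a sketch under assumptions, not a description of any real runtime: the class `StackGuard` and its method names are hypothetical, stack usage is tracked as a plain number of megabytes, and I'm assuming that "drop the threshold back" means undoing one raise per reset mark that the stack has fallen below.

```python
class StackGuard:
    """Toy model of the proposed scheme; sizes are in MB of stack used."""

    def __init__(self, reserved=2.0, first_threshold=1.0):
        self.reserved = reserved
        self.thresholds = [first_threshold]  # last entry is the active threshold
        self.reset_marks = []                # usage levels whose page faults on write

    @property
    def threshold(self):
        return self.thresholds[-1]

    def on_overflow(self):
        """Usage crossed the active threshold: throw catchable, raise the limit.

        Also arms a reset mark at half of the threshold that was just crossed
        (0.5 MB the first time, as in the description above).
        """
        crossed = self.threshold
        self.reset_marks.append(crossed / 2)
        self.thresholds.append(crossed + (self.reserved - crossed) / 2)
        return crossed

    def on_reset_fault(self, usage):
        """Write fault on a marked page: lower the limit if usage allows it.

        Assumption: each mark that usage has dropped to or below undoes one raise.
        """
        while self.reset_marks and usage <= self.reset_marks[-1]:
            self.reset_marks.pop()
            self.thresholds.pop()
        return self.threshold


guard = StackGuard()
guard.on_overflow()                 # crossed 1.0 MB, limit is now 1.5 MB
guard.on_overflow()                 # crossed 1.5 MB, limit is now 1.75 MB
print(guard.threshold)
print(guard.on_reset_fault(0.7))    # below the 0.75 MB mark: back to 1.5 MB
print(guard.on_reset_fault(0.4))    # below the 0.5 MB mark: back to 1.0 MB
```

The point of the model is only that the fault handler does a constant amount of work per mark and runs only when the stack has actually unwound, which is why the reset should be cheap.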