11

I'm architecting a small software engine and I'd like to make extensive use of the stack for rapid iterations over large sets of numbers. But then it occurred to me that this might be a bad idea, since the stack isn't as large a memory store as the heap. Still, I am attracted to the stack's speed and to not needing dynamic-allocation coding practices.

Is there a way to find out how far I can push the stack on a given platform? I am looking mainly at mobile devices but the issue could come up on any platform.

johnbakers
  • Not a duplicate, but have a look at http://stackoverflow.com/q/1756285/1729885 – Niels Keurentjes Jun 22 '13 at 12:54
  • So you want to dynamically (i.e. at runtime) determine the stack size? – ComFreek Jun 22 '13 at 12:54
  • i don't think there is any general solution. and stack size can vary between different threads – Bryan Chen Jun 22 '13 at 12:54
  • Either dynamically or just in general, any way to know how far a stack can be pushed before overflow. Not necessarily at runtime, but if there are tools to somehow arrive at this information. Of course runtime is fine too. – johnbakers Jun 22 '13 at 12:56
  • If fast allocation is a requirement, and you have a stack-like allocation pattern, and you are on Linux, you can use glibc's obstacks: http://www.gnu.org/software/libc/manual/html_node/Obstacks.html – Alexandre C. Jun 22 '13 at 12:59
  • Note that you can often turn a recursive algorithm into an iterative one by keeping a (heap-allocated) stack on the side. In C++, this heap-stack can be either a `stack` or a `deque` (depending on whether you want LIFO or FIFO) with no manual bookkeeping, and the overhead is usually fairly low; on a stack-constrained system I would recommend it. – Matthieu M. Jun 22 '13 at 13:37
  • Note that most compilers let you define the size of the stack when you compile your code. So if you have an upper bound on your stack usage you could tell the compiler to use "your_upper_bound + some reasonable value for what else will need to go on the stack (probably just the original default stack size)". – Jesper Juhl Aug 05 '18 at 13:59

5 Answers

7

On *nix, use getrlimit:

   RLIMIT_STACK
          The maximum size of the process stack, in bytes.  Upon
          reaching this limit, a SIGSEGV signal is generated.  To handle
          this signal, a process must employ an alternate signal stack
          (sigaltstack(2)).
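
As a rough sketch of the `getrlimit` route (POSIX only; the helper name `soft_stack_limit_bytes` is made up for illustration), something like this reads the soft limit for the process. Keep in mind that this limit governs the main thread's stack; other threads get whatever size they were created with.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_STACK, RLIM_INFINITY (POSIX)
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of any library.
// Writes the soft stack limit in bytes to `out_bytes` and returns true.
// Returns false if the call fails or the limit is "unlimited".
bool soft_stack_limit_bytes(std::uint64_t& out_bytes) {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_STACK, &lim) != 0) {
        return false;  // query failed; errno holds the reason
    }
    if (lim.rlim_cur == RLIM_INFINITY) {
        return false;  // no finite soft limit is configured
    }
    out_bytes = static_cast<std::uint64_t>(lim.rlim_cur);
    return true;
}
```

The value is only an upper bound: part of the stack is already in use by the time you ask, so leave a generous margin if you size anything from it.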

On Windows, use VirtualQuery:

For the first call, pass it the address of any value on the stack to get the base address and size, in bytes, of the committed stack space. On an x86 machine where the stack grows downwards, subtract the size from the base address and VirtualQuery again: this will give you the size of the space reserved for the stack (assuming you're not precisely on the limit of stack size at the time). Summing the two naturally gives you the total stack size.

There is no platform-independent method, since stack size is logically left to the implementation and the host system: an embedded mini-SoC has fewer resources to distribute than a server with 128 GB of RAM. You can, however, influence the stack size of a specific thread on all of these operating systems through API-specific calls.

Niels Keurentjes
6

A possible portable solution is to write an allocator yourself.
You do not have to make use of the process stack, just simulate it in the heap.
Allocate a large amount of memory in the beginning, and write a stack allocator on top of it to use it while allocating.
Google 'Allocator Requirements' for information on how to achieve it in C++.

I'm not sure if the term 'Stack Allocator' is canonical, but I mean that you have to put stack-like restrictions on where allocation and deallocation can happen.
Since you said that your algorithm is suited to this pattern, I think it'd be easy.
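
A minimal sketch of what I mean, assuming plain LIFO usage. The class `StackArena` and its member names are invented here, and this is not a standard-conforming C++ Allocator, just the bump-pointer idea underneath one:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical LIFO arena: one heap block obtained up front, bump-pointer
// allocation, and release by rewinding to a previously saved marker.
class StackArena {
public:
    explicit StackArena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity) {}

    // Returns nullptr when the arena is exhausted, instead of overflowing.
    // `align` must be a power of two.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const auto base = reinterpret_cast<std::uintptr_t>(buffer_.get());
        // Round the current top up to the requested alignment.
        const std::uintptr_t aligned =
            (base + top_ + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t start = static_cast<std::size_t>(aligned - base);
        if (start > capacity_ || size > capacity_ - start) {
            return nullptr;
        }
        top_ = start + size;
        return buffer_.get() + start;
    }

    // Save the current top so everything allocated later can be freed at once.
    std::size_t mark() const { return top_; }

    // Frees everything allocated since `marker` was taken (LIFO discipline).
    void rewind(std::size_t marker) {
        assert(marker <= top_);
        top_ = marker;
    }

    std::size_t used() const { return top_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t top_ = 0;
};
```

Typical use would be to take a `mark()` at the start of an iteration, `allocate()` scratch buffers as needed, and `rewind()` to the mark at the end. Objects with non-trivial destructors would need to be destroyed by hand before rewinding.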

manasij7479
2

In standard C++, definitely not. In a portable way, probably not. In a particular OS, sometimes. If nothing else, you could open your own executable file and inspect its headers to see its stack size. [The next problem is of course "how much of the stack was used before this bit of code", which can be difficult to determine.]

If you run the code in a separate thread, many of the (low-level) thread interfaces allow you to specify a stack (or stack size), e.g. POSIX threads' pthread_attr_setstacksize or Microsoft's _beginthread. Again, you don't know EXACTLY how much space has been used up before it gets to the actual thread code, but it's probably not a huge amount.
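
For the POSIX case, a sketch along these lines should work. The names `run_with_stack_size` and `fill_and_sum` are invented for the example, and the sizes are arbitrary:

```cpp
#include <pthread.h>  // pthread_attr_*, pthread_create, pthread_join (POSIX)
#include <cassert>
#include <cstddef>

// Hypothetical worker: uses about 1 MiB of stack (assuming 4-byte int)
// and writes the sum of the array through `out`, which must point to a long.
void* fill_and_sum(void* out) {
    constexpr std::size_t n = 256 * 1024;
    int data[n];
    for (std::size_t i = 0; i < n; ++i) data[i] = 1;
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += data[i];
    *static_cast<long*>(out) = sum;
    return nullptr;
}

// Hypothetical helper: runs fn(arg) on a new thread whose stack is
// `stack_bytes` large and waits for it. Returns true on success.
bool run_with_stack_size(std::size_t stack_bytes,
                         void* (*fn)(void*), void* arg) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return false;

    // Fails if stack_bytes is below the platform minimum.
    bool ok = pthread_attr_setstacksize(&attr, stack_bytes) == 0;

    pthread_t tid;
    ok = ok && pthread_create(&tid, &attr, fn, arg) == 0;
    pthread_attr_destroy(&attr);  // attributes are no longer needed
    if (!ok) return false;

    return pthread_join(tid, nullptr) == 0;
}
```

Calling `run_with_stack_size(4u << 20, fill_and_sum, &result)` would give the worker a 4 MiB stack, which comfortably covers its 1 MiB array plus whatever the runtime itself uses.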

Of course, in an embedded system (e.g. a mobile phone), the stack size is typically quite small: 4 KB, 12 KB or 64 KB are all quite normal, and some systems have even less than that.

Another potential problem is that you can't really know how much space is ACTUALLY used on the stack - you can measure after the fact in a compiled system, and of course, if you have a stack local array of int array[25];, we can know it takes up at least 25 * sizeof(int) - but there may be padding, the compiler saves registers on the stack, etc, etc.

Edit, as an afterthought: I also don't really see much benefit in having two code-paths:

 if (enough_stack_space_for_something)
      use_stack_based_algorithm();
 else
      use_heap_based_algorithm();

This would add a fair amount of extra overhead, and more code is generally not a good plan in an embedded/mobile system.

Edit2: Also, if allocating memory is a major part of the runtime, perhaps looking at why that is, for example block-creation of objects would help?

Mats Petersson
2

To expand on the answers already given about why there is no portable way to do this, the entire concept of an actual stack is not part of the standard. You could write a C or C++ runtime that doesn't use a stack at all other than the function call records (which might internally be a linked list or something else).

The stack is an implementation detail of a particular machine/OS/compiler. Hence any technique to access stack metrics will be specific to machine/OS/compiler.

This is not an actual answer to your specific question (Niels covered that quite well), but as advice for your problem domain: just allocate a large chunk of memory on the heap. There's no reason aside from convenience that the "real" stack is any different. Highly recursive (non-tail-recursive) algorithms often need to do this to ensure that they have a virtually unbounded "stack." Scripting languages that want to give a runtime error/exception rather than crash the host application also often do this. To be efficient about things, you can either implement a "split stack" (like a std::deque would give you) or you can just be sure to preallocate a stack big enough for your needs.
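
To illustrate the "stack on the side" idea, here is a small sketch that walks a binary tree without recursion, keeping the pending nodes in a heap-backed `std::vector`. The `Node` type and `sum_tree` function are made up for the example:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical tree node used only for this illustration.
struct Node {
    int value = 0;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Sums all values in the tree. The traversal depth is limited by available
// heap memory rather than by the size of the call stack.
long sum_tree(const Node* root) {
    long total = 0;
    std::vector<const Node*> pending;  // explicit stack, stored on the heap
    if (root != nullptr) pending.push_back(root);

    while (!pending.empty()) {
        const Node* node = pending.back();
        pending.pop_back();
        total += node->value;
        if (node->left) pending.push_back(node->left.get());
        if (node->right) pending.push_back(node->right.get());
    }
    return total;
}
```

If you know an upper bound on the depth, calling `pending.reserve(...)` up front avoids reallocation inside the loop. One caveat with this particular `Node`: destroying a very deep `unique_ptr` chain still recurses, so teardown would need the same treatment.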

Sean Middleditch
  • _There's no reason aside from convenience that the "real" stack is any different._ But it is often mentioned that the stack is faster than the heap, so surely it is about more than convenience? – johnbakers Jun 22 '13 at 21:05
  • _Allocations_ (using `malloc` or `new` or the like) on the heap are slower than creating a local variable on the stack or using something like `alloca`. The stack mostly automatically avoids fragmentation and cache-locality issues that might bite you with heap usage if you're naive but which aren't a problem if you are careful. Some architectures have explicit instructions for an implementation of a stack, e.g. `push` and `pop` on x86, but their benefit will be extremely minimal in the majority of algorithms. Unless you have performance numbers proving you need to, don't worry about it. – Sean Middleditch Jun 22 '13 at 21:25
1

There's no standard way to do it from within the language. I'm not even aware of a documented extension that can query it.

However, some compilers have options to set the stack size. The platform may also specify what it does when launching a process, and/or provide ways to set the stack size of a new thread, maybe even to manipulate that of an existing one.

For small platforms it's usual to know the whole memory size, have all the data segments on one end, a set size arena for the heap (may be 0), and the rest is stack, approaching from the other side.

Balog Pal