
I'm writing a generalised framework for games and real-time applications, compiled with gcc. For purposes of efficiency, I'd like all instances of core framework structs to be allocated on the stack.

Depending on the size of projects built using the framework, it seems that stack size may in some instances need to be increased from the typical default (1-2 MB?).

Can stack size be changed on all platforms which gcc supports? Are there ever hard limits in OSes that prevent the stack size from being increased? Are there any other typical issues faced when increasing stack size, including when using multiple threads?

Engineer
  • If you have more than 2 MB worth of data structures, chances are that you could allocate some, if not most of them, in static memory. Extremely large things are not usually meant to go on the stack. – Sergey Kalinichenko Nov 05 '14 at 18:48
  • And increasing the stack is definitely operating system specific. Not the same on Linux and on FreeBSD (or on Windows or on MacOSX). So you should tell us more about your OS. – Basile Starynkevitch Nov 05 '14 at 18:57
  • I actually edited that out because of the threat to close (apparently "too broad") @BasileStarynkevitch, but it was intended to be advice across as many gcc-supported platforms as possible including mobile, consoles etc. – Engineer Nov 05 '14 at 19:05
  • There's nothing that makes the stack "faster" from a memory access perspective; it's just cheap to allocate and deallocate because allocate is "increment pointer" and deallocate is "decrement pointer". For things you allocate once and reuse across frames it gains you nothing to do that. – Billy ONeal Nov 05 '14 at 19:18
  • @BillyONeal Absolutely wrong. Refer to http://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap and the various answers given there. – Engineer Nov 05 '14 at 19:21
  • Related: http://stackoverflow.com/questions/15192006/allocating-a-new-call-stack ; answers suggest that if you're willing to write a small amount of platform specific asm, you can basically just `malloc` a new stack on most platforms, any size you like. – Alex Celeste Nov 05 '14 at 19:24
  • Also @BillyONeal is absolutely right. Re-read your own link, it explicitly says there is *no* speed advantage to stack memory access in the answers. – Alex Celeste Nov 05 '14 at 19:25
  • @NickWiggill: Almost all of those pitfalls refer to allocation time, which as Bill said, is nearly irrelevant if only done once. The only remaining point is that "stack tends to be reused very frequently which means it tends to be mapped to the processor's cache", which pretty much doesn't apply to "globals" at the bottom of the stack. – Mooing Duck Nov 05 '14 at 19:25
  • @Leushenko You obviously didn't read the two answers I read, then :) – Engineer Nov 05 '14 at 19:26
  • @MooingDuck Fair enough. Nonetheless, the original question stands. – Engineer Nov 05 '14 at 19:27
  • well, there's also the obvious extra indirection, memory locality issues, and the allocations have to be done separately. but if you put large things on the stack, all those benefits will be negligible compared to the total cost (~ amortized cost) – Karoly Horvath Nov 05 '14 at 19:29
  • Besides allocation, the caching advantage happens automatically for *any* frequently-used memory. Global or heap memory will be cached as well if you actually use it frequently enough to warrant it; the stack isn't cached because it's special, it's usually cached because it's usually *used*. – Alex Celeste Nov 05 '14 at 19:31
  • @Leushenko Am fully aware of the caching aspect and how it would relate equally to heap and stack, thank you. The salient points here are (a) the stack is contiguous -- already a big win here, why wrangle with the heap if all framework instances can exist comfortably on contiguous stack; (b) top of stack is stored in a fast register for rapid access (FWIW); (c) stack memory tends _not_ to have synchronisation concerns affecting it, unlike heap memory. And finally and less importantly (though not unimportant) the faster allocations. – Engineer Nov 05 '14 at 19:38
  • cache is a scarce resource, and heap allocations usually aren't as tightly packed, so the memory is not as well utilized (though modern allocators can get pretty close to it by using separate pools for different sizes – but then you're back to the memory locality issue). – Karoly Horvath Nov 05 '14 at 19:38
  • @KarolyHorvath Agreed... feel like we're going around in circles here. Original question still stands! :) – Engineer Nov 05 '14 at 19:40
  • You're misreading those answers. Stack memory and heap memory have *exactly the same* synchronisation concerns, post-allocation (those answers are just saying that malloc itself has to be atomic, *not* memory reads). Either you're not sharing memory, in which case there is no issue, or you are sharing memory in which case you have to synchronise the stack as well. It's not magic. Contiguity is also a non-issue if you're allocating in bulk. – Alex Celeste Nov 05 '14 at 19:47
  • guys, the OP is right, this is not the right place for this discussion. post a question / go to a chat / etc. – Karoly Horvath Nov 05 '14 at 19:49
  • this parameter (for i386pc:) can be used with gcc: --stack and this parameter: --heap – user3629249 Nov 06 '14 at 02:17
  • this parameter (for elf_i386) can be used with gcc: -z stacksize=SIZE – user3629249 Nov 06 '14 at 02:23
  • this parameter (for elf32_x86_64) can be used with gcc: -z stacksize=SIZE – user3629249 Nov 06 '14 at 02:24
  • this parameter (for elf_x86_64) can be used with gcc: -z stacksize=SIZE – user3629249 Nov 06 '14 at 02:25
  • this information on available parameters can be gotten by (from the command line) gcc -v -Wextra --help – user3629249 Nov 06 '14 at 02:37
  • --verbose [=NUMBER] Output lots of information during link – user3629249 Nov 06 '14 at 02:47

1 Answer


(For lack of answers to my question so far, here are some noteworthy points I've found while researching this...)

gcc can embed a stack size in the executable in question, but only for Windows (PE) executables, by passing the `--stack` option through to the linker; *nix loaders do not honour an equivalent field in ELF binaries. MSVC can do the same for Windows executables via its `/STACK` linker option, obviously.
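For illustration, the Windows-targeting command lines might look like the following (the 8 MiB byte value is an arbitrary example, and these are only meaningful when the toolchain targets PE, e.g. a MinGW build of gcc):

```shell
# gcc targeting Windows: forward --stack to the PE linker
gcc main.c -o app.exe -Wl,--stack,8388608

# MSVC equivalent: /STACK is a linker option, hence after /link
cl main.c /link /STACK:8388608
```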

Under *nix, it is instead necessary to set the stack size with `ulimit -s <sizeKB>` (note: the size is given in kilobytes, not megabytes), which applies to every executable launched from that shell thereafter, rather than being specific to a certain executable as in the Windows case. Alternatively, before creating POSIX threads, `pthread_attr_setstacksize()` can be used to set the stack size for those threads. Until others share their experiences, I can only assume that the above applies to all (gaming) platforms that are *nix-based, including Android, macOS, iOS and the PlayStation family.

Beyond system-specific hardware limitations, there is no inherent upper limit on stack size, although each *nix process does carry both a soft limit (`ulimit -s`) and a hard limit (`ulimit -Hs`); the soft limit can only be raised as far as the hard limit without elevated privileges. Where the hard limit permits it, `ulimit -s unlimited` removes the soft cap, giving a dynamic, theoretically unlimited runtime stack size.

Nonetheless I leave it to commenters to bring to my attention related issues or incorrect assumptions.

Engineer