
When I try to google this, all I find is stuff about getting and setting the stack limit, such as -[NSThread stackSize], but that's NOT what I want. I want to know how much memory is actually in use on the stack in the current thread, or equivalently how much stack space remains available.

I'm hoping to figure out a stack overflow in a crash report submitted by a user. In my previous experience, a stack overflow has usually been caused by an infinite recursion, but not this time. So I'm wondering if some of my C++ functions are really using a heck of a lot more stack space than they should.


A comment suggested that I get the stack pointer at the start of the thread, and compare its value later. I happened across the question Print out value of stack pointer. It has several answers:

  1. (The accepted answer) Take the address of a local variable.
  2. Use a little assembly language to get the value of the stack pointer register.
  3. Use the function __builtin_frame_address(0) in GCC or Clang.

I tried those techniques (Apple Clang, macOS 11.2). Methods 2 and 3 produced similar results, but method 1 produced absurdly different results. For one thing, method 1 gives values that increase as you go deeper into a call chain, while the others give values that decrease. What's up with this? Are there two different kinds of stacks?
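For reference, here is a minimal sketch of the three techniques side by side, sampled at different call depths. All the names (`sp_sample`, `read_stack_pointer`, `sample_here`, `sample_at_depth`) are made up for illustration, the inline assembly only covers x86-64 and arm64 (anything else falls back to the builtin), and `__attribute__((noinline))` and `__builtin_frame_address` assume GCC or Clang.

```c
#include <stdint.h>

/* The three readings taken at one point in the call chain. */
struct sp_sample {
    uintptr_t from_local;    /* method 1: address of a local variable */
    uintptr_t from_register; /* method 2: stack pointer register      */
    uintptr_t from_frame;    /* method 3: __builtin_frame_address(0)  */
};

/* Method 2: read the stack pointer register with inline assembly.
 * Only x86-64 and arm64 are handled; other targets fall back to the
 * frame address (an assumption made to keep the sketch compilable). */
static inline uintptr_t read_stack_pointer(void)
{
    uintptr_t sp;
#if defined(__x86_64__)
    __asm__ volatile("mov %%rsp, %0" : "=r"(sp));
#elif defined(__aarch64__)
    __asm__ volatile("mov %0, sp" : "=r"(sp));
#else
    sp = (uintptr_t)__builtin_frame_address(0);
#endif
    return sp;
}

/* Take all three readings in the same frame. */
__attribute__((noinline)) struct sp_sample sample_here(void)
{
    volatile char marker = 0;
    struct sp_sample s;
    s.from_local = (uintptr_t)&marker;
    s.from_register = read_stack_pointer();
    s.from_frame = (uintptr_t)__builtin_frame_address(0);
    return s;
}

/* Recurse `depth` times, then sample. Each frame holds a 256-byte
 * volatile buffer that is touched after the call, so the call is not
 * in tail position and every level keeps its own frame. */
__attribute__((noinline)) struct sp_sample sample_at_depth(int depth)
{
    volatile char pad[256];
    pad[0] = (char)depth;
    struct sp_sample s = (depth > 0) ? sample_at_depth(depth - 1)
                                     : sample_here();
    pad[255] = pad[0];
    return s;
}
```

Comparing `sample_at_depth(0)` with `sample_at_depth(8)` in an ordinary build on x86-64 or arm64, all three fields should be lower in the deeper sample. If only `from_local` moves the other way, a build setting that relocates local variables is the likely cause.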

JWWalker
  • Not tested, but can you `pthread_attr_getstack()` and compare to the current value of the stack pointer? – Nate Eldredge Feb 12 '21 at 01:10
  • Or you could save the value of the stack pointer in thread-local storage when the thread is first started, and compare with it later. – Nate Eldredge Feb 12 '21 at 01:11
  • @NateEldredge, I tried `pthread_attr_getstack` and it returned NULL as the stackaddr parameter, and no error. That does not make any sense to me. – JWWalker Feb 12 '21 at 22:11
  • Perhaps it only returns a value previously set by `pthread_attr_setstack`. Too bad. – Nate Eldredge Feb 12 '21 at 22:22
  • Stacks on x86 and arm64 grow down, and no, there are not two kinds. If Method #1 is showing numbers that go up, then either something is wrong with your test code, or compiler optimizations are affecting it. Can you show your test case? – Nate Eldredge Feb 13 '21 at 18:26
  • @NateEldredge You're right, it was a compiler setting: Address Sanitizer (ASAN) with the option "detect use of stack after return" causes the strange values from method #1. – JWWalker Feb 15 '21 at 21:40
  • Ah that makes sense, it probably "poisons" pointers to local variables after the function returns, since it would be an error to use such a pointer. – Nate Eldredge Feb 15 '21 at 21:43
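
A rough sketch of the "compare the stack bounds to the current stack pointer" idea from the comments, not tested on the asker's setup. It assumes that on macOS the non-portable `pthread_get_stackaddr_np` and `pthread_get_stacksize_np` report the high end and size of a running thread's stack, and that on Linux `pthread_getattr_np` followed by `pthread_attr_getstack` reports the low end and size. The helper names (`current_stack_bounds`, `stack_remaining`, `stack_used`) are invented here, and the stack is assumed to grow down.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__)
/* GNU extension; declared explicitly in case _GNU_SOURCE was not
 * defined before the first system header was included. */
extern int pthread_getattr_np(pthread_t thread, pthread_attr_t *attr);
#endif

/* Fill in the lowest address and the total size of the calling
 * thread's stack. Returns 0 on success, nonzero otherwise. */
int current_stack_bounds(uintptr_t *lowest, size_t *size)
{
#if defined(__APPLE__)
    pthread_t self = pthread_self();
    uintptr_t top = (uintptr_t)pthread_get_stackaddr_np(self);
    *size = pthread_get_stacksize_np(self);
    *lowest = top - *size;
    return 0;
#elif defined(__linux__)
    pthread_attr_t attr;
    void *addr = NULL;
    int rc = pthread_getattr_np(pthread_self(), &attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_getstack(&attr, &addr, size);
    pthread_attr_destroy(&attr);
    *lowest = (uintptr_t)addr;
    return rc;
#else
    (void)lowest;
    (void)size;
    return -1; /* platform not covered by this sketch */
#endif
}

/* Approximate bytes left below the current frame (0 if unknown). */
size_t stack_remaining(void)
{
    uintptr_t lowest;
    size_t size;
    if (current_stack_bounds(&lowest, &size) != 0)
        return 0;
    uintptr_t here = (uintptr_t)__builtin_frame_address(0);
    return here > lowest ? (size_t)(here - lowest) : 0;
}

/* Approximate bytes in use above the current frame (0 if unknown). */
size_t stack_used(void)
{
    uintptr_t lowest;
    size_t size;
    if (current_stack_bounds(&lowest, &size) != 0)
        return 0;
    uintptr_t here = (uintptr_t)__builtin_frame_address(0);
    uintptr_t top = lowest + size;
    return top > here ? (size_t)(top - here) : 0;
}
```

Both figures are approximate: guard pages and whatever the runtime placed on the stack before your code ran are included in the totals.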

1 Answer


If you are trying to do that, I guess you want to know how much memory you are using so that you can estimate the optimum number of threads of some kind to create.

The answer is not easy, as you normally don't have direct access to the stack pointer. But I'll try to devise a solution that does not require reading the stack pointer, although it does require one global variable per thread.

The idea is to force a parameter onto the stack. Even if the ABI on your system passes parameters in registers, taking the address of a parameter (the actual parameter variable) forces the compiler to give it a stack slot. So you save the address of a parameter when the thread starts, and later you call another function that also takes a parameter and compare the two addresses (the type of the parameter doesn't matter, as only its address is used):

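/* Address of a parameter recorded at thread start. One such variable is
 * needed per thread (for example, declared _Thread_local in C11). */
static char *initial_stack_pseudo_addr;

void save_initial_stack(char dumb)
{
    /* the & operator forces dumb to be stored on the stack */
    initial_stack_pseudo_addr = &dumb;
}

size_t how_much_stack(char dumb)
{
    /* on x86 and arm64 the stack grows down, so the saved address is the higher one */
    return (size_t)(initial_stack_pseudo_addr - &dumb);
}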

So when you start the thread, you call save_initial_stack(0);. When you want to know how much stack you have consumed, you can do the following:

    size_t stack_size = how_much_stack(0);
    printf("at this point I have %zu bytes of stack\n", stack_size);

Basically, what you have done is calculate how many bytes lie between the address of the parameter in the call to save_initial_stack() and the address of the parameter in the call you make now to get the stack size. This is approximate, but the stack changes too quickly for a precise figure to be meaningful.
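
Since the baseline has to be kept per thread, here is one way that could look: a sketch only, assuming a C11 compiler (for `_Thread_local`), GCC or Clang (for `noinline`), and a stack that grows down. The names `mark_stack_start`, `stack_used_since_mark` and `used_below_buffer` are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One baseline per thread: each thread that calls mark_stack_start()
 * gets its own copy of this variable. */
static _Thread_local uintptr_t stack_baseline;

/* Call once near the top of each thread's entry function. */
__attribute__((noinline)) void mark_stack_start(void)
{
    volatile char marker = 0;
    stack_baseline = (uintptr_t)&marker;
}

/* Approximate bytes of stack consumed since mark_stack_start() ran
 * on this thread. Returns 0 if the current frame is not below the mark. */
__attribute__((noinline)) size_t stack_used_since_mark(void)
{
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    return stack_baseline > here ? (size_t)(stack_baseline - here) : 0;
}

/* Hypothetical caller with a 4 KiB local buffer, only there to show
 * that a large frame is reflected in the measurement. */
__attribute__((noinline)) size_t used_below_buffer(void)
{
    volatile char buffer[4096];
    buffer[0] = 1;
    size_t used = stack_used_since_mark();
    buffer[4095] = buffer[0]; /* keep the buffer live across the call */
    return used;
}
```

Calling `mark_stack_start()` and then `used_below_buffer()` from the same function should report a little over 4096 bytes in an ordinary build, while a direct call to `stack_used_since_mark()` right after the mark should report close to zero.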

The following example illustrates this. A recursive function is called after setting the initial pointer value, then at each recursive call the current (approximate) size of the stack is computed and printed, and a new recursive call is made. The program should run until the process gets a stack overflow.

#include <stddef.h>
#include <stdio.h>

/* address of a parameter, recorded near the start of the thread */
char *stack_at_start;

void save_stack_pointer(char dumb)
{
    stack_at_start = &dumb;
}

size_t get_stack_size(char dumb)
{
    return (size_t)(stack_at_start - &dumb);
}

void recursive(void)
{
    printf("Stack size: %zu\n", get_stack_size(0));
    recursive();
}

int main(void)
{
    save_stack_pointer(0);
    recursive();
    return 0;
}
Luis Colorado
  • Unfortunately such approaches are hard to test, since the compiler might optimize tail recursion into a loop, [as `gcc -O2` does with your example](https://godbolt.org/z/fGbEzb). Then you won't be able to tell whether it's working. – Nate Eldredge Feb 15 '21 at 17:57
  • Yes, and you'll see that the stack doesn't grow. It is difficult to imagine the compiler optimizing it, when you are adding a new parameter that is forced onto the stack at each step... but who knows what the optimizer can do. If you didn't have the printf call, the optimizer could replace everything with a simple stack overflow signal... but you want to print the values of the pointers, so you need the optimizer to preserve the recursive calls. – Luis Colorado Feb 15 '21 at 19:55
  • The sample was indeed a test: a test to see whether the stack grows or not. When the compiler converts a tail-recursive function into a loop, it is effectively changing the code, taking advantage of the fact that the parameter values are always constants. Had I used `random()` as the value of the parameter, the parameters would not be predictable, and probably no such optimization would be made. – Luis Colorado Feb 15 '21 at 20:04
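
One way to make such a test less sensitive to tail-call optimization is sketched below. It is only an illustration: `measure_depth` and `stack_used_at_depth` are hypothetical names, the recursion is bounded rather than run to overflow, and it relies on GCC or Clang honoring `noinline` and on the stack growing down. Each frame owns a volatile buffer that is read after the recursive call returns, so the call is not in tail position.

```c
#include <stddef.h>
#include <stdint.h>

/* Recurse `depth` more levels and return the approximate stack usage,
 * relative to `base`, seen by the deepest frame. */
__attribute__((noinline)) size_t measure_depth(int depth, uintptr_t base)
{
    volatile char pad[128]; /* volatile forces a real stack slot per frame */
    uintptr_t here = (uintptr_t)pad;
    size_t used = base > here ? (size_t)(base - here) : 0;

    pad[0] = (char)depth;
    if (depth > 0) {
        size_t deeper = measure_depth(depth - 1, base);
        /* The volatile read below happens after the call returns, so
         * the call cannot be turned into a jump. The comparison is
         * always false, so it adds 0 to the result. */
        used = deeper + (size_t)(pad[0] != (char)depth);
    }
    return used;
}

/* Measure from a marker in this frame down to the deepest frame. */
size_t stack_used_at_depth(int depth)
{
    volatile char marker = 0;
    return measure_depth(depth, (uintptr_t)&marker);
}
```

With this shape, `stack_used_at_depth(64)` should exceed `stack_used_at_depth(1)` by at least 63 frames of 128 bytes each, which gives a simple check that the recursion really consumed stack instead of being folded into a loop.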