I am writing a container that uses alloca internally to allocate data on the stack. Risks of using alloca aside, assume that I must use it for the domain I am in (it's partly a learning exercise around alloca and partly to investigate possible implementations of dynamically-sized stack-allocated containers).
According to the man page for alloca (emphasis mine):
The alloca() function allocates size bytes of space in the stack frame of the caller. This temporary space is automatically freed when the function that called alloca() returns to its caller.
Using implementation-specific features, I have managed to force inlining in such a way that the caller's stack is used for this function-level "scoping".
However, that means that the following code will allocate a huge amount of memory on the stack (compiler optimisations aside):
    for (auto iteration : range(0, 10000)) {
        // the ctor parameter is the number of
        // instances of T to allocate on the stack;
        // it's not normally known at compile-time
        my_container<T> instance(32);
    }
Without knowing the implementation details of this container, one might expect any memory it allocates to be freed when instance goes out of scope. This is not the case and can result in a stack overflow / high memory usage for the duration of the enclosing function.
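To make the problem concrete, here is a minimal sketch of what the force-inlined constructor effectively does: it calls alloca directly inside the loop body of the enclosing function. The helper name count_distinct_alloca_blocks and the STACK_ALLOC macro are made up for illustration, and the header selection assumes glibc-style <alloca.h> on Linux or _alloca from <malloc.h> on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

#if defined(_WIN32)
#include <malloc.h>
#define STACK_ALLOC _alloca   // hypothetical wrapper macro, MSVC spelling
#else
#include <alloca.h>
#define STACK_ALLOC alloca    // hypothetical wrapper macro, glibc spelling
#endif

// Hypothetical helper: calls alloca once per loop iteration and counts how
// many distinct addresses come back. Leaving the loop body's scope does not
// release the block; it stays allocated until this function returns.
std::size_t count_distinct_alloca_blocks(std::size_t iterations,
                                         std::size_t bytes) {
    assert(bytes > 0);
    std::set<std::uintptr_t> seen;  // bookkeeping lives on the heap
    for (std::size_t i = 0; i < iterations; ++i) {
        void* block = STACK_ALLOC(bytes);
        seen.insert(reinterpret_cast<std::uintptr_t>(block));
    }
    return seen.size();
}
```

With the compilers I would expect to target (GCC, Clang, MSVC), every iteration should yield a fresh address, so a call such as count_distinct_alloca_blocks(100, 32) should report as many distinct blocks as there were iterations. That is exactly the per-iteration stack growth described above.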
One approach that came to mind was to explicitly free the memory in the destructor. Short of reverse engineering the resulting assembly, I haven't found a way of doing that yet (also see this).
The only other approach I have thought of is to have a maximum size specified at compile-time, use that to allocate a fixed-size buffer, have the real size specified at runtime and use the fixed-size buffer internally. The issue with this is that it's potentially very wasteful (suppose your maximum were 256 bytes per container, but you only needed 32 most of the time).
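For reference, a rough sketch of that fixed-maximum fallback might look like the following. The class name bounded_stack_container and the MaxCount parameter are hypothetical, and the sketch assumes T has a non-throwing default constructor to keep the construction loop simple.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <type_traits>

// Hypothetical fallback: the storage is sized for MaxCount elements at
// compile time, while the number of live elements is chosen at runtime.
template <typename T, std::size_t MaxCount>
class bounded_stack_container {
    static_assert(std::is_nothrow_default_constructible<T>::value,
                  "sketch assumes T's default constructor does not throw");

public:
    explicit bounded_stack_container(std::size_t count) : size_(count) {
        if (count > MaxCount) {
            throw std::length_error("count exceeds MaxCount");
        }
        // Construct only the elements that were actually requested.
        for (std::size_t i = 0; i < size_; ++i) {
            ::new (static_cast<void*>(slot(i))) T();
        }
    }

    ~bounded_stack_container() {
        // Destroy in reverse order of construction.
        for (std::size_t i = size_; i > 0; --i) {
            slot(i - 1)->~T();
        }
    }

    bounded_stack_container(const bounded_stack_container&) = delete;
    bounded_stack_container& operator=(const bounded_stack_container&) = delete;

    T& operator[](std::size_t i) {
        assert(i < size_);
        return *slot(i);
    }

    std::size_t size() const { return size_; }
    static constexpr std::size_t capacity() { return MaxCount; }

private:
    T* slot(std::size_t i) { return reinterpret_cast<T*>(storage_) + i; }

    // The whole buffer is part of the object, so it is released when the
    // object goes out of scope, but it always occupies MaxCount slots.
    alignas(T) unsigned char storage_[MaxCount * sizeof(T)];
    std::size_t size_;
};
```

This gives the scope semantics I want, since the buffer dies with the object, but sizeof(bounded_stack_container<int, 64>) is at least 64 * sizeof(int) even when the constructor is asked for only 32 elements, which is the waste I am trying to avoid.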
Hence this question: I want to find a way to provide these scope semantics to the users of this container. Non-portable is fine, so long as it's reliable on the platform it's targeting (for example, some documented compiler extension that only works for x86_64 is fine).
I appreciate this could be an XY problem, so let me restate my goals clearly:
- I am writing a container that must always allocate its memory on the stack (to the best of my knowledge, this rules out C VLAs).
- The size of the container is not known at compile-time.
- I would like to maintain the semantics of the memory as if it were held by a std::unique_ptr inside of the container.
- Whilst the container must have a C++ API, using compiler extensions from C is fine.
- The code need only work on x86_64 for now.
- The target operating system can be Linux-based or Windows, it doesn't need to work on both.