I'm looking to call a special constructor/destructor for stack allocated instances of a class, and to either:
- Ban all other instances (I think this is impossible).
or
- Make them call different constructors/destructors.
Is this possible to do portably?
Why would I want to do this?
I'm writing a container class library that holds a potentially large dynamically allocated buffer.
Conceptually it should work like:
class SArray {
    size_t size;
    char* data;
public:
    explicit SArray(size_t _size) : size(_size) {
        data = (char*)malloc(size);  // malloc/free come from <cstdlib>
    }
    ~SArray() {
        free(data);
    }
    ... Some operators, etc. ...
};
However, this is a major slowdown for users in my field (high-performance robotics). When you allocate from the system, big allocations are first mapped to the zero page, causing page faults when you write to them, and then, for security reasons, the kernel spends a lot of time zeroing the data before giving it to you. My users tend to allocate many small buffers (sub-KB) and very large buffers (1-1000 MB).
If we can be sure that the constructors and destructors are called in LIFO(stack) order, then we can just have a giant virtual stack to do our memory allocations from:
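class SArray {
    static inline char backing_stack[1000000000000]; // 1 TB of address space, at least as big as all of RAM (inline variable, C++17)
    static inline size_t sp = 0;                     // offset of the next free byte in backing_stack
    size_t size;
    char* data;
public:
    explicit SArray(size_t _size) : size(_size) {
        data = backing_stack + sp;  // bump-allocate from the top of the virtual stack
        sp += size;
    }
    ~SArray() {
        sp -= size;                 // only correct if destruction is strictly LIFO
    }
    ... Some operators, etc. ...
};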
Here the page faults and zeroing from the kernel only happen the first time we touch part of our backing_stack [1].
This answer shows that LIFO order is enforced in C++ independently for each storage type, which means that if we declare:
void foo(){
static SArray static_decl(100);
thread_local SArray tl_decl(1000);
SArray local_decl(2000);
}
then strange things could happen if they all use the same backing_stack.
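To make the "strange things" concrete, here is a minimal sketch of one failure mode, assuming a scaled-down stand-in for the class above. The names DemoArray, make_static and static_overlaps_local are hypothetical and exist only for this illustration; the buffer is shrunk to 1 KB and data is made public so the overlap can be observed.

```cpp
#include <cstddef>

// Hypothetical scaled-down stand-in for SArray (not the real class).
struct DemoArray {
    static char backing_stack[1024];
    static std::size_t sp;
    std::size_t size;
    char* data;
    explicit DemoArray(std::size_t n) : size(n), data(backing_stack + sp) { sp += n; }
    ~DemoArray() { sp -= size; }
    DemoArray(const DemoArray&) = delete;
    DemoArray& operator=(const DemoArray&) = delete;
};
char DemoArray::backing_stack[1024];
std::size_t DemoArray::sp = 0;

// Function-local static: constructed on the first call, destroyed at program exit.
DemoArray& make_static() {
    static DemoArray s(100);
    return s;
}

// Returns true if a later local allocation overlaps the still-live static one.
bool static_overlaps_local() {
    DemoArray* s = nullptr;
    {
        DemoArray a(50);     // local, allocated below the static
        s = &make_static();  // static is allocated above `a` on the first call
    }                        // ~a lowers sp by 50, so sp now points inside s's range
    DemoArray c(60);         // new local starts inside the static's bytes
    return c.data < s->data + s->size && s->data < c.data + c.size;
}
```

Each storage duration is LIFO on its own, but the static outlives the local that was allocated beneath it, so the shared sp ends up pointing into memory the static still owns.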
The obvious and preferred solution is to have constructors that use different stacks for different storage types.
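For illustration only, this is roughly what "different stacks for different storage types" would look like if the user had to pick the stack by hand. It is a sketch under that assumption, not the automatic selection I am asking about: a constructor has no portable way that I know of to learn the storage duration of the object it is building. Arena, LocalTag, StaticTag, TaggedArray and arenas_are_independent are all hypothetical names.

```cpp
#include <cstddef>

// Hypothetical tags; each Arena<Tag, N> instantiation owns a separate stack.
struct LocalTag {};
struct StaticTag {};

template <class Tag, std::size_t Capacity>
struct Arena {
    static char* base() { static char buf[Capacity]; return buf; }
    static std::size_t& sp() { static std::size_t v = 0; return v; }
};

// Same bump-allocation idea as above, but the arena is a template parameter.
template <class ArenaT>
class TaggedArray {
public:
    explicit TaggedArray(std::size_t n)
        : size_(n), data_(ArenaT::base() + ArenaT::sp()) { ArenaT::sp() += n; }
    ~TaggedArray() { ArenaT::sp() -= size_; }
    TaggedArray(const TaggedArray&) = delete;
    TaggedArray& operator=(const TaggedArray&) = delete;
    char* data() const { return data_; }
    std::size_t size() const { return size_; }
private:
    std::size_t size_;
    char* data_;
};

using LocalArena = Arena<LocalTag, 4096>;
using StaticArena = Arena<StaticTag, 4096>;

// A static and some locals no longer disturb each other's stack pointer.
bool arenas_are_independent() {
    const std::size_t before = LocalArena::sp();
    static TaggedArray<StaticArena> s(100);  // lives until program exit
    {
        TaggedArray<LocalArena> a(50);       // local stack grows, then shrinks
    }
    TaggedArray<LocalArena> c(60);
    return LocalArena::sp() == before + 60 && StaticArena::sp() == 100;
}
```

The weakness is obvious: nothing stops someone from writing `static TaggedArray<LocalArena> oops(10);`, which is exactly the misuse I want the compiler to catch.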
If that's not possible, we could settle for a class that simply cannot be created except as a local variable, as long as we can automatically detect that someone tried to do it and issue a compiler error. The speed gain of preallocating is so great that I am simply willing to ban other uses if needed.
So far, I have found that you can delete the new, placement new, and delete operators, so someone accidentally heap allocating isn't a big worry. However, unsuspecting users could try to create static or thread_local instances. Temporaries and lifetime extension also seem like they could pose a problem.
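A minimal sketch of what I mean by deleting the operators, using a hypothetical class StackOnly and helper use_local in place of the real container. Note that this only blocks the class-level forms; an explicit `::new` still goes to the global operators.

```cpp
#include <cstddef>

// Hypothetical stand-in for the container, showing only the deleted operators.
class StackOnly {
public:
    explicit StackOnly(std::size_t n) : size_(n) {}
    std::size_t size() const { return size_; }

    // Reject `new StackOnly(...)`, `new StackOnly[...]` and `new (p) StackOnly(...)`.
    static void* operator new(std::size_t) = delete;
    static void* operator new[](std::size_t) = delete;
    static void* operator new(std::size_t, void*) = delete;
    static void operator delete(void*) = delete;
    static void operator delete[](void*) = delete;

private:
    std::size_t size_;
};

std::size_t use_local() {
    StackOnly ok(8);                          // automatic storage: allowed, as intended
    // auto* bad = new StackOnly(8);          // would not compile: operator new is deleted
    static StackOnly still_allowed(16);       // static storage: compiles, not caught
    thread_local StackOnly also_allowed(32);  // thread storage: compiles, not caught
    return ok.size() + still_allowed.size() + also_allowed.size();
}
```

So the heap case is covered at compile time, but the static and thread_local declarations go through without any diagnostic, which is the gap I am trying to close.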
A final option I explored and rejected was writing a hash table based allocation scheme. The problem there was that memory would get fragmented if users allocated many containers smaller than page size. Dealing with this is what makes malloc() so slow for small items compared to the call stack.
[1]: We can't use alloca() or equivalent because our allocations are often so large that they blow past the guard page(s), silently corrupting our stack.