Short story: I need a container that can store an unlimited number of elements, but does no dynamic memory allocation for a small number of elements.
Long story: I need to store a couple of integers, the number of which is not bounded, but rarely more than two or three. I tried using a vector<int>
or list<int>
, but the performance as compared to a simple array is worse by a factor of at least 50 in release mode, and 500 in debug mode. For example, I measured creating a vector
, calling push_back
three times, and destroying it again. This takes around 250ns. Calling reserve(3)
upfront helps a bit, the time goes down to 100ns. Using an array instead takes like 3ns, but I cannot use a fixed-size array because the number of elements is unbounded. I need to store a lot of those things, as part of larger objects.
So I guess what I'm asking for is a container that uses some sort of small vector optimization: As long as the number of elements is small they are stored within the object itself, and only when it grows beyond a certain limit it allocates memory on the heap. Does such a thing exist?
As a side node, basic_string
is known to have this sort of optimization, so I tried basic_string<int>
. Indeed, for a size of 3 it's four times faster than vector<int>
, but somehow using a string class for storing integers doesn't feel right. So again, my question: Is there some container that performs well for small size, but if needed can grow unlimited?
EDIT People asked for the actual code. Well, there is nothing special about it, it's straighforward. I cannot post it because it's part of a large benchmarking environment, with special purpose classes for timing, and some classes for obfuscating the code to prevent the compiler from replacing it with {}
. You know, it does that when it notices the result isn't used. The core of it comes down to this:
vector<int> v;
v.push_back(17);
v.push_back(19);
v.push_back(23);
and
int x[3];
int k = 0;
x[k++] = 17;
x[k++] = 19;
x[k++] = 23;
each of which is run in a tight timing loop.
EDIT2 The use case for this is a tree with a variable number of branches at each node. So I actually need to store pointers and no integers, but this is on 32-bit so both are the same size. Performance of this is important, because the tree is constantly modified.
Edit3 People keep doubting my numbers. Why do you think they are unreasonable? A vector
does dynamic memory allocation, the array doesn't, and this seems to be a good explanation for the difference, or isn't it?