0

Given that memory overhead is critical to my application, I think that of the two options above, the latter would be more light weight. Am I correct? I am basing this on the fact that the vector has a memory overhead of 4 pointers to keep track of begin(), end(), size() and allocator. So the total size for the whole model would be in the order of

(4*sizeof(T*) + sizeof(T)*Ni)*No + 3*sizeof(std::vector<T>*)

Here, I am assuming Ni, No to be the number of elements in the inner and outer vectors, resply. By using the latter expression, I am hoping to save the 4*sizeof(T*)*No since in my application, No is huge, while Ni <<<< No. Just to fix ideas, No is in the order of a 100 million and more, Ni is typically in the order 3 to 50.

Thanks in advance for your interest and any ideas.

NOTE: I understand and am more than happy to pay the price of dealing with the pointer incl. allocating, traversing, and deallocating it, and I can do so without any significant performance overhead.

squashed.bugaboo
  • 1,338
  • 2
  • 20
  • 36
  • One of `begin()`, `end()` and `size()` is redundant. `vector` is not that bad at all. –  Jan 09 '14 at 15:12
  • If you are allocating a large number of elements, `std::vector` will likely have an issue as it requires contiguous memory allocation. `std::deque` may help with that, depending on your needs. – Zac Howland Jan 09 '14 at 15:15
  • "In my application, `No` is huge, while `Ni` <<<< `No`." One obvious solution would be switching around the indexes of the inner and the outer vector, unless you need to pass around the inner vector. – Sergey Kalinichenko Jan 09 '14 at 15:18
  • having a good `allocator` for your container is way more important than this kind of questions. Especially if your container will have to provide room for a lot of elements . – user2485710 Jan 09 '14 at 15:23
  • 1
    Are those Ni/No constant values? If not - then you need to store Ni current value in each internal array and store also its current capacity - so seems std::vector is not big waste... – PiotrNycz Jan 09 '14 at 15:24
  • Yes, you're right, I am using the first element in the array to store the Ni – squashed.bugaboo Jan 09 '14 at 15:27

4 Answers4

2

It's actually 4, you missed the allocator. See What is the overhead cost of an empty vector?

Depends on your application. Do you never append to the internal vectors? Do they all have the same number of elements? Is the average size of the data stored in the internal vectors small?

If you answered yes to all the questions above than maybe T* is an acceptable solution. If not think about how would you handle that issue without the support of vector. It might be easier to just take the hit on memory.

Community
  • 1
  • 1
Sorin
  • 11,863
  • 22
  • 26
  • Thank you for the response and the correction; that is even better savings then. Yes, I don't append to it. I am constructing this for quick lookups later on. I need them to be contiguous and I might sort them, but typically Ni is so small compared to No, it doesn't really matter. Mapped implementation (my earlier attempt) was way too expensive memory wise. – squashed.bugaboo Jan 09 '14 at 15:24
  • On linux, it's just three pointers (`sizeof(vector) == 24`), I believe the allocator is stored in the type, not at runtime. Still quite a bit of overhead. – cmaster - reinstate monica Jan 09 '14 at 19:01
1

As you see here, the exact overhead of an std::vector is implementation dependent.

Also note that if No is very large, it's very probable that your data will be stored in chunks in some implementations, in which case, you also have the overhead which is of the order of the number of chunks.

But in general I agree that the pointer implementation is cheaper space-wise.

Community
  • 1
  • 1
adrin
  • 4,511
  • 3
  • 34
  • 50
1

I think that [the vector<T*> would be better. Am I correct?

It would be smaller, but it wouldn't necessarily be "better". The change would saddle you with the necessity to allocate and free inner arrays. You would no longer have a way of knowing the size of the inner array.

Also note that some overhead on size would remain: as long as your inner arrays are allocated individually, there would be some additional storage reserved by the allocator in addition to the size of requested chunk, to let the deallocation routines know the size of the chunk.

If your memory requirements are so tight, consider allocating one vector for the whole array, and then parcel out the individual chunks into a vector of pointers. This would eliminate the per-chunk overhead of allocating the inner arrays indivudually.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
1

If you are concerned about the overhead of a vector, you should also be concerned about the overhead of malloc()/new: typical memory allocator overhead is at least two more pointers per memory region, that brings the overhead of a small vector<> up to five pointers (sizeof(vector<int>) == 3*sizeof(void*) on linux).

So, what I would do, is to ask myself whether the size of the inner arrays needs to change once they have been initialized. If it is possible to avoid later reallocation of those arrays, I would allocate one huge chunk of memory, which I can then distribute to the different inner arrays, storing only their location:

int** pointerArray = new int*[innerArrayCount + 1];
int* store = new int[totalSizeOfInnerArrays];
for(int* nextArray = store, i = 0; i <= innerArrayCount; i++) {
    pointerArray[i] = nextArray;
    nextArray += innerArraySize[i];
}

The size of an array can then be deduced from the difference of the next pointer and its own:

for(int i = 0; i < innerArrayCount; i++) {
    int* curArray = pointerArray[i];
    size_t curSize = pointerArray[i + 1] - pointerArray[i];
    //Do whatever you like with curArray.
}

Or, you can directly use that end pointer for iterating over the inner arrays:

for(int i = 0; i < innerArrayCount; i++) {
    for(int* iterator = pointerArray[i]; iterator < pointerArray[i + 1]; iterator++) {
        //Do whatever you like with *iterator.
    }
}
cmaster - reinstate monica
  • 38,891
  • 9
  • 62
  • 106