46

For example, I have a std::map&lt;A, B&gt; with known sizeof(A) and sizeof(B), and the map holds N entries. How would you estimate its memory usage? I'd say it's something like

(sizeof(A) + sizeof(B)) * N * factor

But what is the factor? Different formula maybe?

Maybe it's easier to ask for upper bound?

ulidtko
Drakosha

7 Answers

37

The estimate would be closer to

(sizeof(A) + sizeof(B) + ELEMENT_OVERHEAD) * N + CONTAINER_OVERHEAD

There is an overhead for each element you add, and there is also a fixed overhead for maintaining the data structure storing the map. This is typically a balanced binary tree, such as a red-black tree. For instance, in the GCC C++ STL implementation ELEMENT_OVERHEAD would be sizeof(_Rb_tree_node_base) and CONTAINER_OVERHEAD would be sizeof(_Rb_tree). To the figure above you should also add the overhead of the memory-management structures used for storing the map's elements.

It's probably easier to arrive at an estimate by measuring your code's memory consumption for various large collections.

Diomidis Spinellis
  • 1
    While I totally agree with you, I'd like to have some upper bound without knowing internal STL struct sizes. – Drakosha Apr 06 '09 at 10:02
  • Are there not going to be N tree nodes? – Martin York Apr 06 '09 at 15:41
  • I disagree with this estimate; generally leaf nodes have the same size as non-leaf nodes, so the '+ TREE_NODE_SIZE * log2(N)' part is not accurate. – Don Neufeld Apr 06 '09 at 19:25
  • 3
    For MSVC (VS2010), I found the corresponding node construct in file xtree::class _Tree_nod::struct _Node. A Node contains 3 _NodePtrs (left, right, parent -> 12 / 24 byte on 32/64 bit systems) plus 2 chars (_Color and _Isnil) in addition to a value_type (which is std::pair). This leaves me with an element overhead of 14/26 bytes per map entry. – Daniel Aug 31 '18 at 07:38
  • @Diomidis, is there a possibility to get the sizeof(_Rb_tree)? I get "missing template arguments" error...thanks – Taw Nov 25 '18 at 17:35
  • @Taw: the structure names depend on the implementation. Look inside the header file to see the actual name and type. In your case it seems you also need to supply the element type as an argument. – Diomidis Spinellis Nov 26 '18 at 20:24
  • Thank you for the explanations, but I have problems getting the `_Rb_tree_node_base` and `_Rb_tree` symbols, the compiler says that were not declared. I included , I also included "stl_tree.h" (full path) and still it does not work... – Taw Nov 27 '18 at 11:03
  • Your compiler might be using another implementation with different symbols. To find them you need to examine the header file and the header files it might be including. – Diomidis Spinellis Nov 28 '18 at 16:45
20

You could use MemTrack, by Curtis Bartley. It's a memory allocator that replaces the default one and can track memory usage down to the type of allocation.

An example of output:

-----------------------
Memory Usage Statistics
-----------------------

allocated type                        blocks          bytes  
--------------                        ------          -----  
struct FHRDocPath::IndexedRec          11031  13.7% 2756600  45.8%
class FHRDocPath                       10734  13.3%  772848  12.8%
class FHRDocElemPropLst                13132  16.3%  420224   7.0%
struct FHRDocVDict::IndexedRec          3595   4.5%  370336   6.2%
struct FHRDocMDict::IndexedRec         13368  16.6%  208200   3.5%
class FHRDocObject *                      36   0.0%  172836   2.9%
struct FHRDocData::IndexedRec            890   1.1%  159880   2.7%
struct FHRDocLineTable::IndexedRec       408   0.5%  152824   2.5%
struct FHRDocMList::IndexedRec          2656   3.3%  119168   2.0%
class FHRDocMList                       1964   2.4%   62848   1.0%
class FHRDocVMpObj                      2096   2.6%   58688   1.0%
class FHRDocProcessColor                1259   1.6%   50360   0.8%
struct FHRDocTextBlok::IndexedRec        680   0.8%   48756   0.8%
class FHRDocUString                     1800   2.2%   43200   0.7%
class FHRDocGroup                        684   0.8%   41040   0.7%
class FHRDocObject * (__cdecl*)(void)     36   0.0%   39928   0.7%
class FHRDocXform                        516   0.6%   35088   0.6%
class FHRDocTextColumn                   403   0.5%   33852   0.6%
class FHRDocTString                      407   0.5%   29304   0.5%
struct FHRDocUString::IndexedRec        1800   2.2%   27904   0.5%
Xavier Nodet
16

If you really want to know the runtime memory footprint, use a custom allocator and pass it in when creating the map. See Josuttis' book and this page of his (for a custom allocator).

Maybe it's easier to ask for upper bound?

The upper bound will depend on the exact implementation (e.g. the particular variant of balanced tree used). Maybe you can tell us why you need this information, so we can help better?

dirkgently
  • That's the *only* way to get an accurate number. – David Lehavi Apr 06 '09 at 08:32
  • Can the STL's internal allocator report how much memory is used by the STL in total? – Drakosha Apr 06 '09 at 10:04
  • @Drakosha: No, that allocator is not meant to do this. Which is why you need to create a custom one that will report whatever you want it to. – dirkgently Apr 06 '09 at 13:18
  • 2
    Stephan T. Lavavej also has a nice article on custom allocators: http://blogs.msdn.com/vcblog/archive/2008/08/28/the-mallocator.aspx Read the comments for a bit of trivia about vs. use. Now I can feel justified in my continued use of the .h variant of C headers in my C++ code! – Michael Burr Apr 06 '09 at 19:13
  • @Michael: Why headers all of sudden? Some context will be of help! – dirkgently Apr 06 '09 at 19:50
10

I recently needed to answer this question for myself, and simply wrote a small benchmark program using std::map I compiled under MSVC 2012 in 64-bit mode.

A map with 150 million nodes soaked up ~15 GB, which implies that the 8-byte left pointer, 8-byte right pointer, 8-byte int key, and 8-byte datum, totaling 32 bytes, accounted for about two thirds of the map's memory in internal nodes, leaving one third for leaves.

Personally, I found this to be surprisingly poor memory efficiency, but it is what it is.

Hope this makes for a handy rule-of-thumb.

PS: The overhead of a std::map is that of a single node's size AFAICT.

user2548100
  • 7
    If you need better memory efficiency, you could go with http://code.google.com/p/cpp-btree/ – Drakosha Sep 09 '13 at 13:24
  • 1
    See my comment in Diomidis Spinellis answer. I found element overheads per map entry of 3 pointers + 2 chars -> 14/26 bytes on 32/64 bit in addition to the key value pair. The fraction of overhead will of course depend on the sizes of your keys and values. – Daniel Aug 31 '18 at 07:43
0

I was also looking for a way to calculate the size of a std::map. I tried what was explained in Diomidis Spinellis's answer and am expanding on it here, which could be helpful to others.

I am expanding on his answer by adding a few lines of code.

#include <bits/stl_tree.h>  // libstdc++ internal header; also pulled in by <map>
#include <iostream>

int main(int argc, char *argv[])
{
    std::cout << sizeof(std::_Rb_tree_node_base) << std::endl;
    return 0;
}

Output (on my ARM Cortex-A9 i.MX6Solo-X processor running Linux [4.9.175], compiler arm-fslc-linux-gnueabi-gcc (GCC) 7.3.0):

16

Considering std::map&lt;A, B&gt;, I am interested in the size of ELEMENT_OVERHEAD, as it grows linearly with the number of elements in the map. ELEMENT_OVERHEAD was found to be equivalent to sizeof(std::_Rb_tree_node_base) and hence has a value of 16 on my system.

(sizeof(A) + sizeof(B) + ELEMENT_OVERHEAD) * N + CONTAINER_OVERHEAD
abhiarora
0

The formula is more like:

(sizeof(A) + sizeof(B) + factor) * N

where factor is the per-entry overhead. C++ maps are typically implemented as red-black trees. These are binary trees, so there will be at least two pointers for the left and right child nodes. There will also be some implementation bookkeeping - probably a parent pointer and a "colour" indicator - so factor may be something like

(sizeof( RBNode *) * 3 + 1) / 2

However, all this is highly implementation dependent - to find out for sure you really need to examine the code for your own library implementation.

-2

The size of the map really depends on the implementation of the map. You might get different sizes on different compilers/platforms, depending on which STL implementation they provide.

Why do you need this size?

Cătălin Pitiș
  • Ok, g++. I need this size to know how many entries I can hold in memory before I start swapping. I need an upper bound. – Drakosha Apr 06 '09 at 10:05
  • If you need the upper bound, I suggest to take a look to the implementation of the map, to see how a map node object is implemented and to consider the size of the members of the map node as extra to the size of the key and element. Be aware that this approach is platform and compiler specific. – Cătălin Pitiș Apr 06 '09 at 12:35
  • Disable the virtual memory system, and write a benchmark that slowly increases the size of a test map. You'll know soon enough, by watching Windows Task Manager's Peak Working Set metric, when you run out of memory. –  Sep 09 '13 at 05:49