4

I wonder which container uses less memory between std::map and std::vector with a large set of data.

Loads of posts talk about efficiency, and my priority is not efficiency but memory consumption. So, if we don't know the number of our data (in my case can be over 12,000,000 entries, every entry is a string of 20 characters), is map really better than vector?

Community
  • 1
  • 1
Atouk
  • 53
  • 2
  • 6
  • 1
    `map` and `vector` do different things: `map` stores key-value pairs, not single values. Typically, associative containers have more overhead since they store pointers alongside the data and allocate each node individually; but `vector` needs contiguous memory, and might cause fragmentation if you let it reallocate too often. The only real answer is to measure. – Mike Seymour May 18 '15 at 13:11
  • 2
    Vector contains a contiguous array, so it takes exactly `size() * sizeof(value_type)` bytes of dynamic storage. Map has one dynamic node per entry, which adds fair amount of overhead. – Kerrek SB May 18 '15 at 13:11
  • I'd consider rolling my own here - with 12 million entries you could justify it - how much memory are you trying to save? A container for 12 million sets of 20 characters is likely to be better than a container for 12 million containers of 20 characters (aka strings). For that matter, 20 character strings of [a-z] have the order of 10^28 possible values, in which case representing each string as a 96 bit integer is a possibility? How weird are you prepared to get? – Andy Newman May 18 '15 at 13:24
  • You can take a look at http://info.prelert.com/blog/stl-container-memory-usage and http://stackoverflow.com/questions/11841357/memory-consumption-by-stl-containers – manlio May 18 '15 at 13:25
  • @AndyNewman, I know but for each string, i've got a container of other data. Each string is different. – Atouk May 18 '15 at 13:31
  • If you are worried about getting large enough individual blocks, deque is a strictly better choice than map. – Nir Friedman Sep 21 '17 at 23:52

4 Answers4

3

A std::vector must organise the strings in contiguous memory. (The standard insists on that). So the amount of contiguous memory in your example will be at least sizeof(string) * 12,000,000 for a std::vector. Fortunately, each string probably has its buffer in the heap: 20 characters is around the cut-off for std::string implementations that use a fixed buffer for short strings.

std::map will not have this contiguity issue, so is probably a better container to use in this instance. But, overall, it will probably consume more memory. But that memory will be easier for the program to acquire.

Bathsheba
  • 231,907
  • 34
  • 361
  • 483
2

Depends on the problem you are solving. Basically , the std::vector uses a row memory(as your data is big, make sure you have one), but the std::map can take each node from separate parts of the memory. Opposite, std::map uses more memory for the same data because the pointer manipulation between nodes.

Eduard Rostomyan
  • 7,050
  • 2
  • 37
  • 76
  • But at resize, does a vector copy all his data in another contiguous memory row? Or does it keep two rows? – Atouk May 18 '15 at 13:28
  • if your vectors size is n, and you call resize(m) there m > n , it just adds the necessary m-n part to your vector, if m < n , it just deletes the remaining n-m part. So it never keep two rows in case of resize. – Eduard Rostomyan May 18 '15 at 13:31
  • It's not true, if m – Atouk May 18 '15 at 13:35
  • 3
    @Atouk, resizing a vector to a smaller size deletes content. Making the vector larger adds elements to the end as long as the capacity is not reached. In that case it is reallocated and moved to a new memory-area: http://www.cplusplus.com/reference/vector/vector/resize/ – PureW May 18 '15 at 13:43
2

For the specific sizes you note, you might want to consider something that does not go to either of the extremes: not a continuous array of memory, and not a single-node based tree.

Some options are

  • a (memory residing) B tree

  • a digital tree

Ami Tavory
  • 74,578
  • 11
  • 141
  • 185
1

If you want to keep data in continuous memory you should go for std::vector else if you prefer node based data structure and need to a lot of insert and delete operations I will suggest either std::list instead std::map.

If you prefer node based data structure with keeping order between elements of data structure and don't have a key value pair I prefer std::set instead std::map.

If you prefer node based data structure with data as a key value pair and keeping order between elements depends of key value of data I prefer std::map

Steephen
  • 14,645
  • 7
  • 40
  • 47