2

Working on some legacy code, I am running into memory issues due mainly (I believe) to the extensive use of STL maps (particularly “maps-of-maps”.)

I am looking at Boost flat_map as a possible solution. Does anyone have any firsthand experience with flat_maps, in particular with regards improvements in speed and/or memory usage? I realize of course this can be very dependent on the types of data stored and the manner in which they are stored but still curious of folk’s actual experience.

Can anyone point me to some solid examples?

As an example: there are several cases in this code of a map-of-a-map; that is, a map where the value is another map.

By replacing the “inner” map with a pair of vectors, I reduced the memory footprint 10:1 (3G to 300M). Of course this can slow down searches but for this particular case it doesn’t seem to matter much. And it involved about a day of refactoring and careful testing.

Boost’s flat_map sounds like it might be just what I need but I can’t seem to find out much about it other than the class description on the Boost web site. Looking for some firsthand feedback.

BernieP
  • 447
  • 1
  • 7
  • 18
  • What do you mean by "memory issues"? Could you be more specific? – beerboy Mar 07 '13 at 02:31
  • Excessive memory. Is there any other kind? Seriously, I've since run some tests using flat_map and it seems to work for my purposes. It isn't quite as efficent memory-wise as using a pair of vectors but alomost as good and certainly easier to refactor. – BernieP Mar 08 '13 at 15:17
  • Some things you could check for: Are you collecting garbage (meaning, are you removing the index from the map, not just setting it to 0)? And could you save space by representing your types as smaller types (e.g. use enums instead of strings, if you have lots of duplicate strings)? – EHuhtala Mar 18 '13 at 19:00
  • [this post](http://stackoverflow.com/questions/15625225/map-vs-unordered-map-for-few-elements) may be helpful (see comments in answer) – default Mar 25 '13 at 22:21
  • If your data that is stored in the vectors is sorted (or can be sorted) then you could use std::lower_bound() without little change. This does a binary search which is O(log N). This is similar to boost::flat_map in performance for a some value types. – Crog Aug 28 '13 at 11:59
  • @Crog As it turns out the sorted order isn't important. I've stuck with a map-of-pair-of-vectors for now though I've since used flat_maps in differnt cases with much success. Thanks. – BernieP Aug 29 '13 at 15:16
  • @pauld I've seen that post, it is interesting informatino but not quite what I was looking for. My maps are quite large. Using a map-of-a-pair-of-vectors seems to be gining me the best performance for my case. – BernieP Aug 29 '13 at 15:18

2 Answers2

4

Boost's flat_map is a binary-tree-based map implementation, except that that binary tree is stored as a (sorted) vector of key-value pairs.

You can basically figure out the answers regarding performance (relative to an std::map by yourself based on that fact:

  • Iterating the map or a large part of it should be super-fast, relatively
  • Lookup should typically be relatively fast
  • Adding or removing values is theoretically much slower, but in practice - assuming your key and value types are small and the number of map elements not very high - probably comparable in speed (or even better on small maps - often no allocation is necessary on insert)
  • etc.

In your case - maps-of-maps - you're going to lose some of the benefit of "flattening things out", since you'll have an outer map with a pointer to an inner map (if not more levels of indirection); but the flat map would at least help you reduce that. Also, supposing you have two levels of maps, you could arrange it so that you store all of the inner maps contiguously (either by constructing the inner maps appropriately or by instantiating them with your own allocator, a trickier affair); in that case, you could replace pointers to maps with map indices, reducing the amount of space they take up and making life easier for the compiler.

You might also want read Boost's documentation of flat_map; and you could also just use the force and read the source (and the source of the underlying flat_tree) - like I have; I dont actually have flat_map experience myself.

einpoklum
  • 118,144
  • 57
  • 340
  • 684
1

I know this is an old question, but this might be of use to someone finding this question.

I found that flat_map was a big improvement in searching, lookup and iterating large maps. The fact the map is using contiguous data in memory also makes inserting faster than you might expect due to great data locality. If you're doing more inserts than lookups in your map then it might not be for you.

Having said that, repeatedly inserting a random value into a sorted vector is faster than the same on a linked list because of the data locality - despite what Big O might tell you. (tested in VS2017 and G++ 4.8).

fwg
  • 1,018
  • 2
  • 10
  • 25