how boost multi_index is implemented

Question

I have some difficulties understanding how Boost.MultiIndex is implemented. Lets say I have the following:

typedef multi_index_container<
    employee,
    indexed_by<    
        ordered_unique<member<employee, std::string, &employee::name> >,
        ordered_unique<member<employee, int, &employee::age> >
    > 
> employee_set;

I imagine that I have one array, Employee[], which actually stores the employee objects, and two maps

map<std::string, employee*>
map<int, employee*>

with name and age as keys. Each map has employee* value which points to the stored object in the array. Is this ok?

Based on the picture [Fig. 1: Diagram of a multi_index_container with three indices](https://www.boost.org/doc/libs/1_70_0/libs/multi_index/doc/tutorial/multi_index_cont_example.png) at [Boost.MultiIndex Tutorial - Rationale](https://www.boost.org/doc/libs/1_70_0/libs/multi_index/doc/tutorial/index.html), its implementation is just what you imagined. — samm, Jul 24 '19 at 11:11

Joaquín M López Muñoz · Accepted Answer · 2020-04-23T07:18:14.000

33

A short explanation on the underlying structure is given here, quoted below:

The implementation is based on nodes interlinked with pointers, just as say your favorite std::set implementation. I'll elaborate a bit on this: A std::set is usually implemented as an rb-tree where nodes look like

struct node
{
  // header
  color      c;
  pointer    parent,left,right;
  // payload
  value_type value;
};

Well, a multi_index_container's node is basically a "multinode" with as many headers as indices as well as the payload. For instance, a multi_index_container with two so-called ordered indices uses an internal node that looks like

struct node
{
  // header index #0
  color      c0;
  pointer    parent0,left0,right0;
  // header index #1
  color      c1;
  pointer    parent1,left1,right2;
  // payload
  value_type value;
};

(The reality is more complicated, these nodes are generated through some metaprogramming etc. but you get the idea) [...]

edited Apr 23 '20 at 07:18

answered Nov 17 '10 at 19:32

Joaquín M López Muñoz

5,243
1
15
20

+1, explains the OP's use case exactly. I wonder if Boost.MultiIndex uses Boost.Intrusive internally? – Fred Foo Nov 17 '10 at 21:38
4

No, Boost.MultiIndex does not use Boost.Intrusive, primarily because the former library is older than the latter. A rewrite in terms of Boost.Intrusive would be in principle doable, though. – Joaquín M López Muñoz Nov 18 '10 at 14:21
1

@JoaquínMLópezMuñoz: I know this is an old answer, however it would be great if you could post an excerpt of your e-mail here. The guideline in SO questions and answers is to provide the details *directly* so that link-rotting or offline viewing do not significantly cut down on their values. – Matthieu M. Jan 08 '14 at 14:29

Fred Foo · Answer 2 · 2010-11-17T21:28:25.340

Conceptually, yes.

From what I understand of Boost.MultiIndex (I've used it, but not seen the implementation), your example with two ordered_unique indices will indeed create two sorted associative containers (like std::map) which store pointers/references/indices into a common set of employees.

In any case, every employee is stored only once in the multi-indexed container, whereas a combination of map<string,employee> and map<int,employee> would store every employee twice.

It may very well be that there is indeed a (dynamic) array inside some multi-indexed containers, but there is no guarantee that this is true:

[Random access indices] do not provide memory contiguity, a property of std::vectors by which elements are stored adjacent to one another in a single block of memory.

Also, Boost.Bimap is based on Boost.MultiIndex and the former allows for different representations of its "backbone" structure.

score 2 · Answer 3 · answered Nov 17 '10 at 16:39

Actually I do not think it is.

Based on what is located in detail/node_type.hpp. It seems to me that like a std::map the node will contain both the value and the index. Except that in this case the various indices differ from one another and thus the node interleaving would actually differ depending on the index you're following.

I am not sure about this though, Boost headers are definitely hard to parse, however it would make sense if you think in term of memory:

less allocations: faster allocation/deallocation
better cache locality

I would appreciate a definitive answer though, if anyone knows about the gore.

how boost multi_index is implemented

3 Answers3

Linked