(moving/expanding from the comments)
> what better feature do we have in `multiset` that is not in `vector`?
The key feature is fast (O(log N)) lookup of "equivalent" (according to the comparer) elements, plus implicit sorting according to the comparer (at an O(log N) cost on insertion). This comes on top of the usual `set` features, such as O(1) removal (after the O(log N) search) and references that are never invalidated.
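A minimal sketch of those two properties (sorted iteration and logarithmic lookup of all equivalent elements):

```cpp
#include <iostream>
#include <iterator>
#include <set>

int main() {
    std::multiset<int> ms{5, 1, 3, 3, 9};

    // Implicit sorting: iteration visits elements in comparer order.
    for (int v : ms) std::cout << v << ' ';   // prints: 1 3 3 5 9
    std::cout << '\n';

    // O(log N) lookup of the whole range of elements equivalent to 3.
    auto [lo, hi] = ms.equal_range(3);
    std::cout << std::distance(lo, hi) << " elements equivalent to 3\n";  // 2
}
```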
`multiset` is quite a "niche" container; in 10+ years of writing C++ I don't think I have ever used it (or seen it being used, FWIW). Most probably, its very existence owes much to the `map`/`set` <=> `multimap`/`multiset` symmetry.
`set` is generally thought of as "the container which does not allow duplicate elements", so having a set that allows duplicates apparently makes no sense at all. However, the actual point of `set` is to allow fast (O(log N)) lookup of objects according to some criterion, which crucially may not consider the complete content of the object.
Let's take a step back; it's helpful here to understand that `?map` is just a `?set` in disguise¹. In particular, you can think of a `std::?map<K,V>` as a `std::?set<pair<const K,V>, KeyComp>`, where `KeyComp` is a comparer that considers only the `first` (= key) part of the pair² - which incidentally is exactly what `std::?map::value_compare` is; you'll notice also that `std::?map::value_type` is indeed `std::pair<const K, V>`.
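A minimal sketch of this equivalence (the `KeyComp` name comes from the paragraph above; the concrete types are illustrative):

```cpp
#include <iostream>
#include <set>
#include <string>
#include <utility>

// Comparer that considers only the key part of the pair,
// mirroring what std::map::value_compare does.
struct KeyComp {
    using value_type = std::pair<const int, std::string>;
    bool operator()(const value_type& a, const value_type& b) const {
        return a.first < b.first;
    }
};

int main() {
    std::set<KeyComp::value_type, KeyComp> m;
    m.insert({42, "answer"});
    m.insert({42, "ignored"});   // same key: rejected, exactly as in std::map

    // Lookup requires a full value_type, so the second part is a dummy
    // (see note 2 below).
    auto it = m.find({42, ""});
    if (it != m.end()) std::cout << it->second << '\n';   // prints: answer
}
```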
So, `?map` is essentially a convenience over `?set` for the common case where you want to index values by some key that is not already "inside the value itself".
If instead the key is already stored inside the value - which is generally what happens when we are indexing existing data by some of its properties - a `?set` with a custom comparer can be used, thus avoiding the replication of the key that a `?map` would require.
Let's consider an imaginary book database: you have a big `std::deque<std::unique_ptr<Book>>` that contains all the book objects, and you want to index them by `author` and by `title` for quick lookup.
In this case, you can use a `multiset<Book *, CustomComp>` for each field you want to index; the custom comparer implements an `operator<` that considers just a specific field of the pointed-to element, as in the sketch below.
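A possible sketch (names such as `ByAuthor` and `by_title` are illustrative, not fixed by anything above):

```cpp
#include <deque>
#include <memory>
#include <set>
#include <string>

struct Book {
    std::string author;
    std::string title;
};

// One comparer per indexed field; each considers just that field
// of the pointed-to Book.
struct ByAuthor {
    bool operator()(const Book* a, const Book* b) const {
        return a->author < b->author;
    }
};
struct ByTitle {
    bool operator()(const Book* a, const Book* b) const {
        return a->title < b->title;
    }
};

std::deque<std::unique_ptr<Book>> books;   // owns the Book objects
std::multiset<Book*, ByAuthor> by_author;  // index by author
std::multiset<Book*, ByTitle>  by_title;   // index by title
```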
When a new book is added you just need to add it to the `deque` and to all the indexes; when a book is removed you have to remove it from the `deque` and from the indexes. An edit requires first removing the book from the indexes, applying the changes, and then re-adding it: modifying it in place can leave the `multiset` instances in an inconsistent state, because you would be changing the ordering relation between the stored objects "behind their back".
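Continuing the sketch above, the bookkeeping could look like this; one subtlety is that erasing from an index needs `equal_range` plus a pointer comparison, because `find` alone may return a *different* book that merely compares equivalent:

```cpp
Book* add_book(std::string author, std::string title) {
    books.push_back(std::make_unique<Book>(Book{std::move(author), std::move(title)}));
    Book* b = books.back().get();
    by_author.insert(b);   // O(log N) per index
    by_title.insert(b);
    return b;
}

// Erase exactly this pointer from an index: scan the range of
// equivalent elements for the matching address.
template <typename Index>
void erase_from_index(Index& idx, Book* b) {
    auto [lo, hi] = idx.equal_range(b);
    for (auto it = lo; it != hi; ++it)
        if (*it == b) { idx.erase(it); break; }  // O(1) removal after the search
}

void retitle(Book* b, std::string new_title) {
    erase_from_index(by_title, b);    // remove *before* mutating the indexed field
    b->title = std::move(new_title);
    by_title.insert(b);               // re-add in its new position
}
```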
It's perfectly fine here to have a set which allows duplicate elements: they are duplicates only according to the comparer, which considers just one field, and it's perfectly normal to have multiple books by the same author, or different books with the same title. The point here is not "not allowing duplicates", it's "fast lookup by some key", exactly as if we were using a `multimap`, but without having to keep an extra copy of the key inside the indexes.
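And a lookup by key, again continuing the sketch: a dummy `Book` carries just the field that the comparer of the chosen index looks at.

```cpp
#include <iostream>

// Find all books by a given author through the by_author index.
void print_books_by(const std::string& author) {
    Book probe;
    probe.author = author;   // title is irrelevant: ByAuthor ignores it
    auto [first, last] = by_author.equal_range(&probe);
    for (auto it = first; it != last; ++it)
        std::cout << (*it)->title << '\n';
}
```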
Notes
1. When I write `?set` or `?map`, I'm talking about both the "regular" and "multi" variants. It's worth noting that the core of the discussion applies mostly in the same way even to the `unordered` counterparts, replacing the comparer with the hash function (a sketch follows these notes).
2. Of course, when doing a lookup you have to provide a dummy `second` argument, and that's why using a `map` is more convenient.
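For example, a hypothetical unordered variant of the author index above would pair a field-specific hash with a matching equality predicate (the names are again illustrative):

```cpp
#include <functional>
#include <unordered_set>

// Hash and equality that consider only the author field,
// playing the role the comparer had in the ordered index.
struct AuthorHash {
    std::size_t operator()(const Book* b) const {
        return std::hash<std::string>{}(b->author);
    }
};
struct AuthorEq {
    bool operator()(const Book* a, const Book* b) const {
        return a->author == b->author;
    }
};

std::unordered_multiset<Book*, AuthorHash, AuthorEq> by_author_unordered;
```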