Why are c++ standard maps very famous for, whereas an array of pairs is also similar those?
When would it be better to use c++ standard maps, and when to use an array of pairs? Or are the applications similar for both?
Why are c++ standard maps very famous for, whereas an array of pairs is also similar those?
When would it be better to use c++ standard maps, and when to use an array of pairs? Or are the applications similar for both?
Why are c++ standard maps very famous for, whereas an array of pairs is also similar to those?
The following table should give you a small idea, where to use c++ standard maps (only choose some) rather an array of pairs.
When would it be better to use c++ standard maps, and when to use an array of pairs?
You can always benchmark the efficiency of your data-structure, to see which one is apt for which situations.
Let's look into an example.
For instance, the followings are the bench-mark done in https://quick-bench.com/, for the insertion of 10000 elements in the beginning of each
std::vector<std::pair<int, std::string>>
) vsstd::map<int, std::string>
) vsstd::unordered_map<int, std::string>
).It turned out that using the C++ standard maps is faster/efficient than array of pairs for this kind of operations.
However, for the case of fewer entries (let's consider 10 elements), the same test shows that the usage of array of pairs (i.e. std::vector<std::pair<int, std::string>>
) would be faster/efficient than standard hash-map (i.e. std::unordered_map<int, std::string>
), and alomost equalent to standard map (i.e. std::map<int, std::string>
).
In short, the thumb rule is, always bench-mark and test for the operations what you will be having, before choosing the right data structure for best results. For further references have a look std::vector
, std::map
, std::unordered_map
, etc and their individual operations.
You use a Map when you are trying to do just that, map one item to another. An array of pairs is just storing a bunch of pairs and it provides no map functionality since you are still constrained by the index of the array.
A map is also handy for counting frequencies. Since a map (the standard map at least, there are variations of maps that allow duplicates) will not duplicate key values. You can use that to find the frequency of an element etc.
The main reason to use a map is when you want to find a value based on a key that you get to define. Its not so much about just storing items together as in a pair.
If you have a problem that lends itself to an associative mapping, you can use a tool like std::unordered_map
or std::map
(or Abseil's Swiss Tables or Boost's flat_map or similar) to make your job easier. Yes, you can always manage pairs in a array or std::vector
by yourself, and sometimes that is the right thing to do.
Reasons to use the standard containers include readability and maintainability of the code (most C++ programmers are familiar with the standard containers and algorithms to some degree), correctness (the standard containers have a long history and are far less likely to be buggy than home-spun data structure code), speed of development (reuse a solid design and focus on using the tools available rather than re-implementing the tool for marginal benefits), and possibly speed (more below).
Reasons you might want to roll your own or choose a non-standard container include special APIs on a particular container that make it a better fit in your use case, interfacing with code you don't control (e.g., a third-party library expecting pairs in contiguous memory), and performance requirements.
To wit: Different variants of a map or array will have different performance characteristics. There is the Big-O rating for each operation (lookup, insertion, deletion, etc.), and that is a good place to start. But that doesn't tell the whole story. For one, you may not use all these operations with the same frequency or urgency (e.g., you need lookups to be fast and don't care as much about deletions). For another, cache locality can end up dwarfing the theoretical Big-O value for an operation, but it all depends on your data characteristics, target machine's cache size, etc.
An array will be in contiguous memory and will tend (depending on size and usage patterns) to have better cache locality and overall performance than a std::map
which will tend to be spread out through memory. This is the same idea as a flat map, which has the interface of a map but the storage of an array underneath.
A hash-based map will have constant (O(1)
) lookup, but it can also suffer from a penalty for computing the hash (depending on the key type, which may not even be hashable; or hashing could be literally free for some key types), won't store items in a strictly ordered fashion for iteration (not a problem in many use cases), could devolve to O(n)
lookup time if the hasher has a lot of collisions, and depending on the implementation, may or may not be in contiguous memory that provides cache locality.
Of course you should guide your decisions with actual performance measurements rather than intuitions or guesses, which are notoriously inaccurate.
If you need a reasonable default, I'd start with std::unordered_map
, assuming your key is cheaply and easily hashable and you don't care about iteration order. Choose a different data structure when that map doesn't provide all the features you need or when your performance analysis says you need to do so.
Several points to note/consider:
While Jejo's answer is true in terms of asymptotic complexity, in many practical scenarios - a pair of vectors (or a vector of pairs) may actually be faster than using a map - even for insertion or erasure. See this presentation-fraction by Herb Sutter extols the virtues, in practice, of using arrays (std::vector
's) over other structures in various scenarios. It's a fun listen if you haven't had that drilled into you before...
There are a lot more than two options for you to choose from - depending on your actual needs:
std::unordered_map
, not std::map
.Have a look at this question:
vector or map, which one to use?
which compares using a map with a vector of pairs.
Standard-library maps are very slow as implementations! See this question: