24

How efficient is the find() function on the std::map class? Does it iterate through all the elements looking for the key such that it's O(n), or is it in a balanced tree, or does it use a hash function or what?

Publius
  • 1,184
  • 2
  • 10
  • 27
  • 4
    There is documentation for STL, and it usually states complexity. http://www.cplusplus.com/reference/stl/map/find/ – Don Reba Apr 01 '12 at 03:58
  • See: [What are the complexity guarantees of the standard containers](http://stackoverflow.com/q/181693/14065) – Martin York Apr 01 '12 at 11:37

3 Answers3

47

Log(n) It is based on a red black tree.

Edit: n is of course the number of members in the map.

David D
  • 1,571
  • 11
  • 12
  • Although this is true to an extent, there is a limitation in larger maps. If you're looking at very large datasets I'd recommend also looking at alternative associative array containers. – std''OrgnlDave Apr 01 '12 at 04:16
  • 2
    True it is log(n). Not true it is based on red/black trees. The standard defines the operation to have a max complexity of log(n) and the most affective way of achieving this is to use red/black trees (though this is not a requirement). Thus you have your cart before the horse. – Martin York Apr 01 '12 at 11:33
  • @OrgnlDave: It is always true (not to an extent). – Martin York Apr 01 '12 at 11:34
  • @LokiAstari No it isn't. When you get bigger data that doesn't fit inside several batches of cache, like let's say 100MB of data, an std::map is NOT going to perform log(n), and there are much, much better options. It is log(n) in a perfect world where there are no memory access constraints, etc., and STL containers do not live in that perfect world – std''OrgnlDave Apr 01 '12 at 18:43
  • 2
    @OrgnlDave: Yes it will. Standard guarantees it. I don't think you understand what complexity is (your statement may apply to absolute speed but not complexity). And 100MB is still small on modern machines, it is unlikely you could actually measure the difference. – Martin York Apr 01 '12 at 18:46
  • @LokiAstari Now you are just trolling. Very funny :-P But I will indulge you; say you are using std::string of minimum length k, you actually get *at least* O(log(n) * k) minimum with similar prefix strings. All the standard guarantees is that find() will perform log(n) comparison operations, not that it will execute with log(n) complexity or speed. – std''OrgnlDave Apr 01 '12 at 18:50
  • I meant *around*. Sigh. find() may be log(n) but it itself performs operations. It is true that it is always log(n) if you take n to be the number of operations it would do doing a linear search (one list item after another), but n is defined as the number of elements, and that makes it more than log(n) – std''OrgnlDave Apr 01 '12 at 18:57
  • 1
    @OrgnlDave: **NO** it does not. – Martin York Apr 01 '12 at 19:51
  • 2
    @std''OrgnlDave: I think you should read [this wikipedia page](https://en.wikipedia.org/wiki/Big_O_notation). `std::map::find` is definitely `log(n)`. The very fact that you mention cache and "real world constraints" tells us that you have a misunderstanding of the big O notation. What complexity do you believe `std::map::find` has? – Mooing Duck Jul 10 '13 at 02:41
  • if map keys are string , then how long it take to search a key. – Sanjit Prasad Jul 04 '18 at 17:41
30

std::map and std::set are implemented by compiler vendors using highly balanced binary search trees (e.g. red-black tree, AVL tree).

As correctly pointed out by David, find would take O(log n) time, where n is the number of elements in the container.

But that's with primitive data types like int, long, char, double etc., not with strings.

If std:string, lets say of size 'm', is used as key, traversing the height of the balanced binary search tree will require log n comparisons of the given key with an entry of the tree.

When std::string is the key of the std::map or std::set, find and insert operations will cost O(m log n), where m is the length of given string that needs to be found.

Arif Ali Saiyed
  • 657
  • 7
  • 11
  • I was going to upvote this for pointing out that the individual comparisons are not necessarily O(1). But then you made the edit about Java, which I don't understand. The hash key returned by `hashCode()` is not a unique identifier, so you still need to make an O(m) string comparison when you compare two keys. – jogojapan Jul 10 '13 at 02:36
  • Also, generating the hashcode is still O(m), so not only is it not faster, using the hashcodes would be _slower_ (assuming they aren't cached) – Mooing Duck Jul 10 '13 at 02:46
  • 1
    @jogojapan Thanks for pointing out java.lang.String.hashCode() thing, corrected my answer by removing the javaj portion and sticking to question being asked. Also raised a [question] ( http://stackoverflow.com/questions/17569651/are-string-hashcode-in-java-pre-computed) – Arif Ali Saiyed Jul 10 '13 at 11:49
4

It does not iterate all elements, it does a binary search (which is O(log(n))). It use operator< or a comparator to do the search.

If you want a hash map, you can use a std::unordered_map (added on C++-0x), which use a hash function and on average (depending on the hash function and data you provide) find() will be O(1).

fbafelipe
  • 4,862
  • 2
  • 25
  • 40
  • @NicolBolas: I remember reading somewhere that it wasn't mandatory a balaced tree, thanks for your comment. Fixed my anwer. – fbafelipe Apr 01 '12 at 06:56