
I have an upcoming interview and was looking through some technical interview questions when I came across this one. It asks for the time complexity of the insertion and deletion functions of a hash map. The consensus seems to be that the time complexity is O(1) if the hash map is distributed evenly, but O(n) if all the keys hash to the same bucket.

I guess my question is: how exactly are hash maps stored in memory? How would these two cases happen?

steveclark
    That depends on the implementation - http://en.wikipedia.org/wiki/Hash_table gives some options. – Jon Skeet May 07 '15 at 08:35
  • Okay, but how would we have control over the implementation? From my understanding, all we are doing is using a keyword like `map` or `dict` to create the hash map and then using something like `myMap["key"] = "value"` to populate the map. – steveclark May 07 '15 at 08:42
  • I don't understand what you're trying to get at. If you're asking how one specific platform/library implements it, you should say which one you care about. If you only care about the Python implementation, you should have used the `python` tag and specified that in the question... – Jon Skeet May 07 '15 at 08:44
  • Well I am trying to figure out how they got to the answer they are giving in the link I provided. It is not platform specific. I just mentioned `dict` because that is Python's map keyword. – steveclark May 07 '15 at 08:47
  • But the question you asked is "how exactly they're stored in memory" which is implementation-specific. You don't need to *control* the implementation you use to be aware of there being different implementation options. – Jon Skeet May 07 '15 at 08:54

1 Answer


One answer on your linked page is:

insertion always would be O(1) if even not properly distributed (if we make linked list on collision) but Deletion would be O(n) in worst case.

This is not a good answer. A generalized answer for hash map time complexity matches the summary table in the Wikipedia article on hash tables:

Time complexity
in big O notation

          Average   Worst case
Space     O(n)      O(n)
Search    O(1)      O(n)
Insert    O(1)      O(n)
Delete    O(1)      O(n)

To address your question of how hash maps are stored in memory: there are a number of "buckets" that store the values. In the average case each key lands in its own bucket, but when a hash collision occurs, a bucket must be expanded into some kind of list. Good explanations of hash tables are the Wikipedia article, this SO question and this C++ example.
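The bucket idea can be sketched in a few lines of Python. This is an illustrative separate-chaining table, not how any particular library implements it; the class and method names are made up for the example.

```python
# Minimal separate-chaining hash table sketch: each bucket is a plain
# Python list of (key, value) pairs, and hash() picks the bucket.
class ChainedHashMap:
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # The hash of the key, modulo the bucket count, selects a bucket.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # a collision just grows the list

    def get(self, key):
        for k, v in self._bucket(key):   # linear scan within one bucket
            if k == key:
                return v
        raise KeyError(key)

m = ChainedHashMap()
m.insert("key", "value")
print(m.get("key"))  # value
```

While the buckets stay short, `insert` and `get` touch only a handful of entries, which is the O(1) average case from the table above.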

The time complexity table above looks like this because in the average case a hash map just looks up and stores single values, but collisions make everything O(n) in the worst case, where all your elements share one bucket and the behaviour degrades to that of the list implementation chosen for that case.
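You can force that worst case deliberately. In the sketch below, a key class with a constant hash sends every entry to the same slot, so each lookup degrades to a linear scan of the colliding keys. (CPython's `dict` uses open addressing rather than chaining internally, but a constant hash still produces the same O(n) behaviour; the `BadKey` class is invented for this demo.)

```python
# Worst case demo: every key hashes to the same value, so all entries
# collide and lookups must compare keys one by one.
class BadKey:
    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42                      # constant hash: every key collides

    def __eq__(self, other):
        return self.name == other.name

d = {BadKey(f"k{i}"): i for i in range(1000)}
# Each lookup now compares against roughly half the colliding keys on
# average, so it is O(n) instead of O(1).
print(d[BadKey("k500")])  # 500
```

With a well-distributed hash the same 1000 lookups would each be constant time; here they all fight over one slot.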

Note that there are specialized implementations that address these worst cases, also described in the Wikipedia article, but each of them has other disadvantages, so you'll have to choose the best one for your use case.

schnaader
  • Very helpful, thank you. I guess I am a little more concerned about how these different implementations occur. As I mentioned in a previous comment, since all we have to do to create a map is use a keyword, wouldn't it be up to the compiler how keys in the map are stored? I just don't understand how the keys would be stored evenly or in a single bucket when all I am doing is using a keyword to create the map and then using the map class's functions to add or remove items from it. – steveclark May 07 '15 at 09:06
  • What compiler/lib are you talking about? Most hash table implementations require you to provide a hash function (C++ hash_map, e.g.). If you're talking about Java's `HashMap`, there a default `hashCode` method is provided with objects, but you can override it. –  May 08 '15 at 04:18
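The Python analogue of overriding Java's `hashCode` is defining `__hash__` (and `__eq__`) on your own key class; user-defined keys thereby control their own bucket placement without you touching the `dict` implementation. The `Point` class here is a made-up example.

```python
# Python counterpart to overriding hashCode: a user-defined key type
# supplies its own __hash__ and __eq__, which dict uses for bucketing.
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __hash__(self):
        # Delegate to the built-in tuple hash for a well-distributed value.
        return hash((self.x, self.y))

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

grid = {Point(1, 2): "a"}
print(grid[Point(1, 2)])  # a
```

So even though you only "use a keyword" to create the map, the quality of the key distribution comes from the hash function attached to the keys themselves.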