
The Hash table wiki entry lists its Big O as:

Search: O(n)
Insert: O(n)
Delete: O(n)

while a Java HashMap is listed with Big O as:

get: O(1)
put: O(1)
remove: O(1)

Can someone please explain why the Big O differs between the concept and the implementation? I mean, if there is an implementation with a worst case of O(1), then why is there a possibility of O(n) in the concept?

Songo
    Basically you need to differentiate between "worst case" and "average" - the O(1) you're quoting for HashMap is average, not worst possible. – Jon Skeet Dec 18 '13 at 11:54
  • Maybe interesting for you: http://stackoverflow.com/questions/1055243/is-a-java-hashmap-really-o1 – kai Dec 18 '13 at 11:56
  • @JonSkeet I know this will sound really stupid, but isn't the Big O used to describe the worst case only? – Songo Dec 18 '13 at 12:00
  • @Songo: In theory, yes. In practice, many people (myself included) are a little less precise about it. – Jon Skeet Dec 18 '13 at 12:01
  • @JonSkeet then the correct thing to say is that a HashMap has a Big-Theta(1), but a Big-O(n)? – Songo Dec 18 '13 at 12:03
  • @Songo: Quite possibly, yes. – Jon Skeet Dec 18 '13 at 12:44
  • @JonSkeet No. In the strictest theoretical sense, both big O and big Theta describe *bounds on functions*, the difference being between upper (big O) and tight (big Theta) bounds. The worst/average/best case thing is orthogonal to the kind of bound, it describes *what* you bound: You can bound the time taken for the worst-case input, or the time taken for average inputs. I've used time as an example function here, but of course other metrics can be used as well: Space taken, how often some operation is done, the height of a tree, the number of links in a graph, the number of cache misses, etc. – delnan Dec 18 '13 at 14:21
  • @delnan: Makes sense. I was sufficiently unsure (and didn't have the time to check) that I limited myself to "quite possibly" rather than a more confident "yes" :) – Jon Skeet Dec 18 '13 at 14:30
  • @delnan then it's correct to use Big O to describe the upper bound for the best, average & worst cases for a certain algorithm? For example one might say the worst case scenario for searching for an element in an array is upper bounded by O(n) and the best case scenario is upper bounded by O(1)? – Songo Dec 18 '13 at 14:38
  • @Songo Fully correct. – delnan Dec 18 '13 at 14:39
  • @delnan then why do some people use Big Omega when talking about the best case scenario? why don't we use Big O everywhere instead? – Songo Dec 18 '13 at 14:42
  • @Songo Example? Big Omega is a *lower* bound. Giving a lower bound for the best case might occasionally be useful ("even in the best case it still takes quadratic time"), but again it's orthogonal. As for using big O everywhere: A lot of people *do* use the letter O for different kinds of bounds, as Jon Skeet hinted at. We differentiate between upper, tight and lower bounds because tight bounds are what you usually want (and get), but sometimes the analysis is too complicated and you only get upper/lower bounds ("it takes *at most* linear time, but possibly less"). – delnan Dec 18 '13 at 15:28
  • @delnan that's a great explanation. You should post all ur comments as an answer and I'll accept it ;) – Songo Dec 18 '13 at 18:50

3 Answers


The worst case is O(n) because it is possible that every entry you put into the HashMap produces the same hash value (let's say 10). That causes a collision for every entry, because every entry ends up at index 10. Depending on which collision resolution strategy is implemented, the HashMap either keeps a list of entries at index 10 or moves the new entry to the next free index.

Nevertheless, when an entry is accessed again, its hash value is used to compute the initial index in the HashMap. As that index is 10 in every case, the HashMap has to resolve the collision again, potentially walking through all n colliding entries.
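
As a rough illustration of that worst case (the key class BadKey below is invented for this example, not part of any real API), a key whose hashCode() always returns the same value forces every entry of a java.util.HashMap into a single bucket, so operations degrade toward linear time:

```java
import java.util.HashMap;
import java.util.Map;

public class WorstCaseDemo {

    // Hypothetical key whose hashCode() is constant, so every
    // instance collides in the same HashMap bucket.
    static final class BadKey {
        private final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int hashCode() {
            return 10; // constant hash value: all keys map to one bucket
        }
    }

    public static void main(String[] args) {
        Map<BadKey, String> map = new HashMap<>();

        // Every put lands in the same bucket, so each insertion has to
        // compare against the keys already stored there: roughly O(n).
        for (int i = 0; i < 10_000; i++) {
            map.put(new BadKey(i), "value" + i);
        }

        // get() computes the same index from the hash and then has to
        // scan the colliding entries one by one instead of finding the
        // value in a single step.
        System.out.println(map.get(new BadKey(1234)));
    }
}
```

(Recent JDKs soften this particular case by turning very large buckets into balanced trees, but the classic analysis quoted on Wikipedia assumes the simple list behaviour described above.)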

mvieghofer

Because there's a difference between worst case and average case, and even Wikipedia lists the O(1) complexity for the average case. Java's HashMap is exactly the same data structure as Wikipedia's hash table; the two sources are simply documenting different cases.

Basically, hash tables compute a numerical value from the object you want to store. That numerical value is roughly used as an index to access the location where the object is stored (leading to O(1) complexity). However, sometimes several objects may lead to the same numerical value. In that case those objects are stored in a list at the corresponding location in the hash map, hence the O(n) complexity for the worst case.
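
To make that concrete, here is a minimal sketch of a hash table using separate chaining (the class SimpleHashTable is invented for illustration and ignores resizing, null keys and other details a real implementation such as java.util.HashMap handles):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative chained hash table: each bucket holds a list of entries.
class SimpleHashTable<K, V> {

    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets;

    SimpleHashTable(int capacity) {
        buckets = new ArrayList<>(capacity);
        for (int i = 0; i < capacity; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // The key's hash code is reduced to a bucket index.
    private int indexFor(K key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    public void put(K key, V value) {
        List<Entry<K, V>> bucket = buckets.get(indexFor(key));
        for (Entry<K, V> e : bucket) {
            if (e.key.equals(key)) {
                e.value = value; // key already present: overwrite
                return;
            }
        }
        bucket.add(new Entry<>(key, value)); // colliding keys pile up here
    }

    public V get(K key) {
        // Average case: jump straight to the right bucket in O(1)...
        for (Entry<K, V> e : buckets.get(indexFor(key))) {
            // ...worst case: every key hashed here, so this scan is O(n).
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }
}
```

When keys spread evenly over the buckets each list stays short, which is why the average case is constant time; when they all collide, get and put degenerate into the linear scan above.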

Raffaele Rossi
  • well several blogs ([example1](http://www.javacodegeeks.com/2011/04/simple-big-o-notation-post.html), [example2](http://www.javaexperience.com/time-complexity-of-collection-classes/)) mention that it has Big-O(1), shouldn't the average case be expressed as Big-Theta(1) instead? – Songo Dec 18 '13 at 12:13

I'm not sure where you found the reported complexity of a Java HashMap, but it is listing the average case, which matches what Wikipedia states on the page you linked.

Andrew Stubbs
  • well several blogs ([example1](http://www.javacodegeeks.com/2011/04/simple-big-o-notation-post.html), [example2](http://www.javaexperience.com/time-complexity-of-collection-classes/)) mention that it has Big-O(1), shouldn't the average case be expressed as Big-Theta(1) instead? – Songo Dec 18 '13 at 12:12