Why is the cost of a hash lookup O(1) when evaluating the hash function might take more time than that?

Question

A HashMap (or) HashTable is an example of keyed array. Here, the indices are user-defined keys rather than the usual index number. For example, arr["first"]=99 is an example of a hashmap where theb key is first and the value is 99.

Since keys are used, a hashing function is required to convert the key to an index element and then insert/search data in the array. This process assumes that there are no collisions.

Now, given a key to be searched in the array and if present, the data must be fetched. So, every time, the key must be converted to an index number of the array before the search. So how does that take a O(1) time? Because, the time complexity is dependent on the hashing function also. So the time complexity must be O(hashing function's time).

`O(hashing function's time)=O(1)` because usually hash of string does not depend on the whole string, just on first/last `O(1)` bytes of it. — Egor Skriptunoff, Jun 12 '15 at 13:42
http://stackoverflow.com/questions/3949217/time-complexity-of-hash-table AND http://stackoverflow.com/questions/6873039/complexity-of-hashing AND http://cs.stackexchange.com/questions/249/when-is-hash-table-lookup-o1 — Amirhossein Mehrvarzi, Jun 12 '15 at 13:42
@EgorSkriptunoff no that is not the reason: the work of hashing of a (relatively short) string compared to a large amount of work `N` (number of entries in the map) diminishes as `N -> inf` - i.e. if the length of the string keys is bounded by `k` hashing any key is constant — BeyelerStudios, Jun 12 '15 at 13:45
Possible duplicate of [Can hash tables really be O(1)?](http://stackoverflow.com/questions/2771368/can-hash-tables-really-be-o1) — DavidRR, May 14 '17 at 17:50

score 2 · Accepted Answer · answered Jun 12 '15 at 22:02

When talking about hashing, we usually measure the performance of a hash table by talking about the expected number of probes that we need to make when searching for an element in the table. In most hashing setups, we can prove that the expected number of probes is O(1). Usually, we then jump from there to "so the expected runtime of a hash table lookup is O(1)."

This isn't necessarily the case, though. As you've pointed out, the cost of computing the hash function on a particular input might not always take time O(1). Similarly, the cost of comparing two elements in the hash table might also not take time O(1). Think about hashing strings or lists, for example.

That said, what is usually true is the following. If we let the total number of elements in the table be n, we can say that the expected cost of performing a looking up the hash table is independent of the number n. That is, it doesn't matter whether there are 1,000,000 elements in the hash table or 10¹⁰⁰ - the number of spots you need to prove is, on average, the same. Therefore, we can say that the expected cost of performing a lookup in a hash table, as a function of the hash table size, is O(1) because the cost of performing a lookup doesn't depend on the table size.

Perhaps the best way to account for the cost of a lookup in a hash table would be to say that it's O(T_hash + T_eq), where T_hash is the time required to hash an element and T_eq is the time required to compare two elements in the table. For strings, for example, you could say that the expected cost of a lookup is O(L + L_max), where L is the length of the string you're hashing and L_max is the length of the longest string stored in the hash table.

Hope this helps!

I understand that T_hash can be significant speacially if L is large, however, I assume that a good implementation of a hash table won't compare strings during lookup only hash comparisons. I don't think that the length of the strings stored in the hash table should matter. — Albino Cordeiro, Mar 14 '18 at 00:27
@AlbinoCordeiro At some point you have to compare the strings themselves, since otherwise if you have a string that has a true hash collision with another string in the table you'll incorrectly report that the string is present even though it's not really there. — templatetypedef, Mar 14 '18 at 02:01

Why is the cost of a hash lookup O(1) when evaluating the hash function might take more time than that?

1 Answers1