0

I need to understand the architecture and functions of hash set very well.

Compared with STL::set, what is the advantage of hash set compared with STL::set ? I think the O(1) time to do search. If this is true, why not to use hash table ? Their difference is duplicated element ? or others ?

For STL::set, the search time of smallest/largest is also O(1), because it has been ordered.

A hash set is not a binary search tree, how to find the smallest or largest element with O(1) ?

After reading What is the difference between set and hashset in C++ STL?

I cannot find the answer .

My idea:

When should use hash set not hash table ?

STL::set is ordered set. So, it is O(1) to get smallest/largest element.

What if for hash set ? it is ordered ?

thanks

Community
  • 1
  • 1
user1002288
  • 4,860
  • 10
  • 50
  • 78
  • 1
    What makes you think you *can* find the largest or smallest element in O(1)? – Jon Skeet Dec 26 '11 at 18:46
  • cant understand, you want only smallest/largest elements in O(1) or lookup in O(1) or both to be O(1) ? – jackdoe Dec 26 '11 at 18:51
  • How does a hash set work? (It is essentially the same as how a hash table works.) That is the answer. The first answer in the linked post answers the questions here: "essentially random", "[not] ordered". –  Dec 26 '11 at 18:54

2 Answers2

3

A hash set is not a binary search tree, how to find the smallest or largest element with O(1)?

This is exactly one of the key differences: you can't find the smallest/largest element in a hash set in constant time. You can, of course, do it in O(n) time by scanning the entire set.

Another key difference is that iterating over a hash set does not return the elements in sorted order.

NPE
  • 486,780
  • 108
  • 951
  • 1,012
0

A hash set is basically a hash table without stored values (only keys), and std::set in C++ are implemented as a balanced binary search tree.

You should read on the differences in some algorithm/computer science book, as you seem to have some basic misconceptions, for example, in a binary search tree the cost of finding the smallest/largest element is logarithmic O(log N), not constant O(1).

Depending on the operations that you need to perform most often, either data structure will be more appropriate. If you need to find the smallest element quite often, then a std::set will perform the operation in O(log N), but with a hash table you will need to check all elements and that means linear time O(N). If on the other hand that operation is not common and regular lookups (is element a in the set?) are more common, the constant time lookup of a hash set will be better than O(log N) lookup in a set.

David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
  • 1
    Why would it not be O(1) to find the smallest/ largest element in a `std::set`? `set.begin()` and `set.rbegin()` are your friends. – Xeo Dec 26 '11 at 20:47
  • @Xeo, unless the set contains a specific optimization a tree might require O(log n) to find the first or last element. – Mark Ransom Dec 26 '11 at 20:53
  • @Xeo, you might be right - according to this reference, set.begin and set.rbegin are required to have constant complexity. I'm not sure whether to trust the reference though. http://www.cplusplus.com/reference/stl/set/begin/ – Mark Ransom Dec 26 '11 at 21:03
  • @Mark: At least one recent standard draft requires constant complexity for `begin` and `end` for all containers. – Philipp Dec 26 '11 at 22:53
  • @Mark: Both C++03 and C++11 standards require constant complexity in the `c.begin()` and `c.rbegin()` for container and reversible container requirements (see C++03 §23.1 p5 & p9 and C++11 §23.2.1 p4 & p9 for requirements, aswell as C++03 §23.3.3 p2 and C++11 §23.4.6.1 p2 for which requirements `std::set` satisfies). – Xeo Dec 27 '11 at 06:58
  • @Xeo: I have not looked in detail the new standard. The `O(log N)` time is the standard time to find the first/last element in a balanced binary tree. Of course, an implementation can decide to cache those two specific values --and it seems that the standard mandates it. – David Rodríguez - dribeas Dec 27 '11 at 09:36