1

If I make analogy to Oracle database : they have the notion of "index" and "hint" to speed the execution of a query. So I want to make the analogy to Java : is it possible to make the execution of the "contains" method of a collection to be quick ?

pheromix
  • 18,213
  • 29
  • 88
  • 158

2 Answers2

4

is it possible to make the execution of the "contains" method of a collection to be quick ?

Yes, use a collection with a faster contains method.

e.g.

HashSet -> O(1)
SortedSet -> O(ln N)
List -> O(N)

The big-O time complexity indicates how the worst case time for an operation as the size of the collection grows. Lower time complexity is faster for modest to large collections.

Note: this only tells you how the collection scales. If the collection is very small like 1 or 2, then a List might be the fastest.

The Big-O notation is well explained What is a plain English explanation of "Big O" notation?

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
1

It depends on what do you want to do with that collection. These are the standard options:

  1. If the order of insertion doesn't matter, then you can use a HashSet. Once you have this Set set up, the contains() method will run in O(constant) time.
  2. Alternatively, you can use a LinkedHashSet. It is almost the same as the HashSet, but it maintains a LinkedList on the top of the Set so you can maintain the order of original insertion.
  3. If you want to maintain the natural ordering of your items, you can use a TreeSet. Its contains() method will return in O(log n) time which is still fairly quick.

If none of these is good for you, then you can have your collection, assign another HashSet and maintain them together. However, I'm afraid that the performance gain isn't justified at most of the cases.

Tamas Rev
  • 7,008
  • 5
  • 32
  • 49
  • what does mean the `O` notation ? – pheromix Jun 07 '17 at 09:00
  • It's the notation to measure complexity. It's well explained https://stackoverflow.com/q/487258/4723795 – xenteros Jun 07 '17 at 09:02
  • 1
    The big-oh notation shows the time required for computing the results *compared to the input size*. I.e. `O(n)` means that if the list becomes twice as big, the `contains()` will run twice as much. `O(constant)` means that the run time is independent from the `Set` size - this is the very cool. `O(log n)` means that if the `TreeSet` becomes twice as big, the `contains` will run sg like 1.2 times as much. – Tamas Rev Jun 07 '17 at 09:05