3

I have a small question: I have a Collection with 5-10 Integers. They do not change or anything else, but they are constanly there. So my question which Collection whould be the best/fastest for the contains method: ArrayList, LinkedList, Vector, HashSet or...?

apxcode
  • 7,696
  • 7
  • 30
  • 41
user3434681
  • 85
  • 1
  • 6
  • 5
    With only 5-10 integers, i don't think your performance is going to be effected at all whether you went with either of those. – austin wernli Nov 11 '14 at 19:32
  • Your collection is too small for it to make any difference. – apxcode Nov 11 '14 at 19:32
  • 2
    But here is a good reference if you want to find out yourself. http://bigocheatsheet.com/ – austin wernli Nov 11 '14 at 19:33
  • 1
    Actually IMO (I'm going _against the tide_) if collection is heavily used then yes, **there is a (big) difference**. Let's imagine, for example, a collection of local variables in a parser (just to pick a school book example). More often than not such heavily optimized collection doesn't exist unless general requirements match your scenario (yes, in that case, HashSet is good but it's **not the optimal one**). Assuming you really need it then: How it's searched? How often? Any valuable search pattern? Is there one (or more elements) searched more often than others? Is this dynamic or static? – Adriano Repetti Nov 11 '14 at 19:50

2 Answers2

1

Due to the way Computers load data (CPU optimizations assume you are going to use adjacent memory addresses)
Using small arrays is probably going to be the quickest solution here.

If the actual numbers are going to be small number, Java will optimize Integer (non primitives) so using an ArrayList should produce similar results to using an int[].
If you are using larger number( not between -128 to 127), the int[] array will produce better results.

The overhead of using a Hashset is probably going to cost more then actually running 5-10 compares on an int[] array.

And you may want to consider putting these numbers into a series of IF conditions.

All of the above, will only have an impact if you run these compares at least 100,000 times a second. Anything less then that, it doesn't really matter.

To summarize:
Code << int[] array << ArrayList << HashSet

  • I think this is a good answer, and it agrees with my intuition. But then, you really won't know unless you measure. – Sjlver Nov 11 '14 at 20:12
  • 1
    @Sjlver sorry but I think it's a "bar" answer. In general it's true but (of course assuming OP really needs this small optimization) it's too generic to be useful. Is array faster than HashSet? Maybe, what if most searched element is always last one? Are array/HashSet a good choice? Maybe, **we don't know** according to which criterion collection is searched then **we can't answer**. Any useful precondition? For example: collection contains integers from 0 to (relatively small) N. In this case a bit array is even faster. And so on (IMO). – Adriano Repetti Nov 12 '14 at 11:08
1

I'd bet that the best results can be achieved via perfect hashing, assuming you want to spend some non-negligible time creating the set. There are some general ways for computing such a hashing, but in your case, they might be inferior to a simple array scan due to their fixed overhead.

In case the creation time does not matter, consider an algorithm like in this answer of mine, where an expression of the form

TABLE[(MULTIPLIER * c) >> SHIFT] == c

gets used to test if c is contained in the set. Finding a proper MULTIPLIER takes some time, but the test is extremely fast.

Community
  • 1
  • 1
maaartinus
  • 44,714
  • 32
  • 161
  • 320