We will assume that both arrays are roughly the same length, so that n is used to describe the length of both of them. (Otherwise we need to introduce a new variable.)
Overview of alternatives:
Linear search: If we store the second array as an unsorted array/list, then each search takes O(n) time and the total time is O(n2).
Binary search: If we sort the second array this takes O(n log n) time. Each query takes O(log n) time, for a total of O(n log n) time.
Balanced tree: If we store the second array in a TreeSet
, each search takes O(log n) time for a total of O(n log n) time.
Hash table: If we store the second array in a HashSet
like you mentioned, each insertion takes O(1) amortized time and each search takes O(1) time, for a total of O(n) time as wanted.
Bit vector: If we create a 232-element bit vector to cover all the possible Java int
or Integer
values, we can add each element of the second array and test for existence of a value in O(1) time, for a total of O(n) time as wanted. However there is a very expensive constant term for the initialization of 232 elements.
Lower bound argument: The second array has length O(n). It is impossible to accurately tell if a value is in the second array or not if we haven't scanned the entire array. Thus the absolute minimum time to solve this problem is Ω(n). This means that O(n) is the best we can possibly do.