Efficient algorithm to determine if two sets of numbers are disjoint

Question

Practicing for software developer interviews and got stuck on an algorithm question.

Given two sets of unsorted integers with array of length m and other of 
length n and where m < n find an efficient algorithm to determine if 
the sets are disjoint. I've found solutions in O(nm) time, but haven't 
found any that are more efficient than this, such as in O(n log m) time.

O(n+m) is the best you could do. Also, O(n) is more efficient than O(n log m) — Daniel Williams, Jul 08 '14 at 19:44
If you can't immediately think of an algorithm to solve something, a good first step is to try sorting the inputs. :) — j_random_hacker, Jul 08 '14 at 19:45
@j_random_hacker I thought about sorting and couldn't figure out anything that could be sorted and do comparisons while coming in at O(n log m) time. — MNRC, Jul 08 '14 at 19:46
When you have sorted them, compare the first 2 elements. There are 3 possible cases: (1) They're equal, in which case obviously the two lists share that element and you're done; (2) the element from list 1 is smaller; (3) the element from list 2 is smaller. In cases (2) and (3), notice that the smaller element must be smaller than *all* the elements in the other list, so you can throw it away and start again! — j_random_hacker, Jul 08 '14 at 19:51

score 9 · Accepted Answer · answered Jul 08 '14 at 19:47

Using a datastructure that has O(1) lookup/insertion you can easily insert all elements of first set.

Then foreach element in second set, if it exists not disjoint, otherwise it is disjoint

Pseudocode

function isDisjoint(list1, list2)
    HashMap = new HashMap();
    foreach( x in list1)
        HashMap.put(x, true);

    foreach(y in list2)
        if(HashMap.hasKey(y))
             return false;
    return true;

This will give you an O(n + m) solution

score 4 · Answer 2 · answered Jul 08 '14 at 19:46

4

Fairly obvious approach - sort the array of length m - O(m log m). For every element in the array of length n, use binary search to check if it exists in the array of length m - O(log m) per element = O(n log m). Since m<n, this adds up to O(n log m).

answered Jul 08 '14 at 19:46

Pradhan

16,391
3
44
59

score 2 · Answer 3 · answered Jul 08 '14 at 19:49

2

Here's a link to a post that I think answers your question.

3) Sort smaller O((m + n)logm)

Say, m < n, sort A
Binary search for each element of B into A

Disadvantage: Modifies the input

answered Jul 08 '14 at 19:49

Intrepid Traveller

95
6

score 2 · Answer 4 · answered Jul 08 '14 at 20:03

Looks like Cheruvian beat me to it, but you can use a hash table to get O(n+m) in average case:
*Insert all elements of m into the table, taking (probably) constant time for each, assuming there aren't a lot with the same hash. This step is O(m)
*For each element of n, check to see if it is in the table. If it is, return false. Otherwise, move on to the next. This takes O(n).
*If none are in the table, return true.

As I said before, this works because a hash table gives constant lookup time in average case. In the rare event that many unique elements in m have the same hash, it will take slightly longer. However, most people don't need to care about hypothetical worst cases. For example, quick sort is used more than merge sort because it gives better average performance, despite the O(n^2) upper bound.

Efficient algorithm to determine if two sets of numbers are disjoint

4 Answers4

Linked

Related