0

What is the best algorithm to find all duplicates in two arrays?

What I can think of is brute force algorithm.

Comparing two arrays directly, and once found the same number, store it in the auxiliary array. But the time complexity is O(n2).

Bartosz Przybylski
  • 1,628
  • 10
  • 15
J.L
  • 592
  • 5
  • 17
  • 35
  • Possible duplicate http://stackoverflow.com/questions/245509/algorithm-to-tell-if-two-arrays-have-identical-members?rq=1 –  Oct 13 '12 at 08:07
  • Note: You are basically looking for a set intersection. [There was a question about it yesterday](http://stackoverflow.com/q/12863904/572670) (with more restrictions) – amit Oct 13 '12 at 08:12

4 Answers4

6
  • add the numbers of the first array to a hash structure (hashset)
  • for each number in the second array, if in hashset, add to final array, if not ignore

That will be O(n+m) (sizes of the arrays).

assylias
  • 321,522
  • 82
  • 660
  • 783
3

There is an O(n log n) algorithm.

Sort arr1 and arr2 using quick sort or merge sort

i = 0
j = 0
found = 0
while i < arr1.length and j < arr2.length:
      if (arr1[i] == arr2[j])
          found = found + 1
          i = i + 1
          j = j + 1
      else if (arr1[i] < arr2[j])
          i = i + 1
      else
          j = j + 1
2

Sort the array, then go through both arrays at the same time, always advance an array when the current element is smaller than the other. Complexity: O(nlogn)

phant0m
  • 16,595
  • 5
  • 50
  • 82
1

You can identify duplicates in arrays in O(n) time. This approach uses hashmap, heres pseudocode:

// a and b are input arrays
HashMap h
for e in a:
    if h[e] == 0:
        h[e]++
for e in b:
    if h[e] != 0:
        h[e]++
for e in h:
    if e > 1:
        print "Duplicate" + e

Hashmap[Element] is syntactic sugar and means:

Get element with key e, if no present, then create it with initializer 0

Bartosz Przybylski
  • 1,628
  • 10
  • 15