Algorithm to find all duplicates in two arrays

Question

What is the best algorithm to find all duplicates in two arrays?

What I can think of is brute force algorithm.

Comparing two arrays directly, and once found the same number, store it in the auxiliary array. But the time complexity is O(n²).

Possible duplicate http://stackoverflow.com/questions/245509/algorithm-to-tell-if-two-arrays-have-identical-members?rq=1 — , Oct 13 '12 at 08:07
Note: You are basically looking for a set intersection. [There was a question about it yesterday](http://stackoverflow.com/q/12863904/572670) (with more restrictions) — amit, Oct 13 '12 at 08:12

score 6 · Answer 1 · answered Oct 13 '12 at 08:07

6

add the numbers of the first array to a hash structure (hashset)
for each number in the second array, if in hashset, add to final array, if not ignore

That will be O(n+m) (sizes of the arrays).

answered Oct 13 '12 at 08:07

assylias

321,522
82
660
783

Can you please give an example explaining what O,n,m are. – MarzSocks Jun 13 '15 at 05:53
O(n+m) space space complexity . O(1) could be better. with O(nlogn) time complexity when the arrays are big – meesinlid Jun 26 '20 at 00:23

score 3 · Answer 2 · answered Oct 13 '12 at 08:06

There is an O(n log n) algorithm.

Sort arr1 and arr2 using quick sort or merge sort

i = 0
j = 0
found = 0
while i < arr1.length and j < arr2.length:
      if (arr1[i] == arr2[j])
          found = found + 1
          i = i + 1
          j = j + 1
      else if (arr1[i] < arr2[j])
          i = i + 1
      else
          j = j + 1

score 2 · Answer 3 · answered Oct 13 '12 at 08:07

2

Sort the array, then go through both arrays at the same time, always advance an array when the current element is smaller than the other. Complexity: O(nlogn)

answered Oct 13 '12 at 08:07

phant0m

16,595
5
50
82

Bartosz Przybylski · Answer 4 · 2012-10-13T08:20:21.690

1

You can identify duplicates in arrays in O(n) time. This approach uses hashmap, heres pseudocode:

// a and b are input arrays
HashMap h
for e in a:
    if h[e] == 0:
        h[e]++
for e in b:
    if h[e] != 0:
        h[e]++
for e in h:
    if e > 1:
        print "Duplicate" + e

Hashmap[Element] is syntactic sugar and means:

Get element with key e, if no present, then create it with initializer 0

edited Oct 13 '12 at 08:20

answered Oct 13 '12 at 08:06

Bartosz Przybylski

1,628
10
15

This will not work if a number occurs twice in the same array – heijp06 Oct 13 '12 at 08:10

Algorithm to find all duplicates in two arrays

4 Answers4