11

Today, an interviewer asked me this question. My immediate response was that we could simply do a linear search, comparing the current element with the previous element in the array. He then asked me how the problem could be solved in less-than-linear time.

Assumptions

  • The array is sorted
  • There is only one duplicate
  • The array is only populated with numbers [0, n], where n is the length of the array.

Example array: [0,1,2,3,4,5,6,7,8,8,9]

I attempted to come up with a divide-and-conquer algorithm to solve this, but I'm not confident that it was the right answer. Does anyone have any ideas?

GRardB
  • 724
  • 1
  • 6
  • 14
  • 1
    your example includes only numbers `[0,n-2]` (there are no `10`, `11`) Is it just an example or a general rule? – jfs Mar 17 '12 at 18:33

7 Answers7

25

Can be done in O(log N) with a modified binary search:

Start in the middle of the array: If array[idx] < idx the duplicate is to the left, otherwise to the right. Rinse and repeat.

BrokenGlass
  • 158,293
  • 28
  • 286
  • 335
7

If no number is missing from the array, as in the example, it's doable in O(log n) with a binary search. If a[i] < i, the duplicate is before i, otherwise it's after i.

If there is one number absent and one duplicate, we still know that if a[i] < i the duplicate must be before i and if a[i] > i, the absent number must be before i and the duplicate after. However, if a[i] == i, we don't know if missing number and duplicate are both before i or both after i. I don't see a way for a sublinear algorithm in that case.

Daniel Fischer
  • 181,706
  • 17
  • 308
  • 431
  • I'm very late, but if you allow missing numbers, it's indeed impossible (assuming you can't read an arbitrarily large number of cells in O(1)). Suppose we consider entries of size n+1 (n>=2) and we limit ourselves to this subset of entries : {[0,0,2,…,n], [0,1,1,3,…,n], ..., [0,1,…,k,k,k+2,…,n], ..., [0,1,…,n-1,n-1]}. Let's assume you already know the contents of any of up to (n-2) cells and that they were pairwise distinct, there's still at least 2 possibilities and you can't discriminate any. Therefore you need to read at least (n-1) cells to decide which number is the duplicate. – Caninonos Jan 16 '18 at 15:39
6

I attempted to come up with a divide-and-conquer algorithm to solve this, but I'm not confident that it was the right answer.

Sure, you could do a binary search.

If arr[i/2] >= i/2 then the duplicate is located in the upper half of the array, otherwise it is located in the lower half.

while (lower != upper)
    mid = (lower + upper) / 2
    if (arr[mid] >= mid)
        lower = mid
    else
        upper = mid-1

Since the array between lower and upper is halved in each iteration, the algorithm runs in O(log n).

ideone.com demo in Java

aioobe
  • 413,195
  • 112
  • 811
  • 826
1

Difference between sum of given array elements and sum of 0 to n-1 natural numbers gives you the duplicated element. Sum of 0 to n-1 elements is (N * N-1)/2 example array is [0,1,2,3,4,5,6,7,8,8,9] sum of 0 to 9 natural numbers is : 45 sum of given array elements : 53 53-45 = 8 Which is the duplicated element

chenna
  • 159
  • 1
  • 4
  • 14
1
#include <bits/stdc++.h>
using namespace std;

int find_only_repeating_element(int arr[] , int n){
int low = 0;
int high = n-1;
while(low <= high){
    int mid = low + (high - low)/2;
    if(arr[mid] == arr[mid + 1] || arr[mid] == arr[mid - 1]){
        return arr[mid];
    }
    if(arr[mid] < mid + 1){
        high = mid - 2;
    }else{
        low = mid + 1;
    }
   }
   return -1;
}

int main(int argc, char const *argv[])
{
int n , *arr;
cin >> n;
arr = new int[n];
for(int i = 0 ; i < n ; i++){
    cin >> arr[i];
}
    cout << find_only_repeating_element(arr , n) << endl;
    return 0;
}
  • 1
    Welcome to SO. Please consider adding some explanation to your code when answering a question. – f-CJ Oct 09 '18 at 17:02
0

How about that? (recursion style)

public static int DuplicateBinaryFind(int[] arr, int left, int right)
{
   int dup =0;

   if(left==right)
   {
      dup = left;
   }
   else
   {
        int middle = (left+right)\2;
        if(arr[middle]<middle)
        {
          dup = DuplicateBinaryFind(arr,left, middle-1);

        }
        else
        {
           dup = DuplicateBinaryFind(arr, middle+1, right);
        }
   }

   return dup;

}
JavaSa
  • 5,813
  • 16
  • 71
  • 121
-1

The example array is a little bit different from your question. Since n is the length of array and there are one and only duplicate in array, the value of each element in array should be in [0,n-1].

If that is true, then this question is the same one with How to find a duplicate element in an array of shuffled consecutive integers?

The following code should find the duplicate in O(n) time and O(1) space.

public static int findOnlyDuplicateFromArray(int[] a, boolean startWithZero){
    int xor = 0;
    int offset = 1;
    for(int i=0; i < a.length; i++){
        if(startWithZero)
            xor = xor ^ (a[i] + offset) ^ i;
        else
            xor = xor ^ a[i] ^ i;
        }
        if(startWithZero)
            xor = xor - offset;
    return xor;
}
Community
  • 1
  • 1