0

There are 1002 numbers in an array and two numbers are the same. How would you find the same number in this array efficiently or is there an efficient algorithm?

Here is my algorithm:

for i in range(0, 1002):
    for j in range(i+1, 1002):
        if(a[i]==a[j]):
           return a[i]
mc110
  • 2,825
  • 5
  • 20
  • 21
Figen Güngör
  • 12,169
  • 14
  • 66
  • 108

4 Answers4

2

This should work!

#include<stdio.h>
#define RANGE 1000000001
int main()
{
  int arr[1002];//your all numbers;
  short int hash[RANGE];//Your range of numbers 
  long long int i;
  for(i = 0; i < RANGE; i++)
    hash[i] = 0;
  for(i = 0; i < 1002; i++)
    {
      if(hash[arr[i]] != 0)
    {
      printf("Duplicate number is:%d\n",arr[i]);
      break;
    }
      else
    hash[arr[i]]++;
    }
  return 0;
}
Nullpointer
  • 1,086
  • 7
  • 20
  • 1
    I wouldn't call the identity function a hash function, so the name is a bit misleading :) And you should really use `RANGE = 1003` – Niklas B. Jun 15 '14 at 19:30
  • Note that this solution requires at least 1000000001 operations for zeroing out `hash`, which is about 1000 times more than the original `1000*10001`, and cannot cope with numbers greater than 1000000001. – n. m. could be an AI Jun 15 '14 at 19:37
  • @NiklasB. If I use RANGE = 1003 then I will have to limit range of my values to 1003 but OP says no such value range is defined this is why I have used RANGE = 1000000001. And yes hash function is a bit misleading! :p – Nullpointer Jun 16 '14 at 02:17
  • Well in that case you *must* use a hash table ;) – Niklas B. Jun 16 '14 at 06:13
1

I think the most efficient solution is to use hash set:

from sets import Set
s=Set()
for x in [1,2,3,4,5,2,3,1]:
  if x in s:
    print x
    break
  s.add(x)
Aleksei Shestakov
  • 2,508
  • 2
  • 13
  • 14
0

If your values are numbers, you can use radix sort to fill up a buffer and check for an element that appeared twice.

Rliger
  • 79
  • 4
0

Your algortihm isn't bad at all ! In the worst case you loop n*(n-1)/2, meaning a complexity of O(n²).

The most favourable condition would be a sorted array. THen you could just loop through it comparing each element with its predecessor. The worst is n-1 comparisons, otherwhise said a complexity of O(n).

However, I assume that the array is not sorted. Sorting it would imply the cost of the sort. Quiksort algorithm, which is pretty good here, has a worstcase of O(n²). So sorting+traversing would have a cost comparable to your algorithm.

Using a hash... well, it's optimal if memory is not a problem (see exellent solution from @Nullpointer. The algorithm cost is the simple traversal, which is O(n).

However in real life, you risk to have memory constraints, meaning shorter hash table and a hash function with risks of colisions (for example modulo size of table). For this reason you'll need to store for each hash value, the list of matching values. In such a situation, the worstcase is when all numbers have the same hash H. In this case, you would calculate each hash (simple O(n) traversal), but when inserting the hash, you'd need to loop through the colision list. A quick calculation shows that again you'd have n*(n-1)/2 comparison, and again a compelxity O(n²), the same as your original proposal.

Christophe
  • 68,716
  • 7
  • 72
  • 138