-2

Can anyone could help me how to solve this code in C? I think that I have to use big O notation as a solution, but I have no idea about it.

The question: There is an array T sized N+1 where numbers from 1 to N are random. One number x is repeated twice (position is also random).

What should be the fastest way to find value of this number x?

For example: N = 7

[6 3 5 1 3 7 4 2]

x=3

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
Zero_One
  • 31
  • 2
  • 3

3 Answers3

4

The sum of numbers 1..N is N*(N+1)/2.

So, the extra number is:

extra_number = sum(all N+1 numbers) - N*(N+1)/2

Everything is O(1) except the sum. The sum can be computed in O(N) time.

The overall algorithm is O(N).

Klas Lindbäck
  • 33,105
  • 5
  • 57
  • 82
  • 1
    This is a standard interview question. For extra credits you need to consider what happens when the sum overflows the integer type. – Klas Lindbäck Sep 16 '16 at 14:03
  • 2
    For fun, a 64-bit unsigned integer will work for N up to 6074000998 as long as you take steps to avoid overflow in the calculation of N*(N+1)/2 (by changing the calculation steps depending on whether N is odd or even). – Ian Abbott Sep 16 '16 at 15:24
0

The algorithm given by Klaus has O(1) memory requirements, but requires to sum all the elements from the given array, which may be quite large to iterate (sum) all over them.

Another approach is to iterate over array and increment the occurence counter once per iteration, so the algorithm can be stopped instantly once it finds the duplicate, though the worst case scenario is to scan through all the elements. For example:

#define N 8

int T[N] = {6, 3, 5, 1, 3, 7, 4, 2};

int occurences[N+1] = {0};
int duplicate = -1;
for (int i = 0; i < N; i++) {
    occurences[T[i]]++;
    if (occurences[T[i]] == 2) {
        duplicate = T[i];
        break;
    }
}

Note that this method is also immune to integer overflow, that is N*(N+1)/2. might be larger than integer data type can possibly hold.

Grzegorz Szpetkowski
  • 36,988
  • 6
  • 90
  • 137
  • Note: For an algorithm only the complexity is relevant. If performance is critical, you have to measure (Here, visiting half of the elements comes with costs: condition, extra data access) –  Sep 16 '16 at 16:31
  • @DieterLücking: True, in general the algorithm is O(n). One pleasant property is that it is immune to integer limits: `N*(N+1)/2` might overflow in some cases. – Grzegorz Szpetkowski Sep 16 '16 at 18:05
0

Walk the array using the value as the next array index (minus 1), marking the ones visited with a special value (like 0 or the negation). O(n)

On average, only half the elements are visited.

v
6 3 5 1 3 7 4 2

            v
. 3 5 1 3 7 4 2

        v   
. 3 5 1 3 7 . 2

      v
. 3 5 1 . 7 . 2

  v
. 3 5 . . 7 . 2

      v !! all ready visited.  Previous 3 is repeated.
. 3 5 . . 7 . 2

No overflow problem caused by adding up the sum. Of course the array needs to be modified (or a sibling bool array of flags is needed.)

This method works even if more than 1 value is repeated.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256