2

I am trying out the following codility.com exercise to improve my skills online, I was presented with the following problem.

This is a demo task.

Write a function:

    class Solution { public int solution(int[] A); }

that, given an array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A.

For example, given A = [1, 3, 6, 4, 1, 2], the function should return 5.
Given A = [1, 2, 3], the function should return 4.
Given A = [-1, -3], the function should return 1.

Write an efficient algorithm for the following assumptions:
• N is an integer within the range [1..100,000);
• each element of array A is an integer within the range (-1,000,000..1,000,000).

Copyright 2009– by Codility Limited

rendered description

I solved it using the following solution:

<?php

class Solution {
    public function($A) {
        $posInts = [1, 2, 3, 4, 5, 6, 7, 8, 9];
        $diffs = array_diff($postInts, $A);
        $smallestPosInt = min($diffs);
        return $smallestPosInt;
    }
}

However upon submitting I got the following score: enter image description here

Now I am very unsure of what I did wrong here or how I can rewrite the code with a better algorithm.

greybeard
  • 2,249
  • 8
  • 30
  • 66
GolDRoger
  • 109
  • 3
  • 10
  • I might go after the missing "correctness" score first. Plus, you should simply ignore all negative numbers. You will have to touch each number in the array at least once so make the most of that. And why do you think the number will be from 1 to 9? – RufusVS Sep 06 '21 at 22:01
  • You're right about that, didn't think it through so that explains the missing correct score. However now I'm stumped because I can't figure out what algorithm to use to solver this problem. I'm quite new to data structures and algorithms. – GolDRoger Sep 06 '21 at 22:04
  • _"I solved it using the following solution"_ - It's also not valid PHP syntax. – M. Eriksson Sep 06 '21 at 22:06
  • @GolDRoger Could you add the task description as text rather than an image so it becomes searchable, please? – ArSeN Sep 06 '21 at 22:10
  • Thanks just realised, I couldn't copy my solution before submission so typed it from memory using the example but unfortunately made some errors, think I got all of them now, edited my post. – GolDRoger Sep 06 '21 at 22:11
  • Alright will do that. – GolDRoger Sep 06 '21 at 22:20
  • 1
    If you want to avoid space , just sort the input array, traverse and report the first missing one. – SomeDude Sep 06 '21 at 22:21
  • Ask yourself what a solution would look like that went through the array once and only once. Functions like in_array, min, max, array_diff are all looping functions. – ryantxr Sep 06 '21 at 22:38
  • Looks a duplicate of [Find the smallest positive integer that does not occur in a given sequence](https://stackoverflow.com/q/51719848/3789665). – greybeard Sep 23 '22 at 05:52

3 Answers3

2

Check out this answer using Javascript in a way that works with the best possible performance -If I am not mistaken- O(N).

function solution(A) {
  const set = new Set(A)
  let i = 1

  while (set.has(i)) {
    i++
  }
  return i
}
Moath Thawahreh
  • 2,519
  • 2
  • 14
  • 19
1

I would just loop over (increment) any possible integers:

function solution($A) {
    $result = 1;
    $maxNumber = max($A);
    for (; $result <= $maxNumber; $result++) {
        if (!in_array($result, $A)) {
            break;
        }
    }
    return $result;
}

var_dump(solution([1, 3, 6, 4, 1, 2])); // int(5)
var_dump(solution([1, 2, 3])); // int(4)
var_dump(solution([-1, -3])); // int(1)

// As a bonus, this also works for larger numbers:
var_dump(solution([1, 3, 6, 4, 1, 2, 7, 8, 9, 10, 11, 12, 13, 5, 15])); // int(14)

Edit regarding performance:

As pointed out in the comments (and you already said yourself), this is not a very efficient solution.

While I do not have enough time on my hands currently to do real performance testing, I think this should be close to an O(n) solution: (keeping in mind that I am not sure how arrays are implemented on the C-side of PHP)

function solution($A) {
    $result = 1;
    $maxNumber = max($A);
    $values = array_flip($A);
    for (; $result <= $maxNumber; $result++) {
        if (!isset($values[$result])) {
            break;
        }
    }
    return $result;
}
// Not posting the output again because it is naturally the same ;) 

The "trick" here is to flip the array first so that the values become the indexes. Since a) we do not care about the original indexes and b) we do not care if duplicated values overwrite each other, we can safely do that.

Using isset() instead of in_array() should be a lot quicker since it basically just checks if a variable (in this case stored at a specific index of the array) exists and PHP does therefore not have to iterate through the array in order to check whether or not each number we loop over exists within it.

P.S.: After thinking twice I think this may still be closer to O(n*2) because max() probably loops to find the highest value. You could also remove that line and just check against the highest number there is in PHP as an emergency exit, like so: for (; $result <= PHP_INT_MAX; $result++) { ... } as a further optimization. Or maybe just hard-code the highest allowed number as specified in the task.

ArSeN
  • 5,133
  • 3
  • 19
  • 26
  • Thank you soo much for your time, if I may ask, I have been led to believe that I need to avoid loops as much as possible seeing as they are "expensive" in the sense that they have a huge negative impact on performance. Is this accurate? – GolDRoger Sep 06 '21 at 22:12
  • @GolDRoger Yes that's right. I was going for correctness first ;) It depends on the use cases though. Do you really want to pass arrays with millions of integers or will it only handle a small-ish amount of numbers at once? Looping itself is not *that* expensive but it rather depends on what you do inside of the loop, and yeah, `in_array()` is probably not the most cost-effective way to solve this. – ArSeN Sep 06 '21 at 22:15
  • So the performance bit is to test my understanding of data structures and algorithms and how well I can write highly performative code. If possible can you please direct me as to how to rewrite the code in favour of performance? – GolDRoger Sep 06 '21 at 22:20
  • If I were doing in python, I would start with 1 as the answer then start through the array, comparing to my current answer. BUT... as I check numbers in the array, I would also add them to a set, so when I bump the current answer, I can check against the already-passed set to make sure I haven't seen that number already. In fact, in python I'd just put the whole array in a set, then start counting from 1 until I found a number not in the set. PHP may be different though. (In C I'd do a bitmap array) – RufusVS Sep 06 '21 at 23:04
  • This is O(n^2), it can be easily done in O(nlogn) by sorting the array first. – maraca Sep 07 '21 at 00:14
  • @maraca we can also do it in O(n). – גלעד ברקן Sep 07 '21 at 12:03
  • @גלעדברקן Yes, I just wanted to point out that is possible without much effort to be better than O(n^2). It's almost finding the mex, but not quite. – maraca Sep 07 '21 at 17:30
  • @GolDRoger Edited with (I think) better solution ;) Does this help? @ maraca / גלעד ברקן would you agree on the "close to O(n)" part? Could also be O(n * 2) because of the `max()` call. – ArSeN Sep 07 '21 at 21:14
  • Thank you soo much for your help, have really learned a lot from this. :D – GolDRoger Sep 09 '21 at 10:37
1

If we're allowed to modify the input, perform this in place, otherwise create a new array of size n + 1:

For each element encountered in the original array, if it is greater than n + 1 or smaller than 1, assign 0 at the element's index (index - 1 if performing in place); otherwise assign 1 at the index of the array the value is and assign 0 at its own index if it is different. After that run a second traversal and report the first index (index + 1 if performing in place) greater than zero with value 0, or n + 1.

[1, 3, 6, 4, 1, 2]

=>

[1, 1, 1, 1, 0, 1]

report 5
גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
  • (I guess you need to spell out *encountered* - iterating the array doesn't promise to work.) – greybeard Sep 23 '22 at 05:57
  • @greybeard could you please explain what you mean? Each element of an array can be understood as having a value and an index. The answer doesn't need to specify the method of "encountering" these. – גלעד ברקן Sep 24 '22 at 01:36
  • I start at assumed 1st index 1, assign 1 to element 2. Next, I look at index 2, assign 1 to element 3 (? `at the index of the array the value is`) and 0 to element 3. Is the former value of 6 *encountered*? If I save it and use it as the next element/index to inspect, how do I continue to have all of them processed before the *second traversal*? – greybeard Sep 24 '22 at 07:58