1

I am trying to solve a challenge, I wrote my solution and it passes all test cases except some hidden test cases. I can't think another case in which my method fails and don't know what to do anymore. Here it is:

int firstDuplicate(int[] a) {

int[]   indexCount;
int     duplicate, temp;
boolean check;

duplicate  = -1; temp = a.length;
indexCount = new int[a.length];
check      = false;

for( int i = 0; i < a.length; i++ ){

    if( indexCount[a[i]-1] == 0 ){
        indexCount[a[i]-1] = i+1;
        check = false;
    }else{
        indexCount[a[i]-1] = (i+1) - indexCount[a[i]-1];
        check = true;
    }
    if( check && indexCount[a[i]-1] < temp ){
        duplicate = a[i];
        temp      = indexCount[a[i]-1];
    }
}
return duplicate;

}

Instructions are:

Write a solution with O(n) time complexity and O(1) additional space complexity. Given an array a that contains only numbers in the range from 1 to a.length, find the first duplicate number for which the second occurrence has the minimal index.

Example

For a = [2, 3, 3, 1, 5, 2], the output should be firstDuplicate(a) = 3.

There are 2 duplicates: numbers 2 and 3. The second occurrence of 3 has a smaller index than than second occurrence of 2 does, so the answer is 3.

For a = [2, 4, 3, 5, 1], the output should be firstDuplicate(a) = -1.

  • 3
    Right off the bat I'd presume that the O(1) space complexity has been failed. You're essentially using another array here to store information about where the duplicate is, which results in an additional O(n) storage. – Makoto Mar 25 '18 at 00:17
  • What's the minus part of `indexCount[a[i]-1] = (i+1) - indexCount[a[i]-1];`? That's the index of the first occurrence? It looks like you're looking for the duplicate that's closest to its original value, not the one that appears first? – Rup Mar 25 '18 at 00:26
  • Important question: can you modify the original array? – Mad Physicist Mar 25 '18 at 03:31
  • Possible duplicate of [CodeFight firstDuplicate interview challenge](https://stackoverflow.com/questions/45647307/codefight-firstduplicate-interview-challenge) – Mad Physicist Mar 25 '18 at 22:17

2 Answers2

3

Here is what I have. Runs in O(n) and uses O(1) space. Correct me if I'm wrong here.

Since my input cannot have a value that's more than the length, I can use mod operator for indexing on the same array and add the length to the value in index. As soon as I encounter a value that larger than the length, that means I've already incremented that before, which gives me the duplicate value.

public int firstDuplicate(int[] arr) {
    int length = arr.length;

    for (int i = 0; i < length; i++) {
        int expectedIndex = arr[i] % length;
        if (arr[expectedIndex] > length) {
            return arr[i] > length ? arr[i] - length : arr[i];
        } else {
            arr[expectedIndex] += length;
        }
    }

    return -1;
}
Mehmet
  • 131
  • 4
  • Some prose would be nice. Please describe in words how your solution works. – Mad Physicist Mar 25 '18 at 03:13
  • 1
    Clever solution tho, using the fact that you never get an element > length in the input – Mad Physicist Mar 25 '18 at 03:21
  • True, that's what was provided in the instructions above. Definitely can be updated with an initial linear run to validate the input. – Mehmet Mar 25 '18 at 03:25
  • There are three mistakes though. #1 You appear to be returning the first duplicate you find, not the one with the smallest second index. #2 In your else, you don't check if you have a duplicate already at that index. #3 Indices in the arrays are one-based, but zero based in Java. – Mad Physicist Mar 25 '18 at 03:26
  • You can fix #1, by starting from the end and keeping track of your last duplicate until the loop completes instead of returning immediately. Remember to skip already incremented elements. – Mad Physicist Mar 25 '18 at 03:28
  • #2 and #3 are trivial to fix. For an extra bonus, you can do another pass and reinstate the original values everywhere and still stay O(n) time-wise. – Mad Physicist Mar 25 '18 at 03:30
  • To illustrate #2, input `[2, 2]` (and pretend #3 has been fixed, but not #1). It should immediately return the first index, but it will return -1 instead. Needs `if(... || arr[expectedIndex] == expectedIndex + 1)`. Sorry for rambling. I haven't run any of this code, so this is just helping me think. – Mad Physicist Mar 25 '18 at 03:42
  • @MadPhysicist apologies if I'm missing something, but #1: if we're scanning the array from low to high index, won't the first duplicate we find automatically be the one with the lowest second index? (I think I like your approach of negating an element better than += length though this might hit problems if length > half max int, but that'll never happen with 32-bit ints) – Rup Mar 25 '18 at 10:26
  • @Rup. OP has a perfect counter-example in the question. That's why you need two flags: one to mark the first occurrence and one to mark the second. – Mad Physicist Mar 25 '18 at 15:54
  • @MadPhysicist You mean [2, 3, 3, 1, 5, 2] answer=3? Scanning left to right: 2 - not seen before, += length. 3 - not seen before, += length. 3 - have seen before, so this is must be the second index of first duplicate, so we can stop. – Rup Mar 25 '18 at 16:26
  • @Rup. I think you're confusing the point of the array index vs the value. – Mad Physicist Mar 25 '18 at 18:54
  • @MadPhysicist OK, that was poorly phrased, but I'm not. The second time we encounter a 3 is the first duplicate, and because it's the first duplicate we've found the index of this second occurrence of a 3 will be the duplicate with the lowest index. FWIW here's another answer to the same question that agrees with me: https://stackoverflow.com/a/45650778/243245 I tried to sign up to CodeFights to try the challenge there too but looks like I have to qualify for it first. – Rup Mar 25 '18 at 19:42
  • @Rup. If you post a snippet on IDEOne, we can test it out. – Mad Physicist Mar 25 '18 at 22:07
  • @Rup. I'm beginning to see the point now. I think you're right. No second pass needed. Only thing that's missing here is an adjustment of the index. – Mad Physicist Mar 25 '18 at 23:55
1

This answer is based on @Mehmet-Y's answer and all credit goes to Mehmet-Y. This version addresses the three issues I pointed out in the comments. I will delete this answer if the original gets corrected.

The general approach is to use the original array for storage instead of allocating a new one. The fact that no value may be less than one or greater than the length suggests that you can use the array as a set of indices to flag an element as "already seen" by either negating it or adding/subtracting the array length to/from it.

To achieve O(n) time complexity, you have to solve the problem in a fixed number of passes (not necessarily one pass: the number just can't depend on the size of the array).

But how do you decide which duplicate has the smallest second index? I would suggest using two different flags to indicate an index that is already seen vs. the second item in a duplicate pair. For this example, we can set the index flag by incrementing the elements by the length, and marking duplicates by negating them. You will need a second pass to find the first negagive in the array. You can also use that pass to restore the elements to their original values without sacrificing O(n) time complexity.

Here is a sample implementation:

int firstDuplicate(int[] a)
{
    // assume all elements of a are in range [1, a.length]
    // An assertion of that would not increase the time complexity from O(n)
    int len = a.length;
    for(int i = 0; i < len; i++) {
        // a[i] may be > len, but not negative.
        // Index of bin to check if this element is already seen.
        flagIndex = (a[i] - 1) % len;
        if(a[flagIndex] > len) {
            // If already seen, current element is the second of the pair.
            // It doesn't matter if we flag the third duplicate,
            // just as long as we don't tag the first be accident.
            a[i] = -a[i];
        } else {
            // Flag the element as "already seen".
            // This can be done outside the else, but you might run
            // into (more) overflow problems with large arrays.
            a[flagIndex] += len;
        }
    }
    // Search and stash index of first negative number
    for(int i = 0; i < len; i++) {
        if(a[i] < 0) {
            return -a[i] % len;
        }
    }
    // Nothing found, oh well
    return -1;
}

If you want to take advantage of the second pass to restore the original values of the array, replace

for(int i = 0; i < len; i++) {
    if(a[i] < 0) {
        return -a[i] % len;
    }
}
return -1;

with

int duplicate = -1;
for(int i = 0; i < len; i++) {
    if(a[i] < 0) {
        a[i] = -a[i];
        if(duplicate == -1) {
            duplicate = a[i] % len;
        }
    }
    a[i] %= len;
}
return duplicate;
Mad Physicist
  • 107,652
  • 25
  • 181
  • 264