9

Given an array which might contain duplicates, How can we find if it is a sequence?

Eg. {7, 3, 5, 4, 6, 2}

is a sequence 2, 3, 4, 5, 6, 7

Sorting is an obvious solution. How can we do this in O(n) time and O(1) space?

Jayram
  • 18,820
  • 6
  • 51
  • 68

3 Answers3

10

assuming 1,2,3,3,4,1 is a valid unsorted sequence and 2,4,6,8 is a valid sequence (of step two) as well, but 1,3,5,9 isn't (7 is missing) and assuming the input array can be overwritten,

  1. determine the maximum and minimum: O(n) time, O(1) space. You can use the first and the last position of the array for this.
  2. determine the step. The step is the least common multiplier of all a_n - min
  3. if they are too far apart (max - min > (count + 1) * step), this cannot be a sequence. Otherwise,
  4. do an in-place integer sort. Until start > end:
    • look at the first position. Let the value there be v_0
    • let its target position when we assume no duplicates ((v_0 - min) / step + start) be i
      • if the target position is less than start, it's a duplicate. Move it to the back and decrement the end pointer
      • if the target position is more than end, some element is missing in the sequence. Claim the array is not a sequence.
    • if the element is at the target position, increment the start pointer and the min reference
    • else if the element at the target position is less than the reference minimum or equal to v_0, swap it to the end of the array and decrement the end pointer. It's a duplicate.
    • else swap the element at the target position with v_0.
  5. Claim the array a sequence

The in-place integer sort is O(n). In each step it either:

  • shortens the input array and keeps all sorted elements at their target positions or
  • sorts one or two previously unsorted elements to their target position.

At the end of sorting, each element is either a duplicate in the duplicate block, or at its correct position in the sorted block.

Note that step #3 can be left out. #4 will correctly determine this is not a sequence, albeit slower.

If the step has to be 1, then this algorithm can be simplified somewhat (see revision #1)

John Dvorak
  • 26,799
  • 13
  • 69
  • 83
1

This algorithm (Python) destroys the original array, but otherwise satisfies O(n) time and O(1) extra space.

# INPUT: An array 'arr' of N integers.
# OUTPUT: If the array consists exactly of the integers
#         S, S+1, ..., S+N-1, for some S, in any order,
#         then modifies 'arr' into a sorted array and returns it.
#         Otherwise, returns False, and 'arr' may have been modified.
def sort_sequence (arr):
    the_min = min(arr)
    the_max = max(arr)
    if the_max - the_min != len(arr) - 1:
        return False
    for i in range(len(arr)):
        arr[i] -= the_min
    for i in range(len(arr)):
        while arr[i] != i:
            j = arr[i]
            t = arr[j]
            if t == j:
                return False
            arr[j] = j
            arr[i] = t
    for i in range(len(arr)):
        arr[i] += the_min
    return arr

I have not formally proven that it works yet.

Why is this O(n)? In the final double loop, after an element is first put into its correct spot, it can only be visited one more time - either at the beginning of another inner loop where it is seen to be in the right spot, or where it is found to be in the way of a duplicate element (the if t == h part).

Ambroz Bizjak
  • 7,809
  • 1
  • 38
  • 49
  • This looks similar to [mine](http://stackoverflow.com/a/18102484/499214) except you are assuming a step of 1 and you're disallowing duplicates. The latter is clearly against the spec. – John Dvorak Aug 07 '13 at 12:25
  • @JanDvorak well, the question is ambiguous, it just says the array might contain duplicates, and not how they are to be interpreted. In any case I stated in the comments exactly which problem my algorithm is supposed to solve. – Ambroz Bizjak Aug 07 '13 at 12:28
  • @JanDvorak similarly the question doesn't say anything about step size. I think if the intent was to allow any step size, it would have been written in terms of arithmetic progression. – Ambroz Bizjak Aug 07 '13 at 12:31
  • 1
    @AmbrozBizjak It's a recollected interview question, there's a lot of room for omission of or incorrect details there, and a small change for a more generic solution is probably better / more likely to land you that job. – Bernhard Barker Aug 07 '13 at 12:41
-1

Let my try:

  1. Determine min and max – O(n)
  2. Make an array of nullable values of the size of the original array – O(n) (Yes, missing space requirement here)
  3. Iterate over the original array (O(n)) and put the number you find at index (number - min), if you find a value there, you don't have a sequence, if you come to an end and no position was claimed, you have a sequence
public bool IsSequence(int[] values)
{
    var min = values[0];
    var max = values[0];
    foreach (var value in values)
    {
        min = min > value ? value : min;
        max = max > value ? max : value;
    }

    if ((max - min + 1) != values.Length)
        return false;

    var testingArray = new int?[values.Length];
    foreach (var value in values)
    {
        var index = value - min;
        if (testingArray[index].HasValue)
            return false;
        else
            testingArray[index] = value;
    }
    return true;
}
oddparity
  • 438
  • 5
  • 14