3

Classic problem

Given an unsorted array A of integers, find the smallest positive integer which doesn't exist in A.

[3, 2, 1, 6, 5] -> 4
[2, 3, 4, 5] -> 1
[1, 2, 3, 4] -> 5

In this classic problem you are given just one array, and you can find a lot of solutions and resources about this problem, including stackoverflow, Code Review@SE and geeksforgeeks.org.

Hard edition

In my problem you are given two arrays, A and B, both of length N. Construct an array C of length N, whose smallest missing integer is maximum possible. C[i] must be either A[i] or B[i].

A = [1, 2, 4, 3],
B = [1, 3, 2, 3] =->
C = [1, 2, 4, 3] answer = 5

A = [4, 2, 1, 6, 5],
B = [3, 2, 1, 7, 7] =->
C = [3, 2, 1, 6, 5] answer = 4

A = [2, 3],
B = [4, 5] =->
C = [2, 5] answer = 1

How can we solve this problem with O(N) worst-case time and space complexity?

Assume that:

1<= N <= 100,000
1<= A[i], B[i] <= 100,000,000
greybeard
  • 2,249
  • 8
  • 30
  • 66
Eziz Durdyyev
  • 1,110
  • 2
  • 16
  • 34

4 Answers4

2

This algorithm works in O(n) arithmetic operations on numbers in the number range [0..n+2], plus the time for processing input.


  • Replace all numbers not in range [1..n] with n+2. This will not change the result.
  • Construct a graph with the nodes labeled 1, 2, 3, ..., n-1, n, n+2, and for all 0 <= i < n, there exists an edge between node A[i] and B[i] (assume 0-indexing array). So the graph has n+1 nodes and n edges.

Now the problem is equivalent to finding a way to direct the edges such that the minimum vertex with in-degree 0 is maximum possible.


  • Find the connected components in the graph.
  • For each connected components with v vertices and e edges, e >= v-1:

    • If e == v-1, it's a tree. All ways to direct the edges will result in a vertex having in-degree 0, and it can be proven that for all vertex in the tree, there exists a way to direct the edges such that it's the only vertex having in-degree 0.

      Method:

      Root the tree at that vertex, and then make every edges direct from the parent to the child.

      Of course, it's optimal to (greedily) choose the vertex having in-degree 0 to be the vertex with highest number.

    • If e >= v, there exists a circuit inside the connected component. Then, it's possible to direct the edges such that all vertices have nonzero in-degree.

      The proof (and method to construct the edge direction) is left to the reader.

user202729
  • 3,358
  • 3
  • 25
  • 36
1

If you iterate over the arrays and build a graph, where each unique integer you encounter in A or B becomes a vertex, and each pair of integers A[i] and B[i] creates an edge between two vertices, then this graph will have at most 2×N vertices, so the space taken up by this graph is linear to N.

Then, for every edge in the graph, we will decide its direction, i.e. which of the two integers it connects will be used in array C. Finally, we will iterate over the arrays again, and for each pair of integers, we look at the corresponding edge in the graph to know which of the integers to use.

I believe the operations needed to decide the directions in the graph can all be done in linear time to N (correct me if I'm wrong), so the whole algorithm should have O(N) time and space complexity.

The rules for deciding the edge directions in the graph are as follows:

  • Unconnected parts of the graph are each treated seperately.
  • If a (sub-) graph has fewer edges than vertices (i.e. it has a tree-structure with no cycles), then the highest integer has to be sacrificed: point the edges away from that vertex, and propagate that direction over the rest of the vertices.
  • If a (sub-) graph has more edges than vertices (i.e. it contains at least one cycle, which could be two vertices connected by multiple edges), then find any cycle, and give its edges the same direction (either clockwise or anti-clockwise); then propagate the direction to the other vertices.

While iterating over the arrays and looking up the corresponding edges in the graph, mark which integer you've already used when two vertices are multi-connected, so that the second time you use the other integer. After that, you can pick either one.

Example:

A = [1,6,2,9,7,2,5,3,4,10]
B = [5,8,3,2,9,7,1,2,8,3]

Graph:

1===5   6---8---4   9---2===3---10
                    |   |
                    --7--

We find the cycles [1>5>1] and [9>2>7>9], and the highest integer 8 in the non-cyclic sub-graph.

Directed graph:

1<=>5   6<--8-->4   9-->2<=>3-->10
                    ^   |
                    |-7<-

Result:

C = [1,6,2,2,9,7,5,3,4,10]

The first missing integer is 8, because we had to sacrifice it to save the 4 and the 6 in the [6,8,4] subgraph.

0

Please check this code snippet, I have written in Java 8

import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;


public class Run {
    private int solution(int[] A) {

        List<Integer> positiveIntegerList = Arrays.stream(A).distinct().filter(e -> e > 0).boxed().sorted(Comparator.naturalOrder()).collect(Collectors.toList());
        final int i = 1;

        if (!positiveIntegerList.isEmpty()) {
            final int missingMax = positiveIntegerList.get(positiveIntegerList.size() - i) + i;
            List<Integer> range = IntStream.rangeClosed(i, positiveIntegerList.get(positiveIntegerList.size() - i))
                    .boxed().collect(Collectors.toList());

            AtomicInteger index = new AtomicInteger();
            return range.stream().filter(e -> !e.equals(positiveIntegerList.get(index.getAndIncrement()))).findFirst().
                    orElse(missingMax);
        } else {
            return i;
        }
    }


    public static void main(String[] args) {
        Run run = new Run();
        int[] intArray = new int[]{1, 3, 6, 4, 1, 2};
        System.out.println(run.solution(intArray));
    }
}
Ajinkyad
  • 37
  • 5
0

I have tried my way of solving this problem in Python3. Suggestions and feedback are welcome. Thanks.

  1. Remove negative values from the list
  2. Insert 0 to the list
  3. Sort and remove duplicates from the list using sorted and set function
  4. Create index, value using enumerator.
  5. Compare index and value starting from (0, 0). If index and value doesn't match, then the next value is the smallest positive number.

--

from typing import List

class Solution:
    def firstMissingPositive(self, nums: List[int]) -> int:
        #Remove all the negative values from the list
        positive_list = [i for i in nums if i > 0]

        #Appending Zero to list. In case of no postive values in the list.
        positive_list.append(0)

        #Sort and remove duplicates from the list using Sorted and set function
        sorted_postive_list = sorted(set(positive_list))

        #starting with (index, value) --> (0,0), Compare index and value, and if they don't match then next value should be the smallest positive number
        return next((a for a, b in enumerate(sorted_postive_list, sorted_postive_list[0]) if a != b), sorted_postive_list[-1]+1)

if __name__ == "__main__":
    num = [1,2,3,0]
    a = Solution()
    print(a.firstMissingPositive(num))
Brendan Abel
  • 35,343
  • 14
  • 88
  • 118
  • This does not solve any problem about *two* arrays. For the related classic problem, the solution looks slow and [undocumented](https://www.python.org/dev/peps/pep-0257/#what-is-a-docstring) but straight. – greybeard Sep 23 '22 at 04:47