If you iterate over the arrays and build a graph, where each unique integer you encounter in A or B becomes a vertex, and each pair of integers A[i] and B[i] creates an edge between two vertices, then this graph will have at most 2×N vertices, so the space taken up by this graph is linear to N.
Then, for every edge in the graph, we will decide its direction, i.e. which of the two integers it connects will be used in array C. Finally, we will iterate over the arrays again, and for each pair of integers, we look at the corresponding edge in the graph to know which of the integers to use.
I believe the operations needed to decide the directions in the graph can all be done in linear time to N (correct me if I'm wrong), so the whole algorithm should have O(N) time and space complexity.
The rules for deciding the edge directions in the graph are as follows:
- Unconnected parts of the graph are each treated seperately.
- If a (sub-) graph has fewer edges than vertices (i.e. it has a tree-structure with no cycles), then the highest integer has to be sacrificed: point the edges away from that vertex, and propagate that direction over the rest of the vertices.
- If a (sub-) graph has more edges than vertices (i.e. it contains at least one cycle, which could be two vertices connected by multiple edges), then find any cycle, and give its edges the same direction (either clockwise or anti-clockwise); then propagate the direction to the other vertices.
While iterating over the arrays and looking up the corresponding edges in the graph, mark which integer you've already used when two vertices are multi-connected, so that the second time you use the other integer. After that, you can pick either one.
Example:
A = [1,6,2,9,7,2,5,3,4,10]
B = [5,8,3,2,9,7,1,2,8,3]
Graph:
1===5 6---8---4 9---2===3---10
| |
--7--
We find the cycles [1>5>1] and [9>2>7>9], and the highest integer 8 in the non-cyclic sub-graph.
Directed graph:
1<=>5 6<--8-->4 9-->2<=>3-->10
^ |
|-7<-
Result:
C = [1,6,2,2,9,7,5,3,4,10]
The first missing integer is 8, because we had to sacrifice it to save the 4 and the 6 in the [6,8,4] subgraph.