1

Possible Duplicate:
Find the Smallest Integer Not in a List

you have a stream of unsorted positive integers. After you read all the numbers from the stream you have to determine the smallest positive integer that is missing from the stream.

example: Stream of positive integers: 6 7 8 9 1 2

ans : 3

Stream of positive integers: 1 2 3 4 5

and : 6

Stream of positive integers: 12 87 899

ans : 1

I wanted to solve the problem without taking any extra data structure. Is it possible?

I am stuck on this problem. Did all the research i could on the internet, but, no luck. Could anyone help.

Community
  • 1
  • 1
xtrmcode
  • 61
  • 5
  • How are you reading the numbers from the stream? – Jordan Kaye Dec 20 '12 at 13:46
  • @Jordan Kaye: does it matter? it's a stream.. – Karoly Horvath Dec 20 '12 at 13:48
  • @KarolyHorvath Well, there's the possibility that the numbers are only available after the entire stream has been read, or that they are available individually; this would impact possible solutions. – Jordan Kaye Dec 20 '12 at 13:49
  • @Jordan Kaye: for me, stream means you cannot do random access. but let's see what the OP says. – Karoly Horvath Dec 20 '12 at 13:50
  • @KarolyHorvath Which was the purpose of my question - just to clarify :) – Jordan Kaye Dec 20 '12 at 13:50
  • @JordanKaye It's a stream that means they are available individually. – xtrmcode Dec 20 '12 at 13:52
  • @Oren - Not really a duplicate if truly needing to be done on a stream basis. Without storing and sorting, this is an impossible problem to solve, as there may be arbitrarily multiple states (areas / gaps on the number line) able to be jumped to from a current state. As #'s come in and there is separation between it and all other numbers, it creates another state. If a # comes in that fills in a state (gap in the # line) yes you will move to exactly the next gap, but where is that if you havent' been keeping track of all potential gaps along the way. – trumpetlicks Dec 20 '12 at 14:06
  • @trumpetlicks: Since storing is required as you say, this *is* a duplicate. – interjay Dec 20 '12 at 14:12
  • @interjay - LOL, suppose you're right :-) Unless of course I am incorrect and there is a way to do it on a stream basis. – trumpetlicks Dec 20 '12 at 14:13

2 Answers2

1

You can sort the array as you read it using insertion sort (seeing as how the data comes from a stream, this should be efficient) and then iterate through it. If 1 is missing, that's your answer. If it's there, you can iteratively check if the next integer is the next number in the array, otherwise the next integer is the missing one.

Luchian Grigore
  • 253,575
  • 64
  • 457
  • 625
  • i dont have an array. Its a stream of numbers that is coming one by one. Now i could create an array while reading and then sort it. But its not that efficient. – xtrmcode Dec 20 '12 at 13:50
  • @xtrmcode au contraire... If you use insertion sort, it should be actually really fast. This is even more efficient considering you have to read the whole stream until you can report the missing number. – Luchian Grigore Dec 20 '12 at 13:52
  • 1
    at the moment I don't see a more efficient solution. perhaps a bitmap, but only if the range of numbers is limited. – Karoly Horvath Dec 20 '12 at 13:52
  • @KarolyHorvath yes i thought about insertion sort. But the array could be longer, if say 1 to 1-million is present in the stream, in that case 1-million-1 would be the answer. but yes i agree sometimes i might don't have to read the entire stream. – xtrmcode Dec 20 '12 at 13:56
  • 1
    "but yes i agree sometimes i might don't have to read the entire stream." - nobody said that. you *have to* read the whole stream. always. – Karoly Horvath Dec 20 '12 at 13:59
  • 1
    @xtrmcode I agree, you always have to read the whole stream, unless you have constraints you didn't tell us about. – Luchian Grigore Dec 20 '12 at 14:00
  • @karloy @ luchian no sorry I am wrong. I was thinking wrong. yes for any case you have to read the entire stream. – xtrmcode Dec 20 '12 at 14:03
0

There exists a method in case following two constraints are satisfied. This algorithm would scan the stream twice and need 256KB memory.

  1. Each element is less than or equal to 0xFFFFFFFF.
  2. All elements in the stream are distinct.

Following shows this algorithm.

  1. Allocate an array: unsigned int bucket[2^16].
  2. Scan this streaming data. If current data is a, then bucket[a>>16]++;
  3. After scanning this stream, find the first bucket whose value is less than 2^16.The first missing unsigned integer is in this bucket. Let the index of this bucket to be k, and the first remaining unsigned integer to be n. Then k*(2^16) <= n < (k+1)*(2^16).
  4. Reset value of all buckets to be zero.
  5. Scan this streaming data again. If current data is a and k*(2^16) <= a < (k+1)*(2^16), then bucket[a&0xFFFF]++.
  6. After scanning this stream again, find the first bucket whose value is zero. Let the index of this bucket to be h.
  7. This answer n = k*(2^16) + h.

    Note: The bucket[0] in the third step should be at most 2^16 - 1 rather than 2^16 because the set of unsigned integers doesn't include 0.

      ˊ