Find First Unique Element

Question

I had this question in interview which I couldn't answer. You have to find first unique element(integer) in the array. For example:

3,2,1,4,4,5,6,6,7,3,2,3

Then unique elements are 1, 5, 7 and first unique of 1.

The Solution required:

O(n) Time Complexity.

O(1) Space Complexity.

I tried saying:

Using Hashmaps, Bitvector...but none of them had space complexity O(1).

Can anyone tell me solution with space O(1)?

if elements appear together then you just need to test if next element equals current, if true you have your result... — Aleksandar Toplek, Feb 23 '13 at 22:02
@Jan Dvorak if you go from 0 to n, then test next, why would you test prev? — Aleksandar Toplek, Feb 23 '13 at 22:03
No....@smk The elements may not appear together....@Jan The elements can be positive/negative/Zero.... — Anup Buchke, Feb 23 '13 at 22:03
@AleksandarToplek if you don't test prev, you might confuse the last of a set as unique — John Dvorak, Feb 23 '13 at 22:05
If neither is true, Im gonna stick my head out and say not possible with O(n) time and O(1) space — smk, Feb 23 '13 at 22:06
@smk Can you bring a proof? Otherwise, I dispute your claim. — John Dvorak, Feb 23 '13 at 22:07
I have edited the question which may answer your questions. 1. The elements do not appear together. 2. The numbers can be positive/negative/zero. — Anup Buchke, Feb 23 '13 at 22:08
I'd agree with @smk that it can not be done with O(n) time and O(1) space. — pochen, Feb 23 '13 at 22:14
What if you can modify the input array? Then up to O(n) space but with weird constraints :) — Andrew Mao, Feb 23 '13 at 22:17
Note that any method that involves sorting won't necessarily work because he's looking for the first unique item. That wouldn't be a problem in his example, but if you swapped positions of `1` and `7`, then sorting the array would incorrectly give you `1` as the first unique item. — Jim Mischel, Feb 23 '13 at 22:17
@JimMischel: Any method that involves sorting won't work because there is no sorting in O(n), period. — Jon, Feb 23 '13 at 22:18
I'm starting to think this is impossible, even if you can shuffle the input — John Dvorak, Feb 23 '13 at 22:24
@Jon: clarification: there is no COMPARISON sorting in O(n). Re: radix sort and counting sort. I hypothesize that if there's a way to do this, it will involve going through the array backward, and modifying it. — Andrew Mao, Feb 23 '13 at 22:24
@AndrewMao: Well, technically both of those are not O(n) either unless you consider the integers in the list bounded. But if you do, *technically* you can allocate an array with at least one element for each possible integer in the list (which makes solving the problem trivial) and that would be O(1) space. So IMO while in practice the integers will be bounded, when answering the question you have to assume that they are not because otherwise it's cheating. — Jon, Feb 23 '13 at 22:30
I think OP probably forgot some of the conditions on the original problem. This would have been solved by now if it were possible. Just thinking about a proof of impossibility now... — Andrew Mao, Feb 23 '13 at 22:31
I'm thinking in terms of the number of comparisons needed to prove some solution is correct. If this is superlinear, there is no way to solve the original problem in linear time — John Dvorak, Feb 23 '13 at 22:32
I believe you always need only `N-1` comparisons to prove a solution is correct — John Dvorak, Feb 23 '13 at 22:34
If I got this right... O(1) means that you occupy constant amount of space? — Aleksandar Toplek, Feb 23 '13 at 22:40
@AleksandarToplek constant amount of space, but the size of the input is not counted — John Dvorak, Feb 23 '13 at 22:41
@Jan Dvorak So you say that the number of elements can be >2^32... that's not good! — Aleksandar Toplek, Feb 23 '13 at 22:45
If the values in the array are unbounded and you can't do something like a bucket sort in O(n), I'll agree with the people who say it's not possible in O(n) time and O(1) space. You'll need some sort to keep track of which values you've already scanned, and using constant space that can only be done by using an order on the elements, so that means sorting. — G. Bach, Feb 23 '13 at 22:50
@JanDvorak You only need N-1 comparisons to prove that a number given as a solution is unique within the array. How do you prove that it is the *first* unique number? — beaker, Feb 24 '13 at 00:13
http://stackoverflow.com/questions/2285533/find-the-first-un-repeated-character-in-a-string — Mikhail, Feb 24 '13 at 00:39

user1952500 · Accepted Answer · 2013-02-24T01:00:39.170

Here's a non-rigorous proof that it isn't possible: It is well known that duplicate detection cannot be better than O(n * log n) when you use O(1) space. Suppose that the current problem is solvable in O(n) time and O(1) memory. If we get the index 'k' of the first non-repeating number as anything other than 0, we know that k-1 is a repeated and hence with one more sweep through the array we can get its duplicate making duplicate detection a O(n) exercise.

Again it is not rigorous and we can get into a worst case analysis where k is always 0. But it helps you think and convince the interviewer that it isn't likely to be possible.

http://en.wikipedia.org/wiki/Element_distinctness_problem says: Elements that occur more than n/k times in a multiset of size n may be found in time O(n log k). Here k = n since we want elements that appear more than once.

+1, nice - could you add a reference for the impossibility of `O(n) time, O(1) space` duplicate detection? — us2012, Feb 24 '13 at 00:50
I can't find a link to the original paper (had studied it in a grad course). There's interesting information in http://en.wikipedia.org/wiki/Element_distinctness_problem. — user1952500, Feb 24 '13 at 00:59

score 0 · Answer 2 · answered Feb 23 '13 at 22:48

0

I think that this is impossible. This isn't a proof, but evidence for a conjecture. My reasoning is as follows...

First, you said that there is no bound on value of the elements (that they can be negative, 0, or positive). Second, there is only O(1) space, so we can't store more than a fixed number of values. Hence, this implies that we would have to solve this using only comparisons. Moreover, we can't sort or otherwise swap values in the array because we would lose the original ordering of unique values (and we can't store the original ordering).

Consider an array where all the integers are unique:

1, 2, 3, 4, 5, 6, 7, 8, 9, 10

In order to return the correct output 1 on this array, without reordering the array, we would need to compare each element to all the other elements, to ensure that it is unique, and do this in reverse order, so we can check the first unique element last. This would require O(n^2) comparisons with O(1) space.

I'll delete this answer if anyone finds a solution, and I welcome any pointers on making this into a more rigorous proof.

answered Feb 23 '13 at 22:48

Andrew Mao

35,740
23
143
224

"we can't ... swap values in the array because we would lose the original ordering of unique values." -- I believe this is disputable. – John Dvorak Feb 23 '13 at 22:51
1

You only need `O(n)` comparisons to show the first element is unique – John Dvorak Feb 23 '13 at 22:51
Yeah but you may need O(n) times O(n) comparisons before that to show the O(n) values before it are not unique. – G. Bach Feb 23 '13 at 22:52
@G.Bach `O(n^2)` comparisons trivially suffice. You still need a proof they're neccessary for any algorithm – John Dvorak Feb 23 '13 at 22:53
You do need at least `n` comparisons because you might find the only duplicate for the first element at the last position you look – John Dvorak Feb 23 '13 at 22:54
3

Without fiddling around with the order in which the values are stored in the array and without storing O(n) pieces of information on the worst case O(n) different values in the array, the bound of O(n^2) comparisons is tight; just take a set S of pairwise different values, put them in the first floor(n/2) places of an odd-numbered array, then a unique value k at position floor(n/2)+1, and fill the rest of the array with a random permutation of S; takes O(n^2) comparisons to see that every value in S is duplicate in the array. – G. Bach Feb 23 '13 at 23:05
So there's a couple of things that come to mind: some sort of histogram/hash structure/whatever to store the information regarding which values have already been read -> takes O(n) space worst case for O(n) different values. Sort the array beforehand -> takes Omega(n log n) for unbounded values and also distorts the order in which unique values occurred in the original array. Can't come up with anything else. – G. Bach Feb 23 '13 at 23:08

Jim Mischel · Answer 3 · 2013-02-24T00:09:58.437

0

Note: This can't work in the general case. See the reasoning below.

Original idea

Perhaps there is a solution in O(n) time and O(1) extra space.

It is possible to build a heap in O(n) time. See Building a Heap.

So you built the heap backwards, starting at the last element in the array and making that last position the root. When building the heap, keep track of the most recent item that was not a duplicate.

This assumes that when inserting an item in the heap, you will encounter any identical item that already exist in the heap. I don't know if I can prove that . . .

Assuming the above is true, then when you're done building the heap, you know which item was the first non-duplicated item.

Why it won't work

The algorithm to build a heap in place starts at the midpoint of the array and assumes that all of the nodes beyond that point are leaf nodes. It then works backward (towards item 0), sifting items into the heap. The algorithm doesn't examine the last n/2 items in any particular order, and the order changes as items are sifted into the heap.

As a result, the best we could do (and even then I'm not sure we could do it reliably) is find the first non-duplicated item only if it occurs in the first half of the array.

edited Feb 24 '13 at 00:09

answered Feb 23 '13 at 23:19

Jim Mischel

131,090
20
188
351

"This assumes that when inserting an item in the heap, you will encounter any identical item that already exist in the heap." -- I don't think that's true – John Dvorak Feb 23 '13 at 23:25
I'll be impressed to see a proof – John Dvorak Feb 23 '13 at 23:27
Why would the heap only require O(1) space? Assuming there are O(n) pairwise different values in the array, I don't see how that would work. – G. Bach Feb 23 '13 at 23:27
@G.Bach I assume the heap overwrites the input. Swapping elements and such. – John Dvorak Feb 23 '13 at 23:28
@G.Bach: Yes, the heap overwrites the input. It's just a reordering of the items in the array. See the linked article. – Jim Mischel Feb 23 '13 at 23:33
`This assumes that when inserting an item in the heap, you will encounter any identical item that already exist in the heap.` That one should be wrong, at least for any heap construction that is cheaper than `O(n^2)`: If nearly all elements are the same and you want to encounter all identical elements on every insertion, you need `O(n^2)` such encounters. – us2012 Feb 23 '13 at 23:55
@us2012: My wording was unfortunate there. I should have said, "an identical item if one exists in the heap." In any case, see my update for why this can't work in the general case. – Jim Mischel Feb 24 '13 at 00:11
@JanDvorak: See my update for why this can't work in the general case. – Jim Mischel Feb 24 '13 at 00:11
I used to believe that I have a quite good understanding of algorithmic complexity but thinking about this problem suddenly made me struggle. It seems to me that it is not even possible to keep track of the minimum given the `O(1)` space constraint. You can not store the minimum because an unbound integer does not fit into `O(1)` space. Storing the index of the minimum does not work either because it requires `log(n)` space. This is no problem for an actual implementation where `n` is surly bound by something like `2^64` but in the theoretical case? Where am I going wrong? – Daniel Brückner Feb 24 '13 at 01:10
@DanielBrückner: I've been assuming that "unbounded integer" in this case means "integer of some known, fixed size." I think that's the common usage, and this isn't a scholarly forum. – Jim Mischel Feb 24 '13 at 01:28
But if the size of the integers is bounded the solution is trivial (probably with a ridiculously large constant for the required space). If the bound is `m` just allocate enough space to keep one of three states - `never seen`, `seen once`, `seen more than once` - for each integer and iterate over the input twice. In the first pass just update the state for each number in the input, in the second pass output the first number with state `seen once`. – Daniel Brückner Feb 24 '13 at 01:45
@DanielBrückner: You're right: given infinite RAM, there is a trivial solution. Let's limit this to the real world, shall we? You have an array of 100,000,000 64-bit integers. Your computer has 1 gigabyte of RAM and no external storage (the operating system is booted from ROM). Solve this problem in O(n) time with O(1) extra space. – Jim Mischel Feb 24 '13 at 03:40
2

@jim-mischel suppose the (min) heap has 1 as root and 2 as left child, and nothing more. Insert 2. You'll insert it at the end of the heap, then you'll try to sift it up, but it won't move since it's already at the right place. No clue that his brother is also a 2. – naitoon Feb 24 '13 at 03:42
@naitoon: That's not quite how the heap building algorithm works, but your observation is essentially true nonetheless. Given the array [2,2,1], the algorithm will build the heap [1,2,2] without comparing the two values of `2`. – Jim Mischel Feb 24 '13 at 04:03

score 0 · Answer 4 · answered Jun 27 '17 at 01:47

OP's question original doesn't mention the limit of the number(although latter add number can be negative/positive/zero). Here I assume one more condition:

The number in array are all smaller than array length and non-negative.

Then, giving a O(n) time, O(1) space solution is possible and seems like a interview question, and the the test case OP gives in the question comply to above assumption.

Solution:

for (int i = 0; i < nums.length; i++) {
  if (nums[i] != i) {
    if (nums[i] == -1) continue;
      if (nums[nums[i]] == nums[i]) {
        nums[nums[i]] = -1;
      } else {
        swap(nums, nums[i], i);
        i--;
      }
    }
  }
}

for (int i = 0; i < nums.length; i++) {
  if (nums[i] == i) {
    return i;
  }
}

The algorithm here is considering the original array as bucket in bucket sort. Put numbers into its bucket, if more than twice, mark it as -1. Using another loop to find the first number that has nums[i] == i

Find First Unique Element

4 Answers4