5

The minimum unique number in an array is defined as min{v | v occurs only once in the array}. For example, the minimum unique number of {1, 4, 1, 2, 3} is 2. Is there any way better than sorting?
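A minimal sketch of the sorting approach I have in mind (the helper name is only illustrative): sort, then return the first element that differs from both of its neighbours, which is O(n log n) because of the sort.

def min_unique_by_sorting(values):
    # sort, then return the first element that differs from both
    # neighbours; the sort dominates, so this is O(n log n)
    a = sorted(values)
    n = len(a)
    for i in range(n):
        if (i == 0 or a[i] != a[i - 1]) and (i == n - 1 or a[i] != a[i + 1]):
            return a[i]
    return None   # no unique element

min_unique_by_sorting([1, 4, 1, 2, 3])   # -> 2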

walrii
shilk

3 Answers

5

I believe this is an O(N) solution in both time and space:

HashSet seenOnce;     // sufficiently large that access is O(1)
HashSet seenMultiple; // sufficiently large that access is O(1)

for each item in input // O(N)
    if item in seenMultiple
        next
    if item in seenOnce
        remove item from seenOnce
        add item to seenMultiple
    else
        add item to seenOnce

smallest = SENTINEL
for each item in seenOnce // worst case, O(N)
    if item < smallest
        smallest = item

If you have a limited range of integral values, you can replace the HashSets with BitArrays indexed by the value.
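A quick Python sketch of the same idea (the built-in set stands in for the hash sets; the function name is only illustrative):

def min_unique_two_sets(values):
    seen_once = set()       # expected O(1) membership tests
    seen_multiple = set()
    for item in values:     # one O(N) pass
        if item in seen_multiple:
            continue
        if item in seen_once:
            seen_once.remove(item)
            seen_multiple.add(item)
        else:
            seen_once.add(item)
    # scan the survivors for the minimum; worst case O(N)
    return min(seen_once, default=None)

min_unique_two_sets([1, 4, 1, 2, 3])   # -> 2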

walrii
    This is all great sir, but where can I find this magical O(1) random access hash function you speak of? As far as I (and the rest of the world) know the "best" random access time for hash function is O(logn). – ElKamina Aug 14 '12 at 03:39
  • Unless I'm missing something, no random access is needed. The first pass loops over `input` (O(N)) and does lookups, insertions, and deletions on the hash sets, each O(1). Total: O(N). The second pass as written currently loops over `seenOnce`, but that's easily fixed: simply loop over `input` again (O(N)), test membership in `seenOnce` (O(1)), and if it's there append it to a new `seenOnceList` (O(1)), total O(N). Finally do a scan for the minimum, O(N). Net O(N), no? – DSM Aug 14 '12 at 04:10
  • @ElKamina - You are confusing a tree with a hash. See http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions and look at the list under O(1). With a large table and an appropriate hash function, collisions are insignificant. – walrii Aug 14 '12 at 05:47
  • @DSM - my loop over seenOnce is what you called "a scan for the minimum". What is the purpose of the extra loop you added? – walrii Aug 14 '12 at 05:51
  • @DSM - The advantage of the array optimization is that the hash function = I() and no collision handling is needed. No Big-O advantage, granted, but simpler and faster. – walrii Aug 14 '12 at 05:57
  • It is not an O(N) algorithm. You assume searching for an integer in a set of size N can be done in O(1), which is wrong. With some "evil" input, your algorithm's run-time complexity will be O(N^2), so in the worst case your algorithm is O(N^2). – barak1412 Aug 14 '12 at 11:46
  • @barak1412 - As documented, given sufficiently large hash tables and an appropriate hash function, the _expected_ complexity is O(N). If you violate these premises, then yes, you can venture into _worst-case_ complexities. That is also why the array optimization was mentioned; it removes that worst case. – walrii Aug 14 '12 at 12:25
  • walrii hash can't do any magic for you, it's not O(1) even in theory. – Saeed Amiri Aug 14 '12 at 13:32
  • @barak1412 - Sorry, you are wrong. A hash table's EXPECTED or AVERAGE insertion, access, and delete time is O(1). See http://en.cppreference.com/w/cpp/container/unordered_set, http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions and others. But comments are not the place to dispute it. I suggest you go vote and argue on http://stackoverflow.com/questions/222658/multiset-map-and-hash-map-complexity – walrii Aug 15 '12 at 03:04
  • @walrii EXPECTED != worst-case. So instead of "I believe this is an O(N) solution" you should say EXPECTED O(N) time. For some applications, O(n log n) worst-case is better than O(n) expected with O(n^2) worst-case. – barak1412 Aug 15 '12 at 15:48
0

You don't need to do a full sort. Repeat the bubble-sort inner loop, which pushes the minimum of the remaining elements to one end, until the value that settles there is distinct. In the best case this has time complexity O(k * n), where k is the number of non-distinct minimum values. The worst-case complexity, however, is O(n * n). So this can be efficient when the expected value of k is much smaller than n.

I think this would be the minimum possible time complexity unless you can adapt an O(n log n) sorting algorithm to the above task.
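A rough Python sketch of this idea (the function name is only illustrative): each outer pass is one bubble-sort inner loop that sinks the smallest remaining element into position, and we stop as soon as an element just placed differs from both of its neighbours.

def min_unique_partial_bubble(values):
    a = list(values)
    n = len(a)
    for i in range(n):
        # one bubble-sort inner loop: sink the smallest of a[i:] into a[i]
        for j in range(n - 1, i, -1):
            if a[j] < a[j - 1]:
                a[j], a[j - 1] = a[j - 1], a[j]
        # the element placed on the previous pass is unique iff it differs
        # from both neighbours (everything from a[i] onward is >= a[i-1])
        if i > 0 and a[i - 1] != a[i] and (i == 1 or a[i - 1] != a[i - 2]):
            return a[i - 1]
    # once the array is fully sorted, only the last element can still be unique
    if n > 0 and (n == 1 or a[-1] != a[-2]):
        return a[-1]
    return None

min_unique_partial_bubble([1, 4, 1, 2, 3])   # -> 2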

PermanentGuest
0

Python version using a dictionary.
Time complexity O(n) and space complexity O(n):

from collections import defaultdict

# count how many times each value occurs;
# the first input line is the element count, then one element per line
d = defaultdict(int)
for _ in range(int(input())):
    ele = int(input())
    d[ele] += 1

# find the smallest value that occurs exactly once
m = float('inf')
for i in d:
    if d[i] == 1 and i < m:
        m = i
print(m if m != float('inf') else -1)

Please tell me if there is a better approach.

סטנלי גרונן