-4
['a', 'c','a','b','b'] -> ['a', 'c', 'b']

Searching online didn't help. I'm not sure whether this is even possible.

Preferably write solution in Python, but anything is fine.

=====

Edit: I originally said O(1) space, but I don't think it's possible.

Changing the requirements:

No space requirements, but you must modify the input in-place and return it.

onepiece
  • 3,279
  • 8
  • 44
  • 63
  • 9
    You're going to have to show us what you've tried. Stackoverflow isn't just a function that you put your problem into and it spits out an answer. – lintmouse Oct 08 '15 at 23:52
  • 2
    It's not possible. Doable with O(n) space. – Karoly Horvath Oct 08 '15 at 23:53
  • 1
    If the array/data is sorted I think it can be done. (Like the reduce step in a map reduce word count http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/) – viper Oct 09 '15 at 00:02
  • @viper yeah a reduce or accumulate call would work with sorted. – Aidan Gomez Oct 09 '15 at 00:03
  • Possible duplicate of [How do you remove duplicates from a list in Python whilst preserving order?](http://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-in-python-whilst-preserving-order) – Aidan Gomez Oct 09 '15 at 00:03
  • 2
    @AidanGomez: Links to related material are welcome, but this is clearly not a dup. – Karoly Horvath Oct 09 '15 at 00:04
  • what if you newly allocate it and then overwrite the memory of the original array(ala `a[:]=new_list`? or do you mean accomplish it __only__ by removal of elements from the array (ie del/pop) – Joran Beasley Oct 09 '15 at 00:09
  • @JoranBeasley. Regarding your two questions: 1) I wasn't clear enough, I mean you have to modify the input in-place and return it. Updated the question. 2) `del` from a list is an O(n) operation. I'm guessing you want to call it while linearly traversing the input, making it an O(n^2) algorithm. – onepiece Oct 09 '15 at 00:17

7 Answers7

3

How about this? I think the time complexity and the space complexity are both O(n).

def unique(arr):
    index = 0
    s = set()
    for i in range(len(arr)):
        if arr[i] not in s:
            arr[index] = arr[i]
            index += 1
            s.add(arr[i])
    arr[index:] = []
Paul Boddington
  • 37,127
  • 10
  • 65
  • 116
2

I don't think this is possible in O(n) time and O(1) space. To get O(n) time you have to have a scratchpad (with O(1) insert and lookup, e.g. a set) recording all the elements you've already seen; that requires O(m) space (where m is the number of unique elements). Conversely, to get O(1) space you have to rescan earlier elements in the array for each element, which is O(n2) time.

If you'll accept an O(m) scratchpad, this fulfills the "modify the array in place" requirement and runs in O(n) time.

def remove_duplicates(arr):
    """Remove all duplicated elements from ARR in O(m) space and O(n) time.
       ARR is modified in place and is not returned (like `sort`)."""
    n = len(arr)
    if n <= 1: return
    seen = set((arr[0],))
    i = 1
    j = 1
    while i < n:
        if arr[i] not in seen:
           seen.add(arr[i])
           arr[j] = arr[i]
           j += 1
        i += 1
    del arr[j:i]

This would still be O(n2) time if I deleted elements from the array as I iterated over it, because del arr[i] is itself an O(n) operation: it has to slide everything after i down one. del arr[j:i] where i is past the end of the array should be O(1), but if it really matters, you'd have to check the implementation.

And if you must have O(1) space, here's the O(n2) time algorithm:

def remove_duplicates_quadratic(arr):
    """Remove all duplicated elements from ARR in O(1) space and O(n²) time.
       ARR is modified in place and is not returned (like `sort`)."""
    n = len(arr)
    i = 0
    while i < n:
       for j in range(i):
           if arr[j] == arr[i]:
               del arr[i]
               n -= 1
               break
       else:
           i += 1

I'm doing manual array indexing in these because deleting elements from an array you're iterating over using any of the usual for..in constructs is unsafe. You could use a for..in loop in the first one, but you'd lose parallel structure so it would be less readable.

In case you haven't seen it before, an else clause on a loop is executed only if the loop is not exited via break.

zwol
  • 135,547
  • 38
  • 252
  • 361
  • I had to look at this for quite a while to convince my self it works as intended .... that said +1 since it meets the O(1) space requirement ... – Joran Beasley Oct 09 '15 at 00:04
  • I think so far this is actually the only solution that meets OP's requirements .. – Joran Beasley Oct 09 '15 at 00:07
  • Looks good for O(1) space. Sorry that I initially had allegedly impossible requirements of O(1) space and O(n) time. I chose [Paul's answer](http://stackoverflow.com/a/33028026/1661745) because his last line `arr[index:] = []` looks more succint and I think it's O(1) time, versus O(i - j) for your `del arr[j:i]`. Anyways +1. – onepiece Oct 09 '15 at 00:50
  • `del arr[j:i]` *ought* to be O(1) in this case, but I haven't actually read through the code I linked to, to find out whether that's true. – zwol Oct 09 '15 at 00:53
0
from collections import OrderedDict

OrderedDict.fromkeys(my_array).keys()
Joran Beasley
  • 110,522
  • 12
  • 160
  • 179
0
pip install oset

import oset

ll = oset.oset(['a', 'c','a','b','b'])

print(ll)

OrderedSet(['a', 'c', 'b'])`
LetzerWille
  • 5,355
  • 4
  • 23
  • 26
0

To keep the speed efficient, you need to use O(n) space to record records that already exist. If there are a large amount of duplicates it won't take much space. However, the del operation on a standard array is O(n) because it has to shift all the following elements.

def uniquify(data):
    used = set()
    i = 0
    while i < len(data):
        if data[i] in used:
            del data[i] # or data.pop(i)
        else:
            used.add(data[i])
            i += 1
    return data # not strictly necessary because array is modified in-place
Darcinon
  • 159
  • 6
0

O(n) time, O(n) space

uniqueList = []
index = 0
while index < len(myList):
    item = myList[index]
    if item not in uniqueList: uniqueList.append(item)
    else: myList.pop(index)
    index += 1

Note: Do not modify a list you are iterating through like below:

for index, item in enumerate(myListCopy):
    if item not in uniqueList: uniqueList.append(item)
    else: myList.pop(index)

O(n^2) time, O(1) space It certainly is bounded by n^2, but in reality it's less.

index = 0
while index < len(myList):
    subIndex = index + 1
    while subIndex < len(myList):
        if myList[subIndex] == myList[index]: myList.pop(subIndex)
        subIndex += 1
    index += 1

O(n) time, O(1) space If your list is sorted and you don't mind removing the first duplicates.

for item in myList:
    if myList.count(item) > 1: myList.remove(item)
Aidan Gomez
  • 8,167
  • 5
  • 28
  • 51
0
lst1= ['a', 'c','a','b','b'] #-> ['a', 'c', 'b']
lst2 = []

for l in lst1:
   if l not in lst2:
       lst2.append(l)
 print lst2
SuperNova
  • 25,512
  • 7
  • 93
  • 64