I don't think this is possible in O(n) time and O(1) space. To get O(n) time you have to have a scratchpad (with O(1) insert and lookup, e.g. a set
) recording all the elements you've already seen; that requires O(m) space (where m is the number of unique elements). Conversely, to get O(1) space you have to rescan earlier elements in the array for each element, which is O(n2) time.
If you'll accept an O(m) scratchpad, this fulfills the "modify the array in place" requirement and runs in O(n) time.
def remove_duplicates(arr):
"""Remove all duplicated elements from ARR in O(m) space and O(n) time.
ARR is modified in place and is not returned (like `sort`)."""
n = len(arr)
if n <= 1: return
seen = set((arr[0],))
i = 1
j = 1
while i < n:
if arr[i] not in seen:
seen.add(arr[i])
arr[j] = arr[i]
j += 1
i += 1
del arr[j:i]
This would still be O(n2) time if I deleted elements from the array as I iterated over it, because del arr[i]
is itself an O(n) operation: it has to slide everything after i
down one. del arr[j:i]
where i
is past the end of the array should be O(1), but if it really matters, you'd have to check the implementation.
And if you must have O(1) space, here's the O(n2) time algorithm:
def remove_duplicates_quadratic(arr):
"""Remove all duplicated elements from ARR in O(1) space and O(n²) time.
ARR is modified in place and is not returned (like `sort`)."""
n = len(arr)
i = 0
while i < n:
for j in range(i):
if arr[j] == arr[i]:
del arr[i]
n -= 1
break
else:
i += 1
I'm doing manual array indexing in these because deleting elements from an array you're iterating over using any of the usual for..in constructs is unsafe. You could use a for..in loop in the first one, but you'd lose parallel structure so it would be less readable.
In case you haven't seen it before, an else
clause on a loop is executed only if the loop is not exited via break
.