I have an array of numbers that I want to sort and deduplicate. This answer suggests using set and sort for that kind of operation. The order of the two operations shouldn't change the result, so I timed both orderings.
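from time import perf_counter
from numpy.random import randint

n = 1000000
A = randint(0, n, n)    # n random integers in [0, n)
t1 = perf_counter()
B = sorted(set(A))      # deduplicate first, then sort
t2 = perf_counter()
C = set(sorted(A))      # sort first, then deduplicate
t3 = perf_counter()
print(t2 - t1, t3 - t2)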
>>> 0.48011 1.339263
sorted(set(A)) is roughly three times faster than set(sorted(A)).
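As a sanity check on the assumption that both orderings contain the same elements, here is a minimal sketch on a small, hypothetical list (plain Python integers rather than the NumPy array above, purely for illustration). It also shows that the two expressions return different types: one a sorted list, the other a set.

```python
# Hypothetical small input, chosen only for illustration.
A = [3, 1, 2, 3, 1]

B = sorted(set(A))  # list of the unique values in ascending order
C = set(sorted(A))  # set of the unique values; a set carries no order

# Both hold the same elements, but only B is an ordered sequence.
assert set(B) == C
assert B == sorted(C)
```

So the two expressions agree on which values survive, but only sorted(set(A)) hands back a sorted sequence.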
What makes one faster than the other? Also, is there a faster way to do this?