I have many large 1D arrays and I'd like to grab the unique values. Typically, one could do:
import numpy as np
x = np.random.randint(10000, size=100000000)
np.unique(x)
However, this performs an unnecessary sort of the array. The docs for np.unique do not mention any way to retrieve the unique values without sorting. Other answers involving np.unique suggest using return_index but, as I understand it, the array is still sorted. So I tried using set:
set(x)
But this is much slower than sorting the array with np.unique. Is there a way to retrieve the unique values of this array that avoids sorting and is faster than np.unique?
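For anyone who wants to reproduce the comparison, here is a minimal timing sketch of the two approaches above. It assumes a much smaller array than the one in the question so that it finishes quickly, and the helper time_call is a hypothetical name introduced only for illustration. Actual timings will depend on the machine and the NumPy version.

```python
import time

import numpy as np


def time_call(fn, *args):
    """Hypothetical helper: return (result, elapsed_seconds) for one call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Assumption: 1e6 elements instead of 1e8, so the sketch runs in seconds.
rng = np.random.default_rng(0)
x = rng.integers(10000, size=1_000_000)

# Sort-based approach from the question.
sorted_unique, t_unique = time_call(np.unique, x)

# Hash-based approach from the question; iterating the array yields one
# NumPy scalar object per element, which is done at Python speed.
set_unique, t_set = time_call(set, x)

# Both approaches should agree on the distinct values.
assert set(sorted_unique.tolist()) == {int(v) for v in set_unique}

print(f"np.unique: {t_unique:.3f} s, set: {t_set:.3f} s")
```

A single call per approach is only a rough measurement; repeating each call several times (for example with the timeit module) would give steadier numbers.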