Within a loop that collects some samples, I need to obtain some statistics about their sorted indices every now and then, for which argsort returns exactly what I need. However, each iteration adds only a single sample, and it is a huge waste of resources to keep passing the whole samples array to the argsort function, especially since the samples array is very huge. Is not there an incremental efficient technique equivalent to argsort?
I believe an efficient incremental argsort function can be implemented by maintaining an ordered list of samples, which can be searched for the proper insertion indices once a new sample arrives. Such indices can be then utilized to both maintain the order of the samples list as well as to generate the incremental argsort-like desired output.
So far, I have utilized the searchsorted2d function by @Divakar, with slight modifications to obtain the insertion indices, and built some routine that can get the desired output if it is called after each sample insertion (b = 1
).
Yet, this is inefficient, and I would like to call the routine after the collection of kth samples (e.g. b = 10
). In the case of bulk insertions, searchsorted2d
seems to return incorrect indices, and that is were I stopped!
import time
import numpy as np
# By Divakar
# See https://stackoverflow.com/a/40588862
def searchsorted2d(a, b):
m, n = a.shape
max_num = np.maximum(a.max() - a.min(), b.max() - b.min()) + 1
r = max_num * np.arange(m)[:,np.newaxis]
p = np.searchsorted((a + r).ravel(), (b + r).ravel()).reshape(b.shape)
return p #- n * (np.arange(m)[:,np.newaxis])
# The following works with batch size b = 1,
# but that is not efficient ...
# Can we make it work for any b > 0 value?
class incremental(object):
def __init__(self, shape):
# Express each row offset
self.ranks_offset = np.tile(np.arange(shape[1]).reshape(1, -1),
(shape[0], 1))
# Storage for sorted samples
self.a_sorted = np.empty((shape[0], 0))
# Storage for sort indices
self.a_ranks = np.empty((shape[0], 0), np.int)
def argsort(self, a):
if self.a_sorted.shape[1] == 0: # Use np.argsort for initialization
self.a_ranks = a.argsort(axis=1)
self.a_sorted = np.take_along_axis(a, self.a_ranks, 1)
else: # In later itterations,
# searchsorted the input increment
indices = searchsorted2d(self.a_sorted, a)
# insert the stack pos to track the sorting indices
self.a_ranks = np.insert(self.a_ranks, indices.ravel(),
self.ranks_offset.ravel() +
self.a_ranks.shape[1]).reshape((n, -1))
# insert the increments to maintain a sorted input array
self.a_sorted = np.insert(self.a_sorted, indices.ravel(),
a.ravel()).reshape((n, -1))
return self.a_ranks
M = 1000 # number of iterations
n = 16 # vector size
b = 10 # vectors batch size
# Storage for samples
samples = np.zeros((n, M)) * np.nan
# The proposed approach
inc = incremental((n, b))
c = 0 # iterations counter
tick = time.time()
while c < M:
if c % b == 0: # Perform batch computations
#sample_ranks = samples[:,:c].argsort(axis=1)
sample_ranks = inc.argsort(samples[:,max(0,c-b):c]) # Incremental argsort
######################################################
# Utilize sample_ranks in some magic statistics here #
######################################################
samples[:,c] = np.random.rand(n) # collect a sample
c += 1 # increment the counter
tock = time.time()
last = ((c-1) // b) * b
sample_ranks_GT = samples[:,:last].argsort(axis=1) # Ground truth
print('Compatibility: {0:.1f}%'.format(
100 * np.count_nonzero(sample_ranks == sample_ranks_GT) / sample_ranks.size))
print('Elapsed time: {0:.1f}ms'.format(
(tock - tick) * 1000))
I would expect 100% compatibility with the argsort function, yet it need to be more efficient than calling argsort. As for execution time with an incremental approach, it seems that 15ms or so should be more than enough for the given example. So far, only one condition of these two can be met with any of the explored techniques.
To make a long story short, the shown above algorithm seems to be a variant of an order-statistic tree to estimate the data ranks, but it fails to do so when samples are added in bulk (
b > 1
). So far, it only works when inserting samples one by one (b = 1
). However, the arrays are copied every time insert
is called, which causes a huge overhead and forms a bottleneck, therefore samples shall be added in bulks rather than individually.
Can you introduce more efficient incremental argsort algorithm, or at least figure out how to support bulk insertion (b > 1
) in the above one?
If you choose to start from where I stopped, then the problem can be reduced to fixing the bug in following snapshot:
import numpy as np
# By Divakar
# See https://stackoverflow.com/a/40588862
def searchsorted2d(a, b):
m, n = a.shape
max_num = np.maximum(a.max() - a.min(), b.max() - b.min()) + 1
r = max_num * np.arange(m)[:,np.newaxis]
p = np.searchsorted((a + r).ravel(), (b + r).ravel()).reshape(b.shape)
# It seems the bug is around here...
#return p - b.shape[0] * np.arange(b.shape[1])[np.newaxis]
#return p - b.shape[1] * np.arange(b.shape[0])[:,np.newaxis]
return p
n = 16 # vector size
b = 2 # vectors batch size
a = np.random.rand(n, 1) # Samples array
a_ranks = a.argsort(axis=1) # Initial ranks
a_sorted = np.take_along_axis(a, a_ranks, 1) # Initial sorted array
new_data = np.random.rand(n, b) # New block to append into the samples array
a = np.hstack((a, new_data)) #Append new block
indices = searchsorted2d(a_sorted, new_data) # Compute insertion indices
ranks_offset = np.tile(np.arange(b).reshape(1, -1), (a_ranks.shape[0], 1)) + a_ranks.shape[1] # Ranks to insert
a_ranks = np.insert(a_ranks, indices.ravel(), ranks_offset.ravel()).reshape((n, -1)) # Insert ransk according to their indices
a_ransk_GT = a.argsort(axis=1) # Ranks ground truth
mask = (a_ranks == a_ransk_GT)
print(mask) #Why they are not all True?
assert(np.all(mask)), 'Oops!' #This should not fail, but it does :(
It seems the bulk insertion is more involved that what I initially thought, and that
searchsorted2d
is not to be blamed. Take the case of a sorted array a = [ 1, 2, 5 ]
, and two new elements block = [3, 4]
to be inserted. If we iterate and insert, then np.searchsorted(a, block[i])
would return [2]
and [3]
, and that is just OK. However, if we call np.searchsorted(a, block)
(the desired behavior - equivalent to iteration without insertion), we would get [2, 2]
. This is problematic for implementing an incremental argsort
, since even np.searchsorted(a, block[::-1])
would result in the same. Any idea?