I don't know how pytorch implements topk
for CPU tensors. However, since you are working on CPU, you can use existing partial sorting implementations for numpy arrays.
For example, using the bottleneck.argpartition
:
import bottleneck
with torch.no_grad():
topi = bottleneck.argpartition(outline.numpy(), kth=beam_size)
topv = outline[topi] # allow gradients to propagate through indexing
Note that the efficient implementation of bottleneck.argpartition
does not sort the array and therefore you are guaranteed that topv
are indeed larger than all other elements of the array, but the values in topv
are not sorted and it might be the case that topv[i] < topv[i+1]
for some i
s.