Efficient multinomial sampling for sparse array/tensor in python

Question

I have a sparse array/tensor like the one below.

import torch
from torch_sparse import SparseTensor


row = torch.tensor([0, 0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3])
col = torch.tensor([1, 2, 3, 0, 2, 0, 1, 4, 5, 0, 2, 5, 2, 4])
value = torch.rand([14])
adj_t = SparseTensor(row=row, col=col, value=value, sparse_sizes=(4, 9))

I want to sample n_samples column indices per row, with or without replacement. I can do this by first converting adj_t to dense and then using torch.multinomial, or similarly with numpy.random.choice.

sample = torch.multinomial(adj_t.to_dense(), num_samples=2, replacement=True)

But converting the sparse array to dense and then calling torch.multinomial is not very efficient. Is there a sparse version of torch.multinomial? If not, how would one go about implementing it?

score 0 · Answer 1 · answered Aug 01 '21 at 17:50

I am not sure if this can be done as efficiently as your one-liner.

From what I understand, one way to achieve what you want is to:

Group the values by the row in which they appear in the sparse tensor, e.g. using this solution: np.split(value.numpy(), np.unique(row.numpy(), return_index=True)[1][1:])
Use numpy.random.multinomial to draw the randomly chosen indices for every row (note that it returns per-entry counts; numpy.random.choice with p= returns the indices directly)
Map the indices to the respective values from col (i.e. 0 in the 0th row is 1, 1 in the 1st row is 2, 2 in the 2nd row is 4, all according to the row and col values)
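
The three steps above can be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the function name sample_columns_per_row is made up for this example, it works on plain NumPy arrays (e.g. row.numpy(), col.numpy(), value.numpy()), it assumes row is sorted as in the question, and it swaps in Generator.choice for the sampling step because that returns indices rather than counts.

```python
import numpy as np


def sample_columns_per_row(row, col, value, num_samples, replace=True, seed=None):
    """Sample column indices for each row, weighted by `value`.

    Hypothetical helper for illustration; assumes `row` is sorted.
    """
    rng = np.random.default_rng(seed)
    # Step 1: find where each row's run of entries starts and split by row.
    rows, starts = np.unique(row, return_index=True)
    col_groups = np.split(col, starts[1:])
    val_groups = np.split(value, starts[1:])

    samples = {}
    for r, cols, vals in zip(rows, col_groups, val_groups):
        # Step 2: draw local positions within the row, proportional to the values.
        p = vals / vals.sum()
        local = rng.choice(len(cols), size=num_samples, replace=replace, p=p)
        # Step 3: map local positions back to the real column indices.
        samples[int(r)] = cols[local]
    return samples
```

With replace=False, each row needs at least num_samples entries with non-zero weight, otherwise Generator.choice raises a ValueError. Rows with no stored entries simply do not appear in the result.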

You will probably want to avoid explicit Python-level loops over the rows so that performance does not drop.
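
One loop-free option, for sampling with replacement only, is inverse-CDF sampling over a single cumulative sum of all stored values. The sketch below is an assumption-laden illustration rather than an established API: sample_columns_vectorized is a hypothetical name, the inputs are NumPy arrays with row sorted and values non-negative, and rows without stored entries are reported as -1.

```python
import numpy as np


def sample_columns_vectorized(row, col, value, num_rows, num_samples, seed=None):
    """Sample column indices per row with replacement, without a Python loop.

    Hypothetical helper for illustration; assumes `row` is sorted.
    """
    rng = np.random.default_rng(seed)
    # cum0[i] is the total weight of all stored entries before entry i.
    cum0 = np.concatenate(([0.0], np.cumsum(value)))
    counts = np.bincount(row, minlength=num_rows)
    ends = np.cumsum(counts)        # one past the last entry of each row
    starts = ends - counts          # first entry of each row
    lo, hi = cum0[starts], cum0[ends]

    # Draw a uniform number inside each row's slice of the cumulative sum.
    u = lo[:, None] + rng.random((num_rows, num_samples)) * (hi - lo)[:, None]
    # Entry i owns the interval [cum0[i], cum0[i + 1]).
    idx = np.searchsorted(cum0[1:], u, side="right")
    # Guard against floating-point rounding pushing a draw into a neighbouring row.
    idx = np.clip(idx, starts[:, None], ends[:, None] - 1)

    out = col[idx]
    out[counts == 0] = -1           # rows with no stored entries
    return out
```

Because a single cumulative sum spans all entries, floating-point precision can degrade for very large tensors; computing the cumulative sum in float64 helps. Sampling without replacement needs a different technique and is not covered by this sketch.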
