
Since I need to repeat an array over a specific axis, I want to avoid unnecessary memory allocation as much as possible.

For example, given a numpy array A of shape (3, 4, 5), I want to create a view B of shape (3, 4, 100, 5) on the original A, where a new third axis is inserted and repeated 100 times.

In numpy, this can be achieved like this:

    B=numpy.repeat(A.reshape((3, 4, 1, 5)), repeats=100, axis=2)

or:

    B = numpy.broadcast_to(A.reshape((3, 4, 1, 5)), (3, 4, 100, 5))

The former allocates new memory and copies the data, while the latter just creates a view over A without any extra allocation. This can be verified with the method described in the answer to Size of numpy strided array/broadcast array in memory?
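
Here is a minimal sketch of that check (my own illustration, not the linked answer verbatim): a broadcast view has stride 0 along the repeated axis, while the repeat result owns real memory for every element.

    import numpy

    A = numpy.arange(60, dtype=numpy.float64).reshape(3, 4, 5)

    B_copy = numpy.repeat(A.reshape((3, 4, 1, 5)), repeats=100, axis=2)
    B_view = numpy.broadcast_to(A.reshape((3, 4, 1, 5)), (3, 4, 100, 5))

    print(B_copy.strides)  # (16000, 4000, 40, 8): contiguous, fully allocated
    print(B_view.strides)  # (160, 40, 0, 8): stride 0 on the repeated axis
    print(B_view.base is not None)  # True: B_view is a view, not a copy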

In theano, however, theano.tensor.repeat seems to be the only way to do this, which is of course not preferable.

I wonder if there is a `numpy.broadcast_to`-like theano method that can do this efficiently?


1 Answer


There is a nice method, dimshuffle, which makes a theano variable broadcastable over a given dimension:

    import theano.tensor

    At = theano.tensor.tensor3()
    Bt = At.dimshuffle(0, 1, 'x', 2)  # 'x' inserts a broadcastable axis

Now you've got a tensor variable of shape (3, 4, 'x', 5), where 'x' marks a broadcastable dimension that can expand to whatever size you need.
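
You can confirm this by inspecting the variable's broadcastable pattern (a quick check of my own):

    print(Bt.broadcastable)  # (False, False, True, False)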

To materialize the repeats, broadcast Bt against a zeros tensor of the target shape:

    Ct = theano.tensor.zeros((Bt.shape[0], Bt.shape[1], 100, Bt.shape[3])) + Bt

Example

    import numpy as np

    f = theano.function([At], [Bt, Ct])
    A = np.random.random((3, 4, 5)).astype(np.float32)
    B, C = f(A)
    print(B.shape)
    print(C.shape)

(3, 4, 1, 5)

(3, 4, 100, 5)

Unless you actually need the materialized array, it's better to work with the variable Bt, which stays broadcastable and costs no extra memory.
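
For example (a sketch of my own, reusing At, Bt, A, and np from above), Bt can be combined directly with another 4d tensor, and theano broadcasts the 'x' axis on the fly without ever allocating the repeated copy:

    Dt = theano.tensor.tensor4()
    Et = Bt + Dt  # the 'x' axis broadcasts against Dt's third axis
    g = theano.function([At, Dt], Et)
    D = np.random.random((3, 4, 100, 5)).astype(np.float32)
    print(g(A, D).shape)  # (3, 4, 100, 5)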

  • First of all, thank you for your answer. But in this way, `Ct` still needs extra memory for the newly created zero 4d-tensor, so its memory efficiency seems to be the same as the theano.repeat operation. – Tqri Oct 13 '16 at 14:22