
As the documentation of tf.dense states, the output tensor of this layer has the same shape as the input, except that the last dimension has size units. I was trying to get similar behavior in Chainer, but without success.

In TensorFlow one can feed a (32, 28, 28, 512) tensor into a dense layer and get a (32, 28, 28, 256) tensor back. From what I have read about tf.dense, when the input has more than two dimensions it shares the weights across the leading axes and does not flatten the input before applying the transformation.
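Concretely, applying a dense layer to the last axis of a higher-rank input is equivalent to collapsing the leading axes, doing an ordinary matrix multiplication, and restoring the shape. A minimal NumPy sketch of that equivalence (with made-up small shapes standing in for the ones above, to keep it cheap):

```python
import numpy as np

# Scaled-down version of the shapes in the question:
# batch 4, spatial 3x3, 8 input features -> 5 output features.
x = np.random.randn(4, 3, 3, 8)
W = np.random.randn(8, 5)

# What a dense layer does on a >2-D input: contract the last axis only,
# sharing W across all leading axes.
y_dense = np.einsum('bhwc,co->bhwo', x, W)

# Equivalent reshape -> matmul -> reshape.
y_reshape = (x.reshape(-1, 8) @ W).reshape(4, 3, 3, 5)

assert y_dense.shape == (4, 3, 3, 5)
assert np.allclose(y_dense, y_reshape)
```

The weight matrix is the same size either way; what differs is only how the leading axes are handled.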

chainer.links.Linear, by contrast, does flatten the input, and as a result it does not fit in memory. Is it possible to get the same functionality as tf.dense in Chainer?

D3GAN

1 Answer


How about reshaping the input before and after applying L.Linear?

import chainer.functions as F
import chainer.links as L

l = L.Linear(512, 256)

# x is (32, 28, 28, 512)
s0, s1, s2, s3 = x.shape
h = F.reshape(x, (s0 * s1 * s2, s3))  # collapse the leading axes
h = l(h)                              # (s0*s1*s2, 256)
h = F.reshape(h, (s0, s1, s2, 256))   # restore the shape
# Now h should be (32, 28, 28, 256)
corochann
  • I think this would work too. But if someone wants to avoid all the reshaping, I figured that a one-by-one convolution can also be used, since it shares the weights over the 28 × 28 axes. – D3GAN Dec 20 '18 at 17:20
  • Note that Chainer's convolution assumes (N, C, H, W) order, so by default the convolution is applied on the 1st axis (counting from 0), I guess. – corochann Dec 21 '18 at 01:38
  • @corochann It is not the case. The shape of the 1D convolution's kernel is (C,), while what is required is (C*H*W,). – Yuki Hashimoto Dec 24 '18 at 07:14
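Following up on the comment thread: a 1×1 convolution does compute the same per-position linear map as the reshape route, provided the channels sit on axis 1 first (Chainer's Convolution2D expects (N, C, H, W) input, as corochann notes). A hedged NumPy sketch of that equivalence, with made-up small shapes:

```python
import numpy as np

# Small NCHW input: batch 2, 8 channels, 3x3 spatial.
x = np.random.randn(2, 8, 3, 3)
# 1x1 convolution weights mapping 8 channels -> 5 channels.
W = np.random.randn(5, 8)  # (out_channels, in_channels)

# A 1x1 convolution is a linear map over the channel axis,
# shared across every spatial position.
y_conv = np.einsum('oc,bchw->bohw', W, x)

# Same result via the reshape route from the answer:
# move channels last, flatten, matmul, restore, move channels back.
x_last = np.moveaxis(x, 1, -1)                       # (2, 3, 3, 8)
y_lin = (x_last.reshape(-1, 8) @ W.T).reshape(2, 3, 3, 5)
y_lin = np.moveaxis(y_lin, -1, 1)                    # (2, 5, 3, 3)

assert y_conv.shape == (2, 5, 3, 3)
assert np.allclose(y_conv, y_lin)
```

So for a channels-last tensor like the (32, 28, 28, 512) one in the question, the convolution trick requires a transpose to NCHW first, which is the mismatch the last comment points out.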