
Sorry, I am new to deep learning and Keras. I am trying to define a custom layer myself.

I looked at the Keras documentation: https://keras.io/api/layers/base_layer/#layer-class

import tensorflow as tf
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):

    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):  # Create the state of the layer (weights)
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'),
            trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):  # Defines the computation from inputs to outputs
        return tf.matmul(inputs, self.w) + self.b

# Instantiates the layer.
linear_layer = SimpleDense(4)

I understand that the __init__ method is called when I create linear_layer, and that the call method is called when I pass inputs to linear_layer. But I don't understand when the build method is called and, more specifically, how input_shape is specified. Nothing in the code passes it explicitly, so I don't know what ends up in the input_shape argument.

Besides, I want to define a parameter with a fixed size, which is (1, 768) in my case. Should I still use input_shape in the build method then?

R__

1 Answer


To understand this SimpleDense layer and answer your questions, start with its weight and bias. The weight is initialized with random numbers and the bias with zeros; during training both are updated to minimize the loss.

Answer to the first question: the build method is called only once, the first time the layer is used. Keras calls it automatically with the shape of that first input, which is where input_shape comes from. That is when the weight and bias are created and set to random numbers and zeros. The call method, in contrast, runs for every training batch.

Answer to the second question: yes. In the call method we have access to one batch of data, and the first dimension is the batch dimension.

I wrote an example that prints a message whenever the build and call methods run, along with the shapes of the input and output data, to make the explanation above concrete.
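A quick way to check when build runs is to call the layer eagerly on a tensor and look at `layer.built` before and after. The sketch below is only an illustration and not part of the original example: the class name `ShapeLoggingDense` and the `build_log` list are hypothetical, and it creates the weights with `add_weight` instead of `tf.Variable`.

```python
import tensorflow as tf

# Hypothetical log kept outside the layer so we can inspect it afterwards.
build_log = []

class ShapeLoggingDense(tf.keras.layers.Layer):
    def __init__(self, units=4):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Record the shape that Keras hands to build.
        build_log.append(tuple(input_shape))
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal",
                                 trainable=True, name="w")
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros",
                                 trainable=True, name="b")

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

layer = ShapeLoggingDense(4)
built_before = layer.built        # no input seen yet, so no weights exist

out1 = layer(tf.ones((3, 2)))     # first call: build runs with this input's shape
out2 = layer(tf.ones((7, 2)))     # second call: build is not run again

print(built_before, layer.built)
print(build_log)
```

Here the first call passes a (3, 2) tensor, so build receives that shape once; the second call uses a different batch size but reuses the same weights.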

In the example below:

  1. I use batch_size = 5 and 25 samples, so in each batch the call method sees 5 samples.
  2. The layer is created and built once, so the build method is called once; the single epoch has 5 batches, so the call method is called 5 times.
  3. units = 4 and the data shape is (25, 2) [samples, features], so total params = 12: 2*4 (features*units, the weights) + 4 (the biases).
  4. At the end, I attach an image that shows how matmul works, why the output shape is (5, 4), and the formula input*weight + bias.

import tensorflow as tf

class SimpleDense(tf.keras.layers.Layer):
  def __init__(self, units=32):
      super(SimpleDense, self).__init__()
      self.units = units

  def build(self, input_shape):  # Create the state of the layer (weights)
    tf.print('calling build method')
    w_init = tf.random_normal_initializer()
    self.w = tf.Variable(
        initial_value=w_init(shape=(input_shape[-1], self.units),
                             dtype='float32'),trainable=True)
    b_init = tf.zeros_initializer()
    self.b = tf.Variable(initial_value=b_init(shape=(self.units,), 
                                              dtype='float32'),trainable=True)

  def call(self, inputs):  # Defines the computation from inputs to outputs
      tf.print('\ncalling call method')
      tf.print(f'input shape : {inputs.shape}')
      out = tf.matmul(inputs, self.w) + self.b
      tf.print(f'output shape : {out.shape}')
      return out

model = tf.keras.Sequential()
model.add(SimpleDense(units = 4))
model.compile(optimizer = 'adam',loss = 'mse',)
model.fit(tf.random.uniform((25, 2)), tf.ones((25, 1)), batch_size = 5)
model.summary()

Output:

calling build method

calling call method
input shape : (5, 2)
output shape : (5, 4)
1/5 [=====>........................] - ETA: 1s - loss: 0.9794
calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)
5/5 [==============================] - 0s 15ms/step - loss: 0.9770
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 simple_dense (SimpleDense)  (5, 4)                    12        
                                                                 
=================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
_________________________________________________________________
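The shapes and the parameter count in the summary above can be reproduced with plain NumPy. This is a minimal sketch that only mirrors the arithmetic of the layer; the variable names are chosen here for illustration, and the random values are unrelated to the ones Keras generated.

```python
import numpy as np

batch_size, features, units = 5, 2, 4

rng = np.random.default_rng(0)
x = rng.uniform(size=(batch_size, features))   # one batch of input, shape (5, 2)
w = rng.normal(size=(features, units))         # weight matrix, shape (2, 4)
b = np.zeros(units)                            # bias vector, shape (4,)

# (5, 2) @ (2, 4) gives (5, 4); the bias is broadcast across the 5 rows.
out = x @ w + b

# 2*4 weights plus 4 biases.
n_params = w.size + b.size

print(out.shape, n_params)
```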

[Image: how matmul works and why the output shape is (5, 4), with the formula input*weight + bias]
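For the fixed-size parameter from the question, one possible approach is to hard-code the shape and simply not use input_shape. The sketch below assumes a hypothetical layer `FixedParam` that adds a trainable (1, 768) parameter to its input; the question does not say how the parameter is used, so the addition is only a placeholder.

```python
import tensorflow as tf

class FixedParam(tf.keras.layers.Layer):
    """Hypothetical layer holding one trainable parameter of fixed shape (1, 768)."""

    def build(self, input_shape):
        # input_shape is still passed in by Keras, but it is not needed here
        # because the parameter shape does not depend on the input.
        self.p = self.add_weight(shape=(1, 768),
                                 initializer="zeros",
                                 trainable=True, name="p")

    def call(self, inputs):
        # Assumed usage: broadcast the (1, 768) parameter over the batch dimension.
        return inputs + self.p

layer = FixedParam()
out = layer(tf.ones((5, 768)))

print(layer.p.shape, out.shape)
```

Since the shape is known up front, creating the weight in `__init__` would likely work as well; build mainly matters when a shape depends on the input.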

I'mahdi
  • Thank you very much. I want to know when `build` is called because I don't know what is fed into the `input_shape` argument; I don't see it specified anywhere. So does that mean the layer automatically detects that the input shape is (5, 2) and passes it as the `input_shape` argument? What if there are multiple inputs, for example a layer with inputs x1 and x2? – R__ Jun 18 '22 at 14:22
  • @LangREN, only once, when the model is built. Here, in the first epoch of model.fit(), when data is first passed to the model, `build` is called and receives `(5,2)`. – I'mahdi Jun 18 '22 at 14:24
  • @LangREN, you can run the code from your question with just `tf.print('calling build method')` added inside `def build(self, input_shape)`. Nothing is printed, because the layer is never used, so it is never built and never receives data. – I'mahdi Jun 18 '22 at 14:28
  • @LangREN, OK. You cannot feed this data directly to `SimpleDense`, because this layer only accepts `1D` data in each row. You can do one of two things: reshape the data from (25, 6*4) with reshape (-1, 24), or change your custom layer to define a 2D weight and bias. `(25, 24) * (24,4) -> (25, 4)` – I'mahdi Jun 18 '22 at 14:46
  • Can I ask about one small issue with the `build` method? I think it is still strongly related to my original question. When I add a line `print(input_shape)` as the first line of the `build` method and change the `call` method to accept two inputs x1 and x2, it only prints the input shape of x1 and seems to ignore the input shape of x2. I wonder whether this is enough to justify a separate question. – R__ Jun 18 '22 at 15:51
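On the multiple-input question raised in the comments: one common pattern, sketched here as an assumption rather than as the answerer's method, is to pass both tensors as a single list, so that build receives one shape per tensor. The names `TwoInputDense`, `w1`, `w2`, and `shape_log` are hypothetical.

```python
import tensorflow as tf

# Hypothetical log kept outside the layer so we can inspect it afterwards.
shape_log = []

class TwoInputDense(tf.keras.layers.Layer):
    def __init__(self, units=4):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # With a list of tensors as input, input_shape holds one shape per tensor.
        shape_log.append([tuple(s) for s in input_shape])
        self.w1 = self.add_weight(shape=(input_shape[0][-1], self.units),
                                  initializer="random_normal",
                                  trainable=True, name="w1")
        self.w2 = self.add_weight(shape=(input_shape[1][-1], self.units),
                                  initializer="random_normal",
                                  trainable=True, name="w2")

    def call(self, inputs):
        x1, x2 = inputs  # unpack the list passed by the caller
        return tf.matmul(x1, self.w1) + tf.matmul(x2, self.w2)

layer = TwoInputDense(4)
out = layer([tf.ones((5, 2)), tf.ones((5, 3))])

print(shape_log, out.shape)
```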