
I came across the following code and was wondering what exactly keras.layers.concatenate does in this case.

Best Guess:

  1. In fire_module(), y learns from every individual pixel (kernel_size=1)
  2. y1 learns from every pixel of y's activation map (kernel_size=1)
  3. y3 learns from 3x3 areas of y's activation map (kernel_size=3)
  4. concatenate puts y1 and y3 together, meaning the total number of filters is now the sum of the filters in y1 and y3 (the quick shape check after the code below illustrates this)
  5. This concatenation combines learning based on every pixel and learning based on 3x3 areas, both applied to a previous activation map that was itself learned from every pixel, making the model better?

Any help is greatly appreciated.

from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization,
                                     Activation, MaxPooling2D, concatenate)

bnmomemtum = 0.9  # batch-norm momentum; the value is not shown in the snippet, 0.9 is assumed

def fire(x, squeeze, expand):
    # squeeze: 1x1 conv that reduces the channel count to `squeeze`
    y  = Conv2D(filters=squeeze, kernel_size=1, activation='relu', padding='same')(x)
    y  = BatchNormalization(momentum=bnmomemtum)(y)
    # expand: two parallel branches on y, a 1x1 and a 3x3 conv, each with expand//2 filters
    y1 = Conv2D(filters=expand//2, kernel_size=1, activation='relu', padding='same')(y)
    y1 = BatchNormalization(momentum=bnmomemtum)(y1)
    y3 = Conv2D(filters=expand//2, kernel_size=3, activation='relu', padding='same')(y)
    y3 = BatchNormalization(momentum=bnmomemtum)(y3)
    # stack the two branches along the channel axis
    return concatenate([y1, y3])

def fire_module(squeeze, expand):
    return lambda x: fire(x, squeeze, expand)

x = Input(shape=[144, 144, 3])
y = BatchNormalization(center=True, scale=False)(x)
y = Activation('relu')(y)
y = Conv2D(kernel_size=5, filters=16, padding='same', use_bias=True, activation='relu')(x)
y = BatchNormalization(momentum=bnmomemtum)(y)

y = fire_module(16, 32)(y)
y = MaxPooling2D(pool_size=2)(y)
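
For reference, here is the quick shape check mentioned in point 4 above (my own sketch; it just wraps the fire() block from the snippet in a tiny model and prints the output shape):

import tensorflow as tf

inp = tf.keras.Input(shape=(144, 144, 3))
out = fire(inp, squeeze=16, expand=32)        # two branches of 16 filters each
print(tf.keras.Model(inp, out).output_shape)  # (None, 144, 144, 32), i.e. 16 + 16 channels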

Edit:

To be a little more specific, why not have this:

# why not this?
def fire(x, squeeze, expand):
    y  = Conv2D(filters=squeeze, kernel_size=1, activation='relu', padding='same')(x)
    y  = BatchNormalization(momentum=bnmomemtum)(y)
    y = Conv2D(filters=expand//2, kernel_size=1, activation='relu', padding='same')(y)
    y = BatchNormalization(momentum=bnmomemtum)(y)
    y = Conv2D(filters=expand//2, kernel_size=3, activation='relu', padding='same')(y)
    y = BatchNormalization(momentum=bnmomemtum)(y)
    return y
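
If I rename that second definition to fire_seq (the name is just for illustration, so both versions stay in scope), a quick shape comparison shows the practical difference: the sequential variant ends with a single Conv2D of expand//2 filters, so it outputs half as many channels as the concatenated fire() above:

import tensorflow as tf

# fire_seq is assumed to be the sequential variant above, renamed from `fire` for this check
inp = tf.keras.Input(shape=(144, 144, 3))
print(tf.keras.Model(inp, fire(inp, 16, 32)).output_shape)      # (None, 144, 144, 32)
print(tf.keras.Model(inp, fire_seq(inp, 16, 32)).output_shape)  # (None, 144, 144, 16)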

nicgh3

1 Answer


I'm citing @parsethis from this Stack Overflow question, where he explained concatenation. This is what it does if a is concatenated with b to give c (the results are joined together):

    a        b         c
a b c   g h i    a b c g h i
d e f   j k l    d e f j k l

The documentation says that it simply returns a tensor containing the concatenation of all inputs, provided the inputs share the same shape except along the concatenation axis (i.e. the same height and width when concatenating along the default channel axis).
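
As a minimal standalone sketch of that behaviour (my own example with made-up shapes, using tf.keras):

import tensorflow as tf

# two feature maps sharing batch, height and width, but with different channel counts
a = tf.zeros((1, 4, 4, 16))
b = tf.zeros((1, 4, 4, 8))
c = tf.keras.layers.concatenate([a, b])  # default axis=-1, i.e. the channel axis
print(c.shape)                           # (1, 4, 4, 24)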

What happened in your case seems like this:

Y 
 \
  Y1----
   \    |
    Y3  Y1

I hope I was clear enough.

Wajd Meskini
  • Thanks for the explanation. By the way, isn't this what happened in my case: both y1 and y3 branch from y, then get concatenated as you described and returned into y? – nicgh3 Feb 24 '20 at 15:57
  • Honestly, that's what I think happened. Y1 branches off from Y, Y3 branches off from Y1, then Y1's output (which is also the input of Y3) gets appended to the output of Y3. Some kind of skip connection, if you will. As far as I understand, it helps the model remember what happened in Y1 in case it grows too deep and forgets after Y3. – Wajd Meskini Feb 24 '20 at 20:05
  • 1
    I think I see the confusion. Let me try to explain it this way. Y = fire_module()(Y). Inside fire(), y learns from Y, y1 branches from y, y3 branches from y, y3 is appended to y1 and this appended layer goes into Y. So every time fire_module is called, layer learns pointwise once(y), the average of learning pointwise(y1) and in area of 3(y3) is added to the first pointwise learning(y). – nicgh3 Feb 25 '20 at 11:43