I am trying to feed a pixel vector to a convolutional neural network (CNN), where the pixel vector comes from image data such as the CIFAR-10 dataset. Before feeding the pixel vector to the CNN, I need to expand it with a Maclaurin series. I figured out how to expand a tensor with one dimension, but I am not able to get it right for a tensor with more than one dimension. Can anyone give me ideas on how to apply the Maclaurin series expansion that works for a 1-D tensor to a tensor with more dimensions? Is there a heuristic approach to implement this in either TensorFlow or Keras? Any thoughts?

Maclaurin series on a CNN (computational graph; image not included):

I figured out a way of expanding a tensor with one dimension using a Maclaurin series. Here is what the from-scratch implementation looks like:

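import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

def cnn_taylor(input_dim, approx_order=2):
    x = Input((input_dim,))

    def pwr(x, approx_order):
        x = x[..., None]  # Shape=(batch_size, input_dim, 1)
        x = tf.tile(x, multiples=[1, 1, approx_order + 1])  # Shape=(batch_size, input_dim, approx_order+1)
        pw = tf.range(0, approx_order + 1, dtype=tf.float32)  # Powers 0..approx_order
        x_p = tf.pow(x, pw)  # Each x_i raised to every power
        x_p = x_p[..., None]  # Shape=(batch_size, input_dim, approx_order+1, 1)
        return x_p

    x_p = Lambda(lambda x: pwr(x, approx_order))(x)
    h = Dense(1, use_bias=False)(x_p)  # One weight, shared by every term

    def cumu_sum(h):
        h = tf.squeeze(h, axis=-1)
        s = tf.cumsum(h, axis=-1)  # Running sum over the power axis
        s = s[..., None]
        return s

    S = Lambda(cumu_sum)(h)
    return Model(inputs=x, outputs=S)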

The implementation above is a sketch of how to expand a CNN with a Taylor expansion using a 1-D tensor. I am wondering how to do the same thing for a multi-dimensional tensor (i.e., dim=3).

If I want to expand a CNN with a Taylor expansion of approximation order 2, where the input is a pixel vector from an RGB image, how can I accomplish this easily in TensorFlow? Any thoughts? Thanks.

Jerry07
  • Correct me if I am wrong, but would the idea be to apply the "Taylor series expansion" independently for each RGB channel? That is, would it be an acceptable solution to flatten the image for each channel and then apply 3 independent transformations (one per channel)? – rvinas Apr 19 '20 at 19:35
  • @rvinas I just updated the computational graph of the Taylor expansion of the CNN for a tensor with 1 dim. I am wondering how to implement it if a tensor with 3 dims is used for the Taylor expansion. Your valuable effort and time will be appreciated. I will assign a bounty for your kind help. Thank you – Jerry07 Apr 19 '20 at 21:00
  • @rvinas Hi Ramon, I think creating a flattened input of all RGB pixels and then using `taylor_expansion_network` is not what I have in mind. Can we iterate over the RGB channels for the Taylor expansion, so that we have 6 different expansion neurons with an approximation order of 2, which would have 6 different weights, and then get the cumulative sum of them? Is that possible? Thank you – Jerry07 Apr 19 '20 at 21:36

1 Answer

If I understand correctly, each x in the provided computational graph is just a scalar (one channel of a pixel). In this case, in order to apply the transformation to each pixel, you could:

  1. Flatten the 4D (b, h, w, c) input coming from the convolutional layer into a tensor of shape (b, h*w*c).
  2. Apply the transformation to the resulting tensor.
  3. Undo the reshaping to get back a 4D tensor of shape (b, h, w, c), to which the "Taylor expansion" has been applied element-wise.

This could be achieved as follows:

shape_cnn = h.shape  # Shape=(bs, h, w, c)
flat_dim = h.shape[1] * h.shape[2] * h.shape[3]
h = tf.reshape(h, (-1, flat_dim))
taylor_model = taylor_expansion_network(input_dim=flat_dim, max_pow=approx_order)
h = taylor_model(h)
h = tf.reshape(h, (-1, shape_cnn[1], shape_cnn[2], shape_cnn[3]))

NOTE: I am borrowing the function taylor_expansion_network from this answer.
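
To illustrate why the flatten, transform, reshape route treats every RGB value independently, here is a minimal NumPy sketch. It is not the borrowed `taylor_expansion_network`: the helper `power_cumsum` is a hypothetical stand-in that only does the power and cumulative-sum steps and leaves out any learned coefficients, and the batch shape is an assumed toy example. It checks that going through a flat vector gives the same result as applying the element-wise transformation directly to the 4D tensor.

```python
import numpy as np

def power_cumsum(x, max_pow=2):
    """Hypothetical helper: powers x**0..x**max_pow on a new last axis,
    followed by a running sum over that axis. No learned weights."""
    pows = np.arange(max_pow + 1, dtype=x.dtype)
    x_p = x[..., None] ** pows          # broadcasting works for any rank
    return np.cumsum(x_p, axis=-1)

rng = np.random.default_rng(0)
imgs = rng.random((2, 4, 4, 3), dtype=np.float32)  # assumed toy batch (b, h, w, c)

# Route 1: flatten to (b, h*w*c), transform, reshape back.
flat = imgs.reshape(imgs.shape[0], -1)
via_flat = power_cumsum(flat).reshape(*imgs.shape, -1)

# Route 2: transform the 4D tensor directly.
direct = power_cumsum(imgs)

print(via_flat.shape)                 # (2, 4, 4, 3, 3)
print(np.allclose(via_flat, direct))  # True
```

Because each output entry depends on exactly one input entry, the reshape only changes how the entries are indexed, not their values. The same broadcasting applies to `tf.pow`, so the power step itself does not strictly need the flattening.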


UPDATE: I still don't clearly understand the end goal, but perhaps this update brings us closer to the desired output. I modified the taylor_expansion_network to apply the first part of the pipeline to RGB images of shape (width, height, nb_channels=3), returning a tensor of shape (width, height, nb_channels=3, max_pow+1):

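import tensorflow as tf
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

def taylor_expansion_network_2(width, height, nb_channels=3, max_pow=2):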
    input_dim = width * height * nb_channels

    x = Input((width, height, nb_channels,))
    h = tf.reshape(x, (-1, input_dim))

    # Raise input x_i to power p_i for each i in [0, max_pow].
    def raise_power(x, max_pow):
        x_ = x[..., None]  # Shape=(batch_size, input_dim, 1)
        x_ = tf.tile(x_, multiples=[1, 1, max_pow + 1])  # Shape=(batch_size, input_dim, max_pow+1)
        pows = tf.range(0, max_pow + 1, dtype=tf.float32)  # Shape=(max_pow+1,)
        x_p = tf.pow(x_, pows)  # Shape=(batch_size, input_dim, max_pow+1)
        return x_p

    h = raise_power(h, max_pow)

    # Compute s_i for each i in [0, max_pow]
    h = tf.cumsum(h, axis=-1)  # Shape=(batch_size, input_dim, max_pow+1)

    # Get the input format back
    h = tf.reshape(h, (-1, width, height, nb_channels, max_pow+1))  # Shape=(batch_size, w, h, nb_channels, max_pow+1)

    # Return Taylor expansion model
    model = Model(inputs=x, outputs=h)
    model.summary()
    return model

In this modified model, the last step of the pipeline, namely the sum of w_i * s_i for each i, is not applied. Now, you can use the resulting tensor of shape (width, height, nb_channels=3, max_pow+1) in any way you want.
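
As one possible way to apply that last step, the sketch below contracts the trailing power axis with a separate weight for every pixel, channel, and power, which matches the "weights are not shared" case from the comments. It is written in NumPy under stated assumptions: `weighted_sum_over_powers` is a hypothetical helper, and the constant arrays `s` and `w` are placeholders for the model output and for weights that would be trainable in a real Keras model.

```python
import numpy as np

def weighted_sum_over_powers(s, w):
    """Hypothetical helper: out[b,h,w,c] = sum_p w[h,w,c,p] * s[b,h,w,c,p]."""
    return np.einsum("bhwcp,hwcp->bhwc", s, w)

b, width, height, c, max_pow = 2, 4, 4, 3, 2

# Placeholder for the output of taylor_expansion_network_2 (all ones here).
s = np.ones((b, width, height, c, max_pow + 1), dtype=np.float32)

# Placeholder weights: one per (pixel, channel, power), not shared across pixels.
w = np.full((width, height, c, max_pow + 1), 0.5, dtype=np.float32)

out = weighted_sum_over_powers(s, w)
print(out.shape)  # (2, 4, 4, 3)
```

If the weights should instead be shared across pixels, `w` could have shape `(c, max_pow + 1)` and the subscripts would become `"bhwcp,cp->bhwc"`.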

rvinas
  • Hi, I think flattening the RGB channels would not be a good idea. Do you think we can iterate over the RGB channels in a for loop and then apply taylor_expansion_network? Thanks – Jerry07 Apr 19 '20 at 21:51
  • What I was trying is this: first, I want to do the Taylor expansion on each RGB channel (i.e., with approx_order=2: raise input x_i to power p_i, multiply by the alpha coefficients, compute s_i for each i in [0, max_pow]), but do the cumulative sum of w_i * s_i only after iterating over the RGB channels with the Taylor expansion. How do we accomplish this? I am going to offer a bounty to give credit for your kind help. Thanks, Ramon – Jerry07 Apr 20 '20 at 04:20
  • Are the weights `w` and `alpha` shared across pixels/convolutions? I am afraid the question is still unclear to me – rvinas Apr 20 '20 at 17:45
  • Hi Ramon, to clarify, the weights and alpha will not be shared across pixels/convolutions. Let's say the Taylor expansion network for the pixel vectors from each RGB channel (approx_order=2) is going to have different coefficients and weights. Your help would be highly appreciated. I will offer a bounty for your hard effort on this question shortly. – Jerry07 Apr 20 '20 at 17:53
  • @Dan in that case, since the weights are not shared, the "Taylor expansion" with the flat vector should yield the results you want, no? Because flattening the vector allows you to apply the transformation independently to each "RGB value" of the pixel. Also, when you talk about `x_i`, `w_i` and `s_i`, are they scalars or vectors? I am still struggling to understand what you want to achieve exactly. – rvinas Apr 22 '20 at 11:19
  • I was thinking of `x_i`, `w_i` and `s_i` as vectors when we apply the Taylor expansion to the pixel vectors from each RGB channel. I do not quite understand, in code terms, how `flattening the vector allows you to apply the transformation independently to each "RGB value" of the pixel`. Would it be possible to elaborate on your point with a coding demonstration? – Jerry07 Apr 22 '20 at 13:45
  • Hi Ramon, do you think there is any better way to handle RGB images with Taylor expansion? Any further thoughts? Thanks – Jerry07 Apr 24 '20 at 14:51
  • Hi Dan, I have just updated my answer, modifying the model to return a tensor of shape `(width, height, nb_channels=3, max_pow+1)`. You can now aggregate the last dimension however you want – rvinas Apr 25 '20 at 13:01