
So, I'm trying to create a custom layer in TensorFlow 2.4.1, built around a neuron function I defined myself:

# NOTE: this is not the actual neuron I want to use,
# it's just a simple example.
def neuron(x, W, b):
    return W @ x + b

Here, the `W` and `b` it receives would be of shape `(1, x.shape[0])` and `(1, 1)` respectively, so it behaves like a single neuron of a dense layer. What I want is to build a dense layer by stacking however many of these individual neurons I need.
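To make the shape convention concrete, here is a quick throwaway example (the NumPy values are made up purely for illustration) of calling that neuron on a single column of inputs:

import numpy as np

x = np.ones((3, 1))       # column of 3 inputs
W = np.full((1, 3), 0.5)  # (1, x.shape[0]) weights for one neuron
b = np.zeros((1, 1))      # (1, 1) bias for one neuron

print(neuron(x, W, b))    # [[1.5]], the scalar output of one neuron

This is roughly how I imagine the layer class would look: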

class Layer(tf.keras.layers.Layer):
    def __init__(self, n_units=5):
        super(Layer, self).__init__() # handles standard arguments
        
        self.n_units = n_units # Number of neurons to be in the layer

    def build(self, input_shape):
        # Create weights and biases for all neurons individually
        for i in range(self.n_units):
            # Create weights and bias for ith neuron
            ...

    def call(self, inputs):
        # Compute outputs for all neurons
        ...
        # Concatenate outputs to create layer output
        ...
        return output

How can I create a layer as a stack of individual neurons (in a way that can also be trained)? I have roughly outlined the idea for the layer in the above code, but the answer doesn't need to follow that as a blueprint.


Finally: yes, I'm aware that you don't need to go about creating a dense layer in such a roundabout way (you just need one weight matrix and one bias), but in my actual use case this is necessary. Thanks!

  • *if* you need the same function for all the neurons in the same layer, why not use an `activation` [layer](https://keras.io/api/layers/activations/) and [define your own activation function](https://keras.io/api/layers/activations/#creating-custom-activations)? You just need to define a function that takes a tensor and returns a tensor (your `neuron` function), but you would avoid having to define a full new layer. Of course, if every neuron has a different activation function, this will not work; then I would not know, and would just follow the question out of interest :) – freerafiki Jun 13 '21 at 11:10
  • @elgordorafiki Interesting suggestion! However, an activation function can't use trainable parameters (if I'm not mistaken), which this "function" must have. – MartinM Jun 13 '21 at 11:19
  • actually I think it can (at least up to a certain extent), did you check [this question](https://stackoverflow.com/questions/49923958/tensorflow-custom-activation-function)? If you implement it following the TensorFlow guidelines, you can also exploit [automatic differentiation](https://alexey.radul.name/ideas/2013/introduction-to-automatic-differentiation/), which would spare you the effort of implementing the derivative of your function. But yes, most likely there is a limit, and if your function is too special/particular, you may have to implement it all yourself. – freerafiki Jun 13 '21 at 11:24
  • Hmm, I'm still not convinced it can. That question isn't using Keras Model or Layer classes, so they can really do whatever they want for an "activation function". Activation functions in normal Keras Models are simply transformations. [Here](https://datascience.stackexchange.com/a/66358/111961)'s something showing this. I think it needs to be done as a Layer. – MartinM Jun 13 '21 at 11:40
  • okay. As said, this works up to a certain extent, and if you really need to do many more advanced calculations (which I cannot tell, because the function is not in the question) you may have to implement it yourself. But then I am not sure: if you implement it following the answer showing that, and you then have a dense layer `Dense(N, activation='neuron')` with `N` as your number of units and `neuron` as your defined Layer, isn't that already the solution you are looking for? – freerafiki Jun 13 '21 at 12:00
  • Okay, so to make this work I would need to define a `DoNothing` custom layer, which would just return `N` copies of the input. Then, I could apply an activation function I define, which applies my neuron (with its trainable parameters). This is then like applying the neuron to each `DoNothing`, turning it into that neuron. However, each neuron needs a different initialization of the trainable parameters; they need to be independent. So then, this is basically "every neuron needs a different activation function". Sorry if the example I gave in the question was poor. – MartinM Jun 13 '21 at 13:16
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/233717/discussion-between-elgordorafiki-and-martinm). – freerafiki Jun 13 '21 at 14:59

1 Answer


So, as the person who asked this question, I have found a way to do it: dynamically creating the variables and operations.
First, let's redefine the neuron to use TensorFlow operations:

def neuron(x, W, b):
    return tf.add(tf.matmul(W, x), b)
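As a quick sanity check of the shapes (the numbers here are arbitrary, and `tensorflow` is assumed to be imported as `tf`): with a kernel of shape (1, 10), an input of shape (10, 6) and a bias of shape (1, 1), a single neuron returns a (1, 6) tensor, the (1, 1) bias being broadcast:

W = tf.ones([1, 10])
x = tf.ones([10, 6])
b = tf.zeros([1, 1])
print(neuron(x, W, b).shape)  # (1, 6)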

Then, let's create the layer (this uses the blueprint laid out in the question):

class Layer(tf.keras.layers.Layer):
    def __init__(self, n_units=5):
        super(Layer, self).__init__()

        self.n_units = n_units

    def build(self, input_shape):
        # Dynamically create a separate kernel and bias variable for each neuron
        for i in range(self.n_units):
            exec(f'self.kernel_{i} = self.add_weight("kernel_{i}", shape=[1, int(input_shape[0])])')
            exec(f'self.bias_{i} = self.add_weight("bias_{i}", shape=[1, 1])')

    def call(self, inputs):
        # Apply each neuron to the inputs, then concatenate the individual outputs
        for i in range(self.n_units):
            exec(f'out_{i} = neuron(inputs, self.kernel_{i}, self.bias_{i})')
        return eval(f'tf.concat([{", ".join([ f"out_{i}" for i in range(self.n_units) ])}], axis=0)')

As you can see, we're using exec and eval to dynamically create variables and perform operations.
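For example, for i = 2 the strings passed to exec in build expand to the literal statements:

self.kernel_2 = self.add_weight("kernel_2", shape=[1, int(input_shape[0])])
self.bias_2 = self.add_weight("bias_2", shape=[1, 1])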
That's it! We can run a few checks to see that TensorFlow can actually use this:

# Check to see if it outputs the correct thing
layer = Layer(5) # With 5 neurons and a (10, 6) input, it should return a (5, 6) tensor
print(layer(tf.zeros([10, 6])))

# Check to see if it has the right trainable parameters
print(layer.trainable_variables)

# Check to see if TensorFlow can find the gradients
layer = Layer(5)
x = tf.ones([10, 6])
with tf.GradientTape() as tape:
    z = layer(x)
print(f"Parameter: {layer.trainable_variables[2]}")
print(f"Gradient:  {tape.gradient(z, layer.trainable_variables[2])}")

This solution works, but it's not very elegant... I wonder if there's a better way to do it, some magical TF method that can map the neuron to create a layer, but I'm too inexperienced to know for the moment. So, please answer if you have a (better) answer; I'll be happy to accept it :)
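For what it's worth, one cleaner variant I can think of (just a sketch, not heavily tested) would be to keep the per-neuron weights in plain Python lists instead of generating attribute names with exec/eval; add_weight still registers every variable with the layer, so it should train in exactly the same way:

class ListLayer(tf.keras.layers.Layer):
    def __init__(self, n_units=5):
        super(ListLayer, self).__init__()
        self.n_units = n_units

    def build(self, input_shape):
        # One kernel and one bias per neuron, kept in ordinary lists
        self.kernels = [self.add_weight(f"kernel_{i}", shape=[1, int(input_shape[0])])
                        for i in range(self.n_units)]
        self.biases = [self.add_weight(f"bias_{i}", shape=[1, 1])
                       for i in range(self.n_units)]

    def call(self, inputs):
        # Apply each neuron and stack the per-neuron outputs, as before
        outs = [neuron(inputs, W, b) for W, b in zip(self.kernels, self.biases)]
        return tf.concat(outs, axis=0)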
