Keras defines separate activation layers for the most common use cases, including LeakyReLU, ThresholdedReLU, and ReLU (which is a generic version that supports all ReLU parameters), among others. See the full documentation here: https://keras.io/api/layers/activation_layers
Example usage with the Sequential model:
import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=(10,)))  # 10 input features
model.add(tf.keras.layers.Dense(16))
model.add(tf.keras.layers.LeakyReLU(alpha=0.2))  # activation as its own layer
model.add(tf.keras.layers.Dense(1))
model.add(tf.keras.layers.Activation(tf.keras.activations.sigmoid))
model.compile(optimizer='adam', loss='binary_crossentropy')
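The generic ReLU layer works the same way and exposes the standard ReLU parameters directly (the values below are only illustrative):

# A bounded, leaky, shifted ReLU via the generic layer's parameters
model.add(tf.keras.layers.ReLU(max_value=6.0, negative_slope=0.01, threshold=0.0))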
If the activation function you want to use is unavailable as a predefined class, you could use a plain lambda expression, as suggested by @Thomas Jungblut:
from tensorflow.keras.layers import Activation
model.add(Activation(lambda x: tf.keras.activations.relu(x, alpha=0.2)))
However, as noted by @leenremm in the comments, this fails when trying to save or load the model. As suggested, you could instead wrap the expression in a Lambda layer as follows:

from tensorflow.keras.layers import Lambda
model.add(Lambda(lambda x: tf.keras.activations.relu(x, alpha=0.2)))
However, the Lambda documentation includes the following warning:

WARNING: tf.keras.layers.Lambda layers have (de)serialization limitations!

The main reason to subclass tf.keras.layers.Layer instead of using a Lambda layer is saving and inspecting a Model. Lambda layers are saved by serializing the Python bytecode, which is fundamentally non-portable. They should only be loaded in the same environment where they were saved. Subclassed layers can be saved in a more portable way by overriding their get_config method. Models that rely on subclassed Layers are also often easier to visualize and reason about.
As such, the best method for activations not already provided by a layer is to subclass tf.keras.layers.Layer instead. This should not be confused with subclassing object and overriding __call__ as done in @Anonymous Geometer's answer, which is the same as using a lambda without the Lambda layer.
Since my use case is covered by the provided layer classes, I'll leave it up to the reader to implement this method. I am making this answer a community wiki in the event anyone would like to provide an example below.
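For anyone who lands here, a minimal sketch of the subclassing approach might look like the following. The class name CustomLeakyReLU and its alpha parameter are purely illustrative (it mirrors the lambda above), not part of any existing API:

import tensorflow as tf

class CustomLeakyReLU(tf.keras.layers.Layer):
    """Leaky ReLU as a subclassed layer; portable because of get_config."""

    def __init__(self, alpha=0.2, **kwargs):
        super().__init__(**kwargs)
        self.alpha = alpha

    def call(self, inputs):
        return tf.keras.activations.relu(inputs, alpha=self.alpha)

    def get_config(self):
        # Serializes the layer's arguments instead of Python bytecode,
        # which is what makes the saved model portable
        config = super().get_config()
        config.update({'alpha': self.alpha})
        return config

The layer can then be added like any other (model.add(CustomLeakyReLU(alpha=0.2))), and a saved model can be restored by passing the class via custom_objects:

restored = tf.keras.models.load_model(
    'model.h5', custom_objects={'CustomLeakyReLU': CustomLeakyReLU})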