I am building a simple Conv2D + dynamic reservoir model (a custom recurrent layer with random, fixed connections that only outputs the node states of the last time step). The reservoir is written as a Lambda layer implementing the simple update equation shown in the code, and the model itself can be constructed by Keras.
I want the model to be trained to classify image sequences with a given batch size (e.g. batch_size = 2). Ideally Keras should allocate batches of size 2x3x8x8x1, since the dataset has size 10x3x8x8x1. The TimeDistributed Conv2D layer should return 2x3x6x6x3, the custom flatten layer should flatten the non-time dimensions and return 2x3x108, the reservoir layer with 108 nodes should return 2x108, and the final read-out layer should return 2x5.
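For reference, here is the shape flow I expect, sketched with plain NumPy placeholders (just to illustrate the sizes; the actual model is below):

import numpy as np
batch    = np.zeros((2, 3, 8, 8, 1))     # one batch of image sequences
conv_out = np.zeros((2, 3, 6, 6, 3))     # after the TimeDistributed Conv2D (3x3 kernels, 'valid' padding)
flat_out = conv_out.reshape((2, 3, -1))  # flatten the non-time dimensions -> (2, 3, 108)
res_out  = np.zeros((2, 108))            # reservoir output: node states at the last time step
read_out = np.zeros((2, 5))              # dense softmax read-out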
import keras
from keras.layers import Dense, Convolution2D, Activation, Lambda
from keras.layers.wrappers import TimeDistributed
from keras.models import Sequential
from keras import backend as K
import tensorflow as tf
import numpy as np
# Flatten the non-time dimensions
def flatten_tstep(x_in):  # Input shape (None, 3, 6, 6, 3), Output shape (None, 3, 108)
    shape = K.shape(x_in)  # dynamic tensor shape
    # Keep the batch and time axes; collapse the rest (6*6*3 = 108), hence the product over shape[2:]
    x_out = K.reshape(x_in, [shape[0], shape[1], K.prod(shape[2:])])
    return x_out
def flatten_tstep_shape(x_shape):
    n_batch, n_tsteps, n_rows, n_cols, n_filters = x_shape
    output_shape = (n_batch, n_tsteps, n_rows * n_cols * n_filters)  # Flatten
    return output_shape
# Simple Reservoir
# Using a single sample as an example: the input (size 3x108) feeds 3 time steps
# into the 108 nodes of the reservoir. The node states are stat_neuron (size 1x108):
#   for t in range(3):
#       stat_neuron = stat_neuron * decay_coefficient + input[t, :] + recurrent_connection_matrix @ stat_neuron
# This layer returns the node states at the last time step.
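# For example, with c = C_DECAY and W = RC_MATRIX, the three iterations unroll to
# (s0 being the all-zero initial state):
#   s1 = c*s0 + x[0, :] + W @ s0
#   s2 = c*s1 + x[1, :] + W @ s1
#   s3 = c*s2 + x[2, :] + W @ s2   <- s3 is what the layer returns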
def ag_reservior(x_in):  # Input shape (None, 3, 108), Output shape (None, 108)
    shape = K.shape(x_in)  # dynamic tensor shape
    stat_neuron = tf.zeros([shape[0], shape[2]])  # initialize neuron states (tf.zeros, since K.zeros needs a static shape)
    t_step = tf.constant(0)  # time counter
    t_max = shape[1]  # number of time steps, so the loop runs for t_step = 0 .. n_tstep - 1
    x = x_in

    def cond(t_step, t_max, stat_neuron, x):
        return tf.less(t_step, t_max)

    def body(t_step, t_max, stat_neuron, x):
        global RC_MATRIX, C_DECAY  # connection matrix, decay constant
        temp = tf.scalar_mul(C_DECAY, stat_neuron)  # stat_neuron * decay_coefficient
        temp = tf.add(temp, x[:, t_step, :])  # ... + input[t, :]
        temp = tf.add(temp, tf.einsum('ij,bj->bi', RC_MATRIX, stat_neuron))  # out[b, i] = sum_j RC_MATRIX[i, j] * stat_neuron[b, j]
        return [tf.add(t_step, 1), t_max, temp, x]

    res = tf.while_loop(cond, body, [t_step, t_max, stat_neuron, x])
    return res[2]
def ag_reservior_shape(x_shape):
    in_batch, in_tsteps, in_nodes = x_shape
    output_shape = (in_batch, in_nodes)
    return output_shape
#%% Parameters
n_sample = 10   # number of samples
n_tstep = 3     # number of time steps per sample
n_row = 8       # number of rows per frame
n_col = 8       # number of columns per frame
n_channel = 1   # number of channels
RC_MATRIX = K.random_normal([108, 108])  # Reservoir node recurrent connection matrix (note there are 108 nodes)
C_DECAY = K.constant(0.9)  # Reservoir node time-to-time decay coefficient
data = K.random_normal([n_sample, n_tstep, n_row, n_col, n_channel])  # Some random dataset
# data = np.random.randn(n_sample, n_tstep, n_row, n_col, n_channel)
label = np.random.randint(5, size=n_sample)  # Some random dataset labels
label_onehot = K.one_hot(label, 5)
x_train = data
y_train = label_onehot
x_test = data
y_test = label_onehot
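# Note: x_train / y_train are symbolic backend tensors at this point, not NumPy arrays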
#%% Model
model = Sequential()
# Convolution kernels: Input shape (batch_size, 3, 8, 8, 1), Output shape (batch_size, 3, 6, 6, 3)
model.add(TimeDistributed(Convolution2D(3, (3, 3), strides=1, padding='valid', use_bias=False,
                                        kernel_initializer='random_uniform', trainable=False),
                          input_shape=(n_tstep, n_row, n_col, n_channel)))
# Flatten non-time dimensions: Input shape (batch_size, 3, 6, 6, 3), Output shape (batch_size, 3, 108)
model.add(Lambda(flatten_tstep, output_shape=flatten_tstep_shape))
# Reservoir: Input shape (batch_size, 3, 108), Output shape (batch_size, 108)
model.add(Lambda(ag_reservior, output_shape=ag_reservior_shape))
# Reservoir read-out: Input shape (batch_size, 108), Output shape (batch_size, 5)
model.add(Dense(5, use_bias=False))
model.add(Activation('softmax'))
# Check model
model.summary()
#%% Training
opt = keras.optimizers.rmsprop(lr=0.01, decay=1e-6)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['acc'])
history = model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test), batch_size=2)
However, Keras said: "If your data is in the form of symbolic tensors, you should specify the steps_per_epoch argument (instead of the batch_size argument, because symbolic tensors are expected to produce batches of input data)."
Could you advise how to make Keras correctly recognize the batch size and proceed with training? (Note that the Conv2D layer and the Lambda layers are fixed; only the final Dense layer needs training.)
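(My guess is that the complaint refers to data and label_onehot being symbolic backend tensors rather than NumPy arrays. A sketch of what I could try instead is below, with keras.utils.to_categorical in place of K.one_hot, but I am not sure this is the intended fix, hence the question.)

# Hypothetical alternative: build the dataset as NumPy arrays instead of backend tensors
data = np.random.randn(n_sample, n_tstep, n_row, n_col, n_channel).astype('float32')
label = np.random.randint(5, size=n_sample)
label_onehot = keras.utils.to_categorical(label, num_classes=5)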
Thank you in advance.