I'm trying to design a pretrained CNN + LSTM network for video quality assessment. I have done these steps:
- Extracting the frames and their quality scores.
- Resizing the frames to 224×224.
I want to use the code posted here. It was written for video classification, and I am keen to change some lines to turn it into a regression problem.
I have a question and would appreciate help with it. The input of my network is a number of frames, for example 70 frames, and the output of the network should be the average of the input frames' scores. Can anybody help me rework this code for that purpose?
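To make the intended target concrete, here is a minimal NumPy sketch of how one training pair could be built from a clip's frames and per-frame scores. The helper name `make_clip_sample` and the dummy data are hypothetical and only illustrate the shapes; real frames and scores would come from the extraction step described earlier.

```python
import numpy as np


def make_clip_sample(frames, frame_scores):
    """Hypothetical helper: stack frames into one clip and average their scores."""
    # Shape becomes (n_frames, rows, columns, channels).
    clip = np.stack(frames).astype("float32")
    # One scalar label per clip: the mean of the per-frame quality scores.
    target = float(np.mean(frame_scores))
    return clip, target


# Made-up stand-in data: 70 blank 224x224 RGB frames with dummy scores.
dummy_frames = [np.zeros((224, 224, 3), dtype="uint8") for _ in range(70)]
dummy_scores = np.linspace(1.0, 5.0, num=70)

clip, target = make_clip_sample(dummy_frames, dummy_scores)
print(clip.shape, target)
```

Stacking several such clips would give an input batch of shape `(n_videos, 70, 224, 224, 3)` and a target vector of shape `(n_videos,)`. The classification code to be adapted follows.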
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Input, LSTM, TimeDistributed
from keras.optimizers import Nadam
frames = 70    # frames per video clip
channels = 3   # RGB
rows = 224
columns = 224
classes = 3    # number of classes in the original classification task
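# One sample is a clip of `frames` RGB images.
video = Input(shape=(frames, rows, columns, channels))
# Pretrained VGG16 without its classifier head, pooled to one feature vector per frame.
cnn_base = VGG16(input_shape=(rows, columns, channels), weights="imagenet", include_top=False)
cnn_out = GlobalAveragePooling2D()(cnn_base.output)
cnn = Model(inputs=cnn_base.input, outputs=cnn_out)
cnn.trainable = False  # freeze the pretrained weights
# Apply the CNN to every frame, then summarize the sequence with an LSTM.
encoded_frames = TimeDistributed(cnn)(video)
encoded_sequence = LSTM(256)(encoded_frames)
hidden_layer = Dense(units=1024, activation="relu")(encoded_sequence)
# Classification head: softmax over `classes` outputs.
outputs = Dense(units=classes, activation="softmax")(hidden_layer)
model = Model([video], outputs)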
optimizer = Nadam(lr=0.002,
                  beta_1=0.9,
                  beta_2=0.999,
                  epsilon=1e-08,
                  schedule_decay=0.004)
model.compile(loss="categorical_crossentropy",
              optimizer=optimizer,
              metrics=["categorical_accuracy"])
model.summary()
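
One possible way to turn the classifier above into a regressor is sketched here; it is a rough starting point under stated assumptions, not a validated solution. The changes are a single linear output unit instead of the softmax, and a regression loss. Assumptions not in the original code: the helper name `build_vqa_regressor` is hypothetical, `pooling="avg"` on `VGG16` stands in for the separate `GlobalAveragePooling2D` layer, mean squared error and mean absolute error are arbitrary choices of loss and metric, and a recent Keras release is assumed, where `Nadam` takes `learning_rate`.

```python
from keras.applications.vgg16 import VGG16
from keras.layers import Dense, Input, LSTM, TimeDistributed
from keras.models import Model
from keras.optimizers import Nadam


def build_vqa_regressor(frames=70, rows=224, columns=224, channels=3,
                        weights="imagenet"):
    """Hypothetical helper: frozen VGG16 per frame -> LSTM -> one score."""
    video = Input(shape=(frames, rows, columns, channels))

    # pooling="avg" makes VGG16 end in global average pooling, so each
    # frame is reduced to a single 512-dimensional feature vector.
    cnn = VGG16(input_shape=(rows, columns, channels), weights=weights,
                include_top=False, pooling="avg")
    cnn.trainable = False  # keep the pretrained filters fixed

    encoded_frames = TimeDistributed(cnn)(video)   # (batch, frames, 512)
    encoded_sequence = LSTM(256)(encoded_frames)   # (batch, 256)
    hidden = Dense(1024, activation="relu")(encoded_sequence)
    # Regression head: one linear unit instead of a softmax over classes.
    score = Dense(1, activation="linear")(hidden)

    model = Model(inputs=video, outputs=score)
    # Assumed choices: MSE as the loss, MAE as a readable error metric.
    model.compile(loss="mse",
                  optimizer=Nadam(learning_rate=0.002),
                  metrics=["mae"])
    return model
```

With this layout, training would use inputs of shape `(n_videos, 70, 224, 224, 3)` and one target per video, namely the mean of that video's frame scores. If the scores live on a bounded scale, rescaling them to a small range before training may help, though that is a guess rather than something tested here.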