I am trying to configure an RNN in order to predict 5 different types of text entities. I am using the following configuration:
import org.deeplearning4j.nn.conf.GradientNormalization;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.Updater;
import org.deeplearning4j.nn.conf.WorkspaceMode;
import org.deeplearning4j.nn.conf.layers.GravesLSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .seed(seed)
    .iterations(100)
    .updater(Updater.ADAM) // To configure: .updater(Adam.builder().beta1(0.9).beta2(0.999).build())
    .regularization(true).l2(1e-5)
    .weightInit(WeightInit.XAVIER)
    .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue).gradientNormalizationThreshold(1.0)
    .learningRate(2e-2)
    .trainingWorkspaceMode(WorkspaceMode.SEPARATE).inferenceWorkspaceMode(WorkspaceMode.SEPARATE) // https://deeplearning4j.org/workspaces
    .list()
    .layer(0, new GravesLSTM.Builder().nIn(500).nOut(3)
        .activation(Activation.TANH).build())
    .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT) // MCXENT + softmax for classification
        .activation(Activation.SOFTMAX)
        .nIn(3).nOut(5).build())
    .pretrain(false).backprop(true).build();

MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();
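For context, the network is fed the standard DL4J time-series layout. Here is a minimal sketch of how one training batch is shaped; miniBatchSize, timeSeriesLength, and the Nd4j.zeros calls are placeholders for illustration, not my real preprocessing:

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.factory.Nd4j;

int miniBatchSize = 32;      // illustrative value
int timeSeriesLength = 100;  // illustrative value
// Features: [miniBatchSize, nIn, timeSeriesLength], with nIn = 500 to match the GravesLSTM layer
INDArray features = Nd4j.zeros(new int[]{miniBatchSize, 500, timeSeriesLength});
// Labels: one-hot per time step, [miniBatchSize, nOut, timeSeriesLength], with nOut = 5 classes
INDArray labels = Nd4j.zeros(new int[]{miniBatchSize, 5, timeSeriesLength});
DataSet batch = new DataSet(features, labels);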
I train the network and then evaluate it, and both steps work as expected.
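Roughly, the training and evaluation step looks like this, reusing the placeholder batch from above (the epoch count is illustrative):

import org.deeplearning4j.eval.Evaluation;
import org.nd4j.linalg.api.ndarray.INDArray;

int nEpochs = 10;  // illustrative value
for (int epoch = 0; epoch < nEpochs; epoch++) {
    net.fit(batch);  // in my code this runs over the full training set
}

Evaluation eval = new Evaluation(5);  // 5 target classes
INDArray predicted = net.output(batch.getFeatures(), false);  // shape [miniBatchSize, 5, timeSeriesLength]
eval.evalTimeSeries(batch.getLabels(), predicted);  // per-time-step evaluation
System.out.println(eval.stats());

Nevertheless, when I use: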
int[] prediction = net.predict(features);
it sometimes returns unexpected predictions. It returns correct predictions such as 1, 2, ..., 5, but sometimes it returns numbers such as 9, 14, or 12. These numbers do not correspond to any recognised prediction/label.
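For reference, this is roughly how I inspect the result; the printed values are examples of what I see, and the shape comment reflects my understanding of the input layout:

import java.util.Arrays;

System.out.println(Arrays.toString(features.shape()));  // 3D time-series input, e.g. [32, 500, 100]
System.out.println(Arrays.toString(prediction));        // e.g. [2, 5, 9, 14, 12, ...] -- values such as 9 or 14 are outside the label range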
Why does this configuration return unexpected outputs?