
Hi,

I have a sequence that looks like this (plus more zeros):

[0, 0.66, 0, 0.66, 0, 0, 0, 0.55, 0, 0, 0, 3.18, 0, 0, 2, 0.6, 0]

I have the following code in Python, along the same lines as:

Pybrain time series prediction using LSTM recurrent nets

from pybrain.datasets import SequentialDataSet
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import LSTMLayer
from pybrain.supervised import RPropMinusTrainer
from itertools import cycle
from sys import stdout  # needed for stdout.flush() in the training loop

# `train` is the sequence shown above.
ds = SequentialDataSet(1, 1)
# Pair each sample with its successor as the target; cycle() lets the
# final sample wrap around to the start of train[1:].
for sample, next_sample in zip(train, cycle(train[1:])):
    ds.addSample(sample, next_sample)

# 1 input -> 5 LSTM cells -> 1 output
net = buildNetwork(1, 5, 1, hiddenclass=LSTMLayer, outputbias=False, recurrent=True)
trainer = RPropMinusTrainer(net, dataset=ds)
train_errors = []
EPOCHS_PER_CYCLE = 5
CYCLES = 50
EPOCHS = EPOCHS_PER_CYCLE * CYCLES
for i in range(CYCLES):
    trainer.trainEpochs(EPOCHS_PER_CYCLE)
    train_errors.append(trainer.testOnData())  # record error after each cycle
    epoch = (i + 1) * EPOCHS_PER_CYCLE
    print("\r epoch {}/{}".format(epoch, EPOCHS), end="")
    stdout.flush()

Getting the predictions on the training set:

res = []
for sample, target in ds.getSequenceIterator(0):
    r = net.activate(sample)  # one-step-ahead prediction for this sample
    res.append(r)

What I notice is that the network never predicts zero; it always outputs something around 0.10. How should I tune my network to get good results?

Thank you

mva

2 Answers


I don't have any experience with Pybrain so far (though I work with many similar ML packages), but as I see it, this is a regression task, not a classification task. The net will therefore never output exactly 0; its outputs will only get closer and closer to 0 (or to any other member of the sequence). So you can get closer to 0 than 0.1 by increasing

EPOCHS_PER_CYCLE = 5

or

CYCLES = 50

and you will probably reach 0.01, then 0.0025, and so on. Please write me if you gain further experience with this task.
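As a rough sketch, that amounts to rerunning the training loop from the question with a larger cycle count; here CYCLES = 500 is an arbitrary illustration, not a recommended value, and trainer and train_errors are the objects already defined in the question:

# Sketch: same loop as in the question, just with more cycles.
# `trainer` and `train_errors` come from the question's code.
EPOCHS_PER_CYCLE = 5
CYCLES = 500  # arbitrary illustration; tune against a held-out set
for i in range(CYCLES):
    trainer.trainEpochs(EPOCHS_PER_CYCLE)
    train_errors.append(trainer.testOnData())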

Geeocode
  • Still getting 0.1 values when changing cycles or epochs :/ – mva Jul 22 '15 at 11:55
  • What is your exact output, including the other results? – Geeocode Jul 22 '15 at 11:57
  • Here is my train: http://pastebin.com/ejWxbLrG Here is the output: http://pastebin.com/HxtCy3fT Sorry, I didn't find a better way to show you – mva Jul 22 '15 at 12:14
  • I think you have to pass normalized input data to the learner, so that its values fall between 0 and 1 (see the sketch after this thread). – Geeocode Jul 22 '15 at 12:52
  • Sorry, it doesn't improve the output. I think an LSTM is not the right model for this kind of problem. – mva Jul 22 '15 at 15:10
  • But what were the output and input? Could you send me those again so I can take a look? – Geeocode Jul 22 '15 at 15:17
  • Here is the train: http://pastebin.com/8VcybdUK here is the output: http://pastebin.com/mg6wL6e9 and here is the plot: http://i.imgur.com/aHpUocw.png – mva Jul 22 '15 at 16:04
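A minimal sketch of the normalization suggested in the comments above, assuming train is the raw sequence from the question; using max(train) as the scale factor is one arbitrary choice (any fixed positive scale would do):

from itertools import cycle
from pybrain.datasets import SequentialDataSet

# Sketch: scale the sequence into [0, 1] before building the dataset.
scale = max(train)
train_scaled = [x / scale for x in train]

ds = SequentialDataSet(1, 1)
for sample, next_sample in zip(train_scaled, cycle(train_scaled[1:])):
    ds.addSample(sample, next_sample)

# Predictions come back on the scaled axis; multiply by `scale` to undo it.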

Neural networks are known to be universal approximators: given a data set, they attempt to build an internal state that represents the data as closely as possible, in essence copying the patterns in the data through a complex formula.

The neural network will not predict exactly zero because it operates on a continuous scale, not an integer scale. Furthermore, it is most likely predicting about 0.1 on average because the majority of your targets are 0 while the rest are slightly positive, skewing the activated output towards the positive side.
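That skew is easy to check on the sample sequence posted in the question:

# Quick check: the mean of the posted sample sequence.
seq = [0, 0.66, 0, 0.66, 0, 0, 0, 0.55, 0, 0, 0, 3.18, 0, 0, 2, 0.6, 0]
print(sum(seq) / len(seq))  # 0.45 here; the "more zeros" in the real
                            # sequence pull this average down toward 0.1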

If you want to tune your network, I would recommend holding some of the last values back from training: use part of them as a validation set to find the right number of training epochs and hidden nodes, and keep the very last values as a test set to get a good estimate of the generalization error.
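A minimal sketch of such a split, assuming the full sequence is the list train from the question (the 70/15/15 proportions are an arbitrary illustration, not a recommendation):

# Sketch: chronological split of the sequence.
n = len(train)
train_set = train[:int(0.70 * n)]               # fit the network on this
valid_set = train[int(0.70 * n):int(0.85 * n)]  # pick epochs / hidden nodes
test_set  = train[int(0.85 * n):]               # estimate generalization error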

Currently it looks like you are training and testing on the same data, which will give you extremely misleading estimates of future error if you want to predict further values in the sequence.

Note: I am not sure what "cycle" and "epochs per cycle" do in your training method. It seems you are training for a few epochs, aggregating the error, and then moving on to a new cycle, as opposed to running through the data set once per epoch and reporting the average error.

A. Dev