
I've just tried PyBrain and hoped it could learn the simple linear function f(x) = 4x+1:

# Build the network
from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(1, 2, 1, bias=True)

# Add samples
from pybrain.datasets import SupervisedDataSet
ds = SupervisedDataSet(1, 1)
for x in range(1000):
    ds.addSample((x, ), (4*x+1,))

# Train with samples
from pybrain.supervised.trainers import BackpropTrainer
trainer = BackpropTrainer(net, ds)
for i in range(100):
    error = trainer.train()
    print("Error: %0.2f" % error)

# See if it remembers 
print("Test function f(x)=4x+1")
for i in range(10):
    print("f(%i) = %i" % (i, net.activate((i, ))))

But when I execute this I get horribly wrong results:

f(0) = 1962
f(1) = 1962
f(2) = 1962
f(3) = 1962
f(4) = 1962
f(5) = 1962
f(6) = 1962
f(7) = 1962
f(8) = 1962
f(9) = 1962

Why doesn't this work?

Try 2

Code:

# Build the network
from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(1, 2, 1, bias=True)

# Add samples
from pybrain.datasets import SupervisedDataSet
ds = SupervisedDataSet(1, 1)
for x in range(1000):
    ds.addSample((x, ), (4*x+1,))
    ds.addSample((x, ), (4*x+1,))

# Train with samples
from pybrain.supervised.trainers import BackpropTrainer
trainer = BackpropTrainer(net, ds, learningrate=0.001, momentum=0.99)

print("Start training")
a = trainer.trainUntilConvergence(dataset=ds,
                                  maxEpochs=100,
                                  verbose=True,
                                  continueEpochs=10,
                                  validationProportion=0.1)
print(a)
print("Finished training")

# See if it remembers
print("Test function f(x)=4x+1")
for i in range(10):
    print("f(%i) = %i" % (i, net.activate((i, ))))

Output:

Start training
train-errors: [  827395.411895  755443.286202  722073.904381  748336.584579 
[...]
695939.638106  726953.086185  736527.150008  739789.458146  736074.235677  731222.936020  675937.725009]
valid-errors: [  2479217.507148  915115.526570  703748.266402  605613.979311  592809.132542  686959.683977  612248.174146  
[...]
655606.225724  637762.864477  643013.094767  620825.083765  609063.451602  607935.458244  716839.447374]
([827395.41189463751, 755443.28620243724, 722073.90438077366, 748336.58457926242, 739568.58919456392, 725496.58682491502, 
[...]
637762.86447708646, 643013.09476733557, 620825.08376532339, 609063.45160197129, 607935.45824447344, 716839.44737418776])
Finished training
Test function f(x)=4x+1
f(0) = 1955
f(1) = 1955
f(2) = 1955
f(3) = 1955
f(4) = 1955
f(5) = 1955
f(6) = 1955
f(7) = 1955
f(8) = 1955
f(9) = 1955
  • I ran your code and after the 100 iterations of training I still had an error of 671650.61 which seems pretty huge. Did you try training the network more often? You can also try the method trainUntilConvergence() with maxEpochs = 1000 for example. See http://pybrain.org/docs/api/supervised/trainers.html#pybrain.supervised.trainers.BackpropTrainer.trainUntilConvergence – marktani Jun 10 '14 at 16:31
  • @mcwise: Yes, I've tried that. With similar error rates and results :-/ – Martin Thoma Jun 10 '14 at 16:41
  • Did you try a smaller learning rate and/or bigger dataset? Read this answer for more insight and possible problems http://stackoverflow.com/a/20486148/1176596 – marktani Jun 10 '14 at 16:42
  • I think my approach was conceptually wrong. I don't train a linear function. I train a perceptron to distinguish two datasets that are linearly separable. – Martin Thoma Jun 14 '14 at 05:59

1 Answer


Neural nets are usually trained for functions of the form f: ℝⁿ → [0, 1], i.e. for classification. This means you cannot directly take (x, f(x)) pairs of a linear function and train on them. (That, however, is exactly what linear regression is for.)
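For comparison, here is a minimal sketch (my addition, not part of the original answer) showing that ordinary least squares recovers the line exactly from the same (x, f(x)) pairs, using NumPy's polyfit:

import numpy as np

# The same samples as in the question: x in [0, 1000), f(x) = 4x + 1
xs = np.arange(1000)
ys = 4 * xs + 1

# Least-squares fit of a degree-1 polynomial: returns (slope, intercept)
slope, intercept = np.polyfit(xs, ys, deg=1)
print(slope, intercept)  # values very close to 4.0 and 1.0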

Instead, the net has to be trained to separate clusters of data points, e.g. something like:

#!/usr/bin/env python

from random import normalvariate

# Build the network
from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(2, 1, 1, bias=True)

# Add samples
from pybrain.datasets import SupervisedDataSet
ds = SupervisedDataSet(2, 1)
for i in range(100):
    x = normalvariate(3, 0.6)
    y = normalvariate(2, 1)
    ds.addSample((x, y), (0,))
for i in range(100):
    x = normalvariate(7, 0.5)
    y = normalvariate(1, 0.1)
    ds.addSample((x, y), (1,))

# Train with samples
from pybrain.supervised.trainers import BackpropTrainer
trainer = BackpropTrainer(net, ds, learningrate=0.1, momentum=0.99)

print("Start training")
print(trainer.train())
a = trainer.trainUntilConvergence(dataset=ds,
                                  maxEpochs=1000,
                                  verbose=True,
                                  continueEpochs=10,
                                  validationProportion=0.1)

print("Finished training")
print(trainer.train())

# See if it remembers
print("Test function f(x)=4x+1")
for x in range(-10,10):
    for y in range(-10,10):
        print("f(%i, %i) = %i" % (x, y, net.activate((x, y))))
print("f(%i, %i) = %i" % (3, 2, net.activate((3, 2))))
print("f(%i, %i) = %i" % (7, 1, net.activate((7, 1))))

A working example with visualization can be found on my blog.
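If you just want a quick look at the two clusters without the blog code, a minimal matplotlib sketch (my addition, not the blog's code) would be:

from random import normalvariate
import matplotlib.pyplot as plt

# Sample the same two Gaussian clusters used for training above
xs0 = [normalvariate(3, 0.6) for _ in range(100)]
ys0 = [normalvariate(2, 1) for _ in range(100)]
xs1 = [normalvariate(7, 0.5) for _ in range(100)]
ys1 = [normalvariate(1, 0.1) for _ in range(100)]

plt.scatter(xs0, ys0, label="class 0")
plt.scatter(xs1, ys1, label="class 1")
plt.legend()
plt.show()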
