
I'm building a neural network in nolearn, a library built on top of Lasagne (which is Theano-based).

I don't understand how to define my own cost function.

The output layer has only 3 neurons (classes 0, 1 and 2), and I want the network to be quite sure when it outputs 1 or 2; otherwise, if it isn't really sure about 1 or 2, it should simply give back 0.

So I came up with a cost function (it will need tuning) where the cost for 1 and 2 is double the cost for 0, but I can't figure out how to tell the network to use it.

# optimization method:
from lasagne.updates import sgd
update=sgd,
update_learning_rate=0.0001

This is the code for the update, but how do I tell SGD to use my cost function instead of its own?
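
For reference, this is roughly the kind of weighting I have in mind (just a sketch in Theano; the weight values and the function name are placeholders, not code I actually have working):

import numpy as np
import theano
import theano.tensor as T

# hypothetical per-class weights: mistakes on classes 1 and 2 cost twice as much as on class 0
class_weights = theano.shared(np.array([1.0, 2.0, 2.0], dtype=np.float32))

def double_cost_objective(predictions, targets):
    # predictions: (batch, 3) class probabilities; targets: integer labels 0, 1 or 2
    per_sample = T.nnet.categorical_crossentropy(predictions, targets)
    return per_sample * class_weights[targets]  # per-sample loss, doubled for labels 1 and 2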

EDIT: The full net code is:

def nn_loss(data, x_period, columns, num_epochs, batchsize, l_rate=0.02):
    net1 = NeuralNet(
        layers=[('input', layers.InputLayer),
                ('hidden1', layers.DenseLayer),
                ('output', layers.DenseLayer),
                ],
        # layer parameters:
        batch_iterator_train=BatchIterator(batchsize),
        batch_iterator_test=BatchIterator(batchsize),

        input_shape=(None, int(x_period*columns)),
        hidden1_nonlinearity=lasagne.nonlinearities.rectify,
        hidden1_num_units=100,  # number of units in 'hidden' layer
        output_nonlinearity=lasagne.nonlinearities.sigmoid,
        output_num_units=3,

        # optimization method:
        update=nesterov_momentum,
        update_learning_rate=5*10**(-3),
        update_momentum=0.9,
        on_epoch_finished=[
            EarlyStopping(patience=20),
        ],
        max_epochs=num_epochs,
        verbose=1,

        # Here are the important parameters for multi labels
        regression=True,
        # objective_loss_function=multilabel_objective,
        # custom_score=("validation score", lambda x, y: np.mean(np.abs(x - y)))
        )

    # Train the network
    start_time = time.time()
    net1.fit(data['X_train'], data['y_train'])

EDIT: Error when using regression=True

Got 99960 testing datasets.
# Neural Network with 18403 learnable parameters

## Layer information

  #  name       size
---  -------  ------
  0  input       180
  1  hidden1     100
  2  output        3

Traceback (most recent call last):
  File "/Users/morgado/anaconda/lib/python3.4/site-packages/theano/compile/function_module.py", line 607, in __call__
    outputs = self.fn()
ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[1] == 1, but the output's size on that axis is 3.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_nolearn_simple.py", line 272, in <module>
    main(**kwargs)
  File "train_nolearn_simple.py", line 239, in main
    nn_loss_fit = nn_loss(data, x_period, columns, num_epochs, batchsize)
  File "train_nolearn_simple.py", line 217, in nn_loss
    net1.fit(data['X_train'], data['y_train'])
  File "/Users/morgado/anaconda/lib/python3.4/site-packages/nolearn/lasagne/base.py", line 416, in fit
    self.train_loop(X, y)
  File "/Users/morgado/anaconda/lib/python3.4/site-packages/nolearn/lasagne/base.py", line 462, in train_loop
    self.train_iter_, Xb, yb)
  File "/Users/morgado/anaconda/lib/python3.4/site-packages/nolearn/lasagne/base.py", line 516, in apply_batch_func
    return func(Xb) if yb is None else func(Xb, yb)
  File "/Users/morgado/anaconda/lib/python3.4/site-packages/theano/compile/function_module.py", line 618, in __call__
    storage_map=self.fn.storage_map)
  File "/Users/morgado/anaconda/lib/python3.4/site-packages/theano/gof/link.py", line 297, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/Users/morgado/anaconda/lib/python3.4/site-packages/six.py", line 658, in reraise
    raise value.with_traceback(tb)
  File "/Users/morgado/anaconda/lib/python3.4/site-packages/theano/compile/function_module.py", line 607, in __call__
    outputs = self.fn()
ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[1] == 1, but the output's size on that axis is 3.
Apply node that caused the error: GpuElemwise{Sub}[(0, 1)](GpuElemwise{Composite{scalar_sigmoid((i0 + i1))}}[(0, 0)].0, GpuFromHost.0)
Toposort index: 22
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(200, 3), (200, 1)]
Inputs strides: [(3, 1), (1, 0)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuCAReduce{pre=sqr,red=add}{1,1}(GpuElemwise{Sub}[(0, 1)].0), GpuElemwise{Mul}[(0, 0)](GpuElemwise{Sub}[(0, 1)].0, GpuElemwise{Composite{scalar_sigmoid((i0 + i1))}}[(0, 0)].0, GpuElemwise{sub,no_inplace}.0), GpuElemwise{mul,no_inplace}(CudaNdarrayConstant{[[ 2.]]}, GpuElemwise{Composite{(inv(i0) / i1)},no_inplace}.0, GpuElemwise{Sub}[(0, 1)].0, GpuElemwise{Composite{scalar_sigmoid((i0 + i1))}}[(0, 0)].0, GpuElemwise{sub,no_inplace}.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
jbssm

3 Answers


When you instantiate your neural network, you can pass a custom loss function that you've defined previously:

import theano.tensor as T
import numpy as np
from nolearn.lasagne import NeuralNet
# I'm skipping other inputs for the sake of concision

def multilabel_objective(predictions, targets):
    epsilon = np.float32(1.0e-6)
    one = np.float32(1.0)
    pred = T.clip(predictions, epsilon, one - epsilon)
    return -T.sum(targets * T.log(pred) + (one - targets) * T.log(one - pred), axis=1)

net = NeuralNet(
    # your other parameters here (layers, update, max_epochs...)
    # here are the ones you're interested in:
    objective_loss_function=multilabel_objective,
    custom_score=("validation score", lambda x, y: np.mean(np.abs(x - y)))
    )

As you can see, it's also possible to define a custom score (using the custom_score keyword).

P. Camilleri
  • Thank you, this is what I was looking for. – jbssm Aug 25 '15 at 16:43
  • After implementing this there is an issue with the new loss function only accepting a batch size of 2. I've updated the question with the code. – jbssm Aug 28 '15 at 09:31
  • @jbssm This works fine with batch size of 128 in my code, can you include the error message in your question? Please also include X_train.shape and y_train.shape – P. Camilleri Aug 28 '15 at 09:41
  • Actually the issue is not your code. I removed the `multilabel_objective` and the issue remains. I'm doing something wrong in changing the network from classification (as it was at the beginning) to regression, and that is probably the actual problem now. – jbssm Aug 28 '15 at 09:53
  • @jbssm What are your input and output shapes? Also, why unaccept if the source of the issue is somewhere else? – P. Camilleri Aug 28 '15 at 09:54
  • No, like I said, the source of the issue is not your code, it's that I must be doing something wrong when using regression instead of classification (it worked well with classification). My input `X` is a vector of length 180, my output `y` is a vector of size 1 filled with the numbers 0 or 1. – jbssm Aug 28 '15 at 10:01
  • Because I cannot check its validity until I get the program working. – jbssm Aug 28 '15 at 11:35
  • Ok, I thought it worked the first time you accepted it. – P. Camilleri Aug 28 '15 at 11:41
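
A follow-up on the shape issue discussed in the comments: the traceback shows the targets arriving as (200, 1) while the network output is (200, 3). With regression=True, nolearn expects y to have the same shape as the output layer, so one common fix is to one-hot encode the labels before calling fit. A rough sketch (the helper name is just an illustration):

import numpy as np

def to_one_hot(y, num_classes=3):
    # turn integer labels of shape (N,) or (N, 1) into float32 targets of shape (N, num_classes)
    y = np.asarray(y).reshape(-1).astype(int)
    one_hot = np.zeros((len(y), num_classes), dtype=np.float32)
    one_hot[np.arange(len(y)), y] = 1.0
    return one_hot

# net1.fit(data['X_train'], to_one_hot(data['y_train']))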

See the following example (taken from here) that specifies its own loss function:

import lasagne
import theano.tensor as T
import theano
from lasagne.nonlinearities import softmax
from lasagne.layers import InputLayer, DenseLayer, get_output
from lasagne.updates import sgd, apply_momentum
l_in = InputLayer((100, 20))
l1 = DenseLayer(l_in, num_units=3, nonlinearity=softmax)
x = T.matrix('x')  # shp: num_batch x num_features
y = T.ivector('y') # shp: num_batch
l_out = get_output(l1, x)
params = lasagne.layers.get_all_params(l1)
loss = T.mean(T.nnet.categorical_crossentropy(l_out, y))
updates_sgd = sgd(loss, params, learning_rate=0.0001)
updates = apply_momentum(updates_sgd, params, momentum=0.9)
train_function = theano.function([x, y], updates=updates)

Coincidentally, this code also has three units in the output layer.
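
For completeness, the compiled train_function could then be called like this (a sketch with random dummy data, just to show the call signature):

import numpy as np

# dummy batch matching the (100, 20) input shape and the 3 output classes
x_batch = np.random.rand(100, 20).astype(np.float32)
y_batch = np.random.randint(0, 3, size=100).astype(np.int32)

train_function(x_batch, y_batch)  # one SGD-with-momentum update of the parameters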

user650654

I used a custom loss function in a classification task and thought I'd share it with you too. I basically wanted to put different emphasis on the training data depending on the label.

import lasagne
import theano.tensor as T
import theano

def weighted_crossentropy(predictions, targets):
    # one weight per class, so some labels contribute more to the loss than others
    weights_per_label = theano.shared(lasagne.utils.floatX([0.2, 0.4, 0.4]))
    weights = weights_per_label[targets]  # one weight per sample, shaped like targets
    loss = lasagne.objectives.aggregate(T.nnet.categorical_crossentropy(predictions, targets), weights=weights)
    return loss

net = NeuralNet(
    # layers and parameters
    objective_loss_function=weighted_crossentropy,
    # ...
    )

This is where I found how to implement it.
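
To see what this objective actually computes, you can also evaluate it on a tiny hand-made batch outside of nolearn (a quick sanity check; the numbers are made up):

import numpy as np
import theano.tensor as T

preds = T.matrix('preds')
targets = T.ivector('targets')
loss = weighted_crossentropy(preds, targets)

# two samples, three classes; the sample with label 1 is weighted twice as much (0.4) as the one with label 0 (0.2)
print(loss.eval({
    preds: np.array([[0.7, 0.2, 0.1],
                     [0.1, 0.8, 0.1]], dtype=np.float32),
    targets: np.array([0, 1], dtype=np.int32),
}))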

bsaendig
  • http://stackoverflow.com/questions/39412051/how-to-penalize-predictions-binary-cross-entropy-and-conv-nets , Thanks. I think this will help me solve it. – KenobiBastila Sep 09 '16 at 14:51