
I tried to use the DBN class from the nolearn package. Here is my code:

from nolearn.dbn import DBN
import numpy as np
from sklearn import cross_validation

fileName = 'data.csv'
fileName_1 = 'label.csv'

data = np.genfromtxt(fileName, dtype=float, delimiter = ',')
label = np.genfromtxt(fileName_1, dtype=int, delimiter = ',')

clf = DBN(
    [data, 300, 10],
    learn_rates=0.3,
    learn_rate_decays=0.9,
    epochs=10,
    verbose=1,
    )

clf.fit(data,label)
score = cross_validation.cross_val_score(clf, data, label,scoring='f1', cv=10)
print score

My data has shape (1231, 229) and my labels have shape (1231, 13); each row of the label array looks like [0 0 1 0 1 0 1 0 0 0 1 1 0]. When I ran my code, I got this error message: bad input shape (1231,13). I wonder whether one of two problems is happening here:

  1. DBN does not support multi-label classification
  2. My labels are not in a form suitable for the DBN fit function.
Michael Currie
Kun

2 Answers


As mentioned by Francisco Vargas, nolearn.dbn is deprecated and you should use nolearn.lasagne instead (if you can).

If you want to do multi-label classification in lasagne, then you should set your regression parameter to True, define a validation score and a custom loss.

Here's an example:

import numpy as np
import theano.tensor as T
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from nolearn.lasagne import BatchIterator
from lasagne import nonlinearities

# custom loss: multi label cross entropy
def multilabel_objective(predictions, targets):
    epsilon = np.float32(1.0e-6)
    one = np.float32(1.0)
    pred = T.clip(predictions, epsilon, one - epsilon)
    return -T.sum(targets * T.log(pred) + (one - targets) * T.log(one - pred), axis=1)


net = NeuralNet(
    # customize "layers" to represent the architecture you want
    # here I took a dummy architecture
    layers=[(layers.InputLayer, {"name": 'input', 'shape': (None, 1, 229, 1)}),

            (layers.DenseLayer, {"name": 'hidden1', 'num_units': 20}),
            (layers.DenseLayer, {"name": 'output', 'nonlinearity': nonlinearities.sigmoid, 'num_units': 13})], #because you have 13 outputs

    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=5*10**(-3),
    update_momentum=0.9,

    max_epochs=500,  # we want to train this many epochs
    verbose=1,

    #Here are the important parameters for multi labels
    regression=True,  

    objective_loss_function=multilabel_objective,
    custom_score=("validation score", lambda x, y: np.mean(np.abs(x - y)))

    )

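# X_train: array of shape (n_samples, 1, 229, 1), matching the InputLayer shape above
# labels_train: array of shape (n_samples, 13), one 0/1 column per label
net.fit(X_train, labels_train)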
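
To sanity-check the custom loss outside Theano, the same formula can be mirrored in NumPy. This is only a sketch: the function name multilabel_cross_entropy and the toy arrays are made up for illustration and are not part of nolearn or lasagne.

```python
import numpy as np

def multilabel_cross_entropy(predictions, targets, epsilon=1e-6):
    """NumPy mirror of multilabel_objective: binary cross-entropy summed over labels."""
    # Clip so that log() never sees exactly 0 or 1.
    pred = np.clip(predictions, epsilon, 1.0 - epsilon)
    return -np.sum(targets * np.log(pred) + (1.0 - targets) * np.log(1.0 - pred), axis=1)

# Hypothetical toy batch: 2 samples, 3 labels each.
targets = np.array([[1.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
good = np.array([[0.9, 0.1, 0.8], [0.1, 0.2, 0.9]])  # close to the targets
bad = 1.0 - good                                      # far from the targets

loss_good = multilabel_cross_entropy(good, targets)
loss_bad = multilabel_cross_entropy(bad, targets)
print(loss_good.shape)               # one loss value per sample: (2,)
print(np.all(loss_good < loss_bad))  # True
```

Each label contributes its own binary cross-entropy term, which is why several labels can be active for one sample, unlike a softmax output.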
P. Camilleri
  • Thank you Massias. When I tried to test your example, I got stuck on this import error: cannot import name mse. I searched for it online, and many people said that some versions of Lasagne and nolearn are not compatible. I am using nolearn 0.5. – Kun Aug 25 '15 at 14:27
  • @Fox I ran into this problem too, sometimes versions of theano and lasagne do not agree. Run the following lines from command line successively : first `pip install -r https://raw.githubusercontent.com/Lasagne/Lasagne/master/requirements.txt` and then `pip install https://github.com/Lasagne/Lasagne/archive/master.zip`; after that it should work – P. Camilleri Aug 25 '15 at 14:33
  • I followed your instruction and still have this problem. I checked the requirement, is there a problem with my Theano version? Mine is 0.7.0. – Kun Aug 25 '15 at 14:47
  • @Fox I have theano 0.7.0, nolearn 0.6a0.dev0 and lasagne 0.1. Same for you? You can check with pip show – P. Camilleri Aug 25 '15 at 14:53
  • I have theano 0.7.0, nolearn 0.5 and lasagne 0.2.dev1. It looks like both my nolearn and my lasagne versions differ from yours. – Kun Aug 25 '15 at 14:55
  • You can try and install these versions, then (I reinstalled it this morning, cannot understand why we get different versions...) – P. Camilleri Aug 25 '15 at 14:58
  • I have reinstalled lasagne at version 0.1 and have the same problem; now I am trying to reinstall nolearn. By the way, I am using Python 2.7. Which Python version are you using? – Kun Aug 25 '15 at 15:03
  • I am using python 2.7.10 – P. Camilleri Aug 25 '15 at 15:14
  • Our difference is the nolearn version. Could you show me how to install nolearn 0.6a0.dev0? I could not find that release. – Kun Aug 25 '15 at 15:20
  • Thanks, but I got this error: Could not find a version that satisfies the requirement nolearn==0.6a0.dev0 (from versions: 0.1b1, 0.2b1, 0.2, 0.3, 0.3.1, 0.4, 0.5b1, 0.5) No matching distribution found for nolearn==0.6a0.dev0 – Kun Aug 25 '15 at 15:23
  • It's probably not available through pip. Try `pip install -r https://raw.githubusercontent.com/dnouri/nolearn/master/requirements.txt https://github.com/dnouri/nolearn/archive/master.zip#egg=nolearn` – P. Camilleri Aug 25 '15 at 15:25
  • Thank you. I tried this before and also got 0.5. I would like to try again on another machine. May I ask you a question later if there is any problem with the code? – Kun Aug 25 '15 at 15:45
  • Yes (the lines are different from what I suggested a few comments ago, right?). You can also read about virtualenvs (install one on your current machine) if changing machine is difficult. – P. Camilleri Aug 25 '15 at 15:48
  • After uninstalling nolearn and lasagne, the import error is gone, but now I get this error: ('Bad input argument to theano function with name, 'Wrong number of dimensions: expected 4, got 2 with shape (128, 229).'). My labels are an array of shape (247, 13) and my data has shape (247, 229); I have no idea where "128" comes from. – Kun Aug 25 '15 at 16:33
  • 128 is the size of the batch. Do `data = data[None, None, :, :]` as I suggested in my answer here: http://stackoverflow.com/questions/31499761/get-output-from-lasagne-python-deep-neural-network-framework – P. Camilleri Aug 25 '15 at 16:41
  • Yes, also I changed label = label[None, None, :, :], but unfortunately, I got another error again: "Cannot have number of folds n_folds=5 greater than the number of samples: 1." – Kun Aug 25 '15 at 16:50
  • This, I think, comes from `cross_val_score`, not the nolearn part. – P. Camilleri Aug 25 '15 at 17:05
  • My main guess is that, since `clf` does not come from sklearn but from nolearn.lasagne, it's not compatible with the cross-validation function you're trying to use. `clf` expects a 4D input, but in sklearn the input is usually 2D: (n_samples, n_features) – P. Camilleri Aug 25 '15 at 17:12
  • I was thinking of that, but if I comment out the validation part and leave only fit(data,label), the same error occurs. Do you think there is an issue in the fit process? – Kun Aug 25 '15 at 17:45
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/87913/discussion-between-m-massias-and-fox). – P. Camilleri Aug 25 '15 at 17:58
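
One possible reading of the last two errors in the comments, not confirmed in the thread: indexing with `[None, None, :, :]` puts two new axes in front, so the whole dataset becomes a single sample, which would match the "number of samples: 1" message. A minimal NumPy sketch, assuming the 247 x 229 data from the comments and the InputLayer shape (None, 1, 229, 1) from the answer; the zero-filled array is a stand-in for the real data:

```python
import numpy as np

# Hypothetical stand-in for the 247 x 229 data described in the comments.
data = np.zeros((247, 229), dtype=np.float32)

# Two leading None indices add two leading axes:
# the first axis (the sample axis) now has length 1.
wrong = data[None, None, :, :]
print(wrong.shape)  # (1, 1, 247, 229)

# Keeping the samples on the first axis matches an InputLayer
# declared with shape (None, 1, 229, 1).
right = data.reshape(-1, 1, 229, 1)
print(right.shape)  # (247, 1, 229, 1)
```

Under this reading the labels would stay 2D with shape (247, 13), since the output layer has 13 units.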

Fit calls BuildDBN, which can be found here. An important thing to note is that dbn has been deprecated, so you can only find it in old commits. Anyway, if you are looking for extra information, it is probably good to check those two. From what I can see in your snippet, the first parameter of DBN, namely [data, 300, 10], should be [data.shape[1], 300, 10], based on the documentation and the source code. Hope this helps.
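
The point about the first parameter can be illustrated without nolearn installed: the list holds layer sizes (integers), not the data array itself. A small sketch, assuming a zero-filled stand-in for the 1231 x 229 array from the question:

```python
import numpy as np

# Hypothetical stand-in for the array loaded from data.csv in the question.
data = np.zeros((1231, 229))

n_features = data.shape[1]           # number of columns = number of input units
layer_sizes = [n_features, 300, 10]  # input, hidden and output sizes as in the question
print(layer_sizes)  # [229, 300, 10]
```

The output size of 10 is copied from the question as is; whether it fits a label array with 13 columns is a separate issue that this sketch does not address.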

Francisco Vargas
  • Thanks. Does DBN support multi-label classification tasks? I wonder whether the problem also arises from the labels. – Kun Aug 24 '15 at 17:52