Saving neural network testing outputs in Pybrain

Question

I made a supervised neural network with pybrain. It works great, and when I test it with "trainer.testOnData(test_data, verbose=True)" I can see the output (and the error), but I would also like to save it for further analysis. I couldn't find how in the pybrain documentation. Does anyone who has worked with pybrain know how I can do it? Thank you (I hope this is not an obvious thing).

Please add the python tag to your post; it triggers syntax highlighting for the whole thread. — Pawel Wisniewski, Sep 01 '14 at 16:19
Is your question similar to this one? http://stackoverflow.com/questions/6006187/how-to-save-and-recover-pybrain-traning — rossdavidh, Sep 01 '14 at 23:28
@rossdavidh - no, I want to be able to manipulate the network's output; in that question he wants to save the entire trained network so it can be used again later — Bruno Penha, Sep 02 '14 at 17:14

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

I had the same problem as you, and to answer the question quickly: no, there is no straightforward way to do it.
But it is of course doable.

Mess with pybrain code

That seems like the easiest solution. Here is the relevant part of the source code of BackpropTrainer.testOnData. As you can see, it prints all errors if verbose is set to True.

    if verbose:
        print('All errors:', ponderatedErrors)
    assert sum(importances) > 0
    avgErr = sum(errors) / sum(importances)
    if verbose:
        print('Average error:', avgErr)
        print(('Max error:', max(ponderatedErrors), 'Median error:',
               sorted(ponderatedErrors)[len(errors) / 2]))
    return avgErr

We can make it return all errors along with avgErr by changing the last line to:

return avgErr, ponderatedErrors

Then you get both values by simply unpacking the result:

avgErr, allErrors = trainer.testOnData(dataSet, verbose=True)

or, when you don't want all the errors:

avgErr, _ = trainer.testOnData(dataSet, verbose=True)

That's the simplest solution. But not everyone likes to mess with the source code of external libraries.
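Once the errors come back as a list, saving them for later analysis is ordinary Python. Here is a minimal sketch, assuming the patched testOnData returns a plain list of numbers; `save_errors`, `load_errors` and the file name are made-up names for illustration, not part of pybrain:

```python
import json
import os
import tempfile


def save_errors(errors, path):
    """Write a list of per-sequence errors to a JSON file."""
    with open(path, 'w') as handle:
        # float() also converts numpy scalars, which json cannot serialize.
        json.dump([float(e) for e in errors], handle)


def load_errors(path):
    """Read the list of errors back from the JSON file."""
    with open(path) as handle:
        return json.load(handle)


# Stand-in values; in practice this would be the allErrors list
# unpacked from the patched trainer.testOnData call.
all_errors = [0.25, 0.5, 0.125]
errors_path = os.path.join(tempfile.gettempdir(), 'pybrain_errors.json')
save_errors(all_errors, errors_path)
```

JSON is only one option here; a CSV file or a pickle would work just as well for a flat list.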

Change stdout, catch it to a file and transform it

This is a procedure with a few steps. Because testOnData never returns all the errors and only prints them, you have to transform the printed string into something useful (let's try a list).

Change `stdout` to print into file

That's easy:

import sys
sys.stdout = open('./OURFILE', 'w+')

So now when we run testOnData, the output is saved in the file.
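One caveat: after reassigning sys.stdout, every later print also lands in the file until you switch it back. Here is a sketch of a small helper that redirects only for the duration of one call and then restores the original stream. `run_and_capture` and `fake_test` are hypothetical names; `fake_test` merely stands in for the real testOnData call:

```python
import os
import sys
import tempfile


def run_and_capture(func, path):
    """Call func() with sys.stdout redirected to `path`, then restore stdout."""
    original_stdout = sys.stdout
    with open(path, 'w') as handle:
        sys.stdout = handle
        try:
            result = func()
        finally:
            # Always restore, even if func() raises.
            sys.stdout = original_stdout
    return result


def fake_test():
    """Stand-in for trainer.testOnData(test_data, verbose=True)."""
    print('All errors: [0.1, 0.2]')
    return 0.15


capture_path = os.path.join(tempfile.gettempdir(), 'OURFILE')
average = run_and_capture(fake_test, capture_path)
```

With pybrain, the call would presumably look like `run_and_capture(lambda: trainer.testOnData(test_data, verbose=True), './OURFILE')`.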

Work that string

We are interested in the second line of our file, so let's just get it:

our_file = open('./OURFILE', 'r')
next(our_file)                       # get rid of first line
our_line = next(our_file)            # save second line

Because of how pybrain is written, our line looks like this:

('All errors:', HERE_IS_LIST_OF_ERRORS)

Now, I'm no regex wizard, so I'll just count where the list starts.

still_string = our_line.strip()[16:-1]   # drop "('All errors:', ", the newline and the closing ")"

This gives us a string that contains only the list. Now you can use eval to turn the string into a proper list:

list_of_errors = eval(still_string)

From here, you can use numpy or pandas to play with it.
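Since eval executes whatever the string contains, ast.literal_eval from the standard library is a safer choice for this step. The sketch below also finds the list by its brackets instead of counting characters, so it handles both the tuple-style line shown above and a plain "All errors: [...]" line. `parse_errors_line` is a made-up helper name, and the sketch assumes the errors are printed as plain numbers:

```python
import ast


def parse_errors_line(line):
    """Extract the list literal from an 'All errors' line."""
    start = line.index('[')        # first opening bracket
    end = line.rindex(']') + 1     # last closing bracket, inclusive
    # literal_eval only accepts Python literals, never arbitrary code.
    return ast.literal_eval(line[start:end])


tuple_style = parse_errors_line("('All errors:', [0.1, 0.25])\n")
plain_style = parse_errors_line("All errors: [0.5]\n")
```

If the list contains values that do not print as plain literals (for example numpy arrays), literal_eval will raise a ValueError and the line needs different handling.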

I hope that helped.

Thank you so much Pawel, couldn't have asked for a better answer: got my problem solved! — Bruno Penha, Sep 02 '14 at 17:06
I'm glad I could help. Please accept the answer by clicking the grey check mark next to it; an upvote is also a nice form of gratitude ;) — Pawel Wisniewski, Sep 02 '14 at 17:10
Sorry, my first time here. I still lack enough reputation to upvote, but I will once I do ;) — Bruno Penha, Sep 02 '14 at 17:17

Deviacium · Answer 2 · 2015-07-23T22:12:50.493

I might be a little late to the party, but I just found your question while searching for a way to get the network result and ground truth during dataset testing.

So it's simply not there, but for statistical analysis and visualization purposes it should be. So let's make it!

But there is no need to mess with the library's own code. You might break something in the third-party lib, and your code becomes totally unportable (unless you give specific directions on where to apply the patch, but ugh... you really shouldn't). There is a nice and very pythonic solution: OOP power.

Step 1: look up the code of the function you need with

import inspect
from pybrain.supervised.trainers import BackpropTrainer

print(inspect.getsource(BackpropTrainer.testOnData))

Step 2: copy that code and prepare to use all the might of OOP on your problem. Implement a custom class (you can store it in a separate module and import it, or define it inline with your code), make sure it inherits from the original class (in this case BackpropTrainer), and paste in the function you got from step 1 (remember to change the function name to something that doesn't conflict with existing ones).

from pybrain.supervised.trainers import BackpropTrainer

class myOwn_BackpropTrainer(BackpropTrainer):
    def myOwn_testOnData(self, dataset=None, verbose=False):
        """Compute the MSE of the module performance on the given dataset.
        If no dataset is supplied, the one passed upon Trainer initialization is
        used."""
        if dataset is None:
            dataset = self.ds
        dataset.reset()
        if verbose:
            print('\nTesting on data:')
        errors = []
        importances = []
        ponderatedErrors = []
        gt_values = []  # collects [network output, target] pairs
        for seq in dataset._provideSequences():
            self.module.reset()
            e, i = dataset._evaluateSequence(self.module.activate, seq, verbose)
            importances.append(i)


            for input, target in seq:
                gt_values.append([self.module.activate(input), target])


            errors.append(e)
            ponderatedErrors.append(e / i)
        if verbose:
            print('All errors:', ponderatedErrors)
        assert sum(importances) > 0
        avgErr = sum(errors) / sum(importances)
        if verbose:
            print('Average error:', avgErr)
            print(('Max error:', max(ponderatedErrors), 'Median error:',
                   sorted(ponderatedErrors)[len(errors) // 2]))
        return gt_values, avgErr

Notice the lines set off by blank lines, the declaration of my own gt_values variable, and the change in the return statement. Now I can simply replace the BackpropTrainer class instance with an instance of our class and call our new function:

load_dataset(ds)
trainer = myOwn_BackpropTrainer(net, ds, learningrate=0.04, momentum=0.7, weightdecay=0.02, verbose=True)
result, _ = trainer.myOwn_testOnData(verbose=True)

The result variable now stores a list of [network result, ground truth] pairs, ready for use in visualization or stats collection.
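To actually save those pairs, one option is a CSV file with the network outputs followed by the targets on each row. This is a rough sketch for Python 3, assuming each entry of the result is an [output, target] pair of sequences; `save_results` and the sample values are invented for illustration:

```python
import csv
import os
import tempfile


def save_results(gt_values, path):
    """Write one CSV row per sample: output values, then target values."""
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        for output, target in gt_values:
            writer.writerow(list(output) + list(target))


# Stand-in for the list returned by myOwn_testOnData.
sample_result = [[[0.9], [1.0]], [[0.2], [0.0]]]
results_path = os.path.join(tempfile.gettempdir(), 'pybrain_results.csv')
save_results(sample_result, results_path)

# Read the file back to check what was written.
with open(results_path, newline='') as handle:
    rows = [[float(value) for value in row] for row in csv.reader(handle)]
```

The resulting file can then be loaded with numpy or pandas for the statistics and plots.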

This way you keep all the customization in your own code and don't mess with the original third-party library code. You can easily share your code with others, update the library without worrying that your patch will disappear, and avoid many other troubles.
