I have a bug in my Theano program that leads to NaN values. The documentation recommends using NanGuardMode to track down the source of the problem.

When I copy/paste this line from the doc webpage:

from theano.compile.nanguardmode import NanGuardMode

I get:

ImportError: No module named nanguardmode

I can't find any sign of nanguardmode when I type:

help(theano.compile)

Any idea why nanguardmode is absent? How can I fix this?

EDIT:

Thanks for your replies.

Concerning my Theano version, I couldn't find how to check it, but I assume it is the latest: I installed it from the install webpage about a month ago. I'm on 64-bit Windows.

Concerning the detect_nan hack: things just get weirder!

First: if I try to use:

post_func=theano.compile.monitormode.detect_nan

I get:

File "C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.10.amd64\lib\site-packages\theano\compile\monitormode.py", line 87, in detect_nan
if (not isinstance(numpy.random.RandomState, output[0]) and

NameError: global name 'numpy' is not defined

Indeed, numpy was not imported in the monitormode module... Is that a known bug?

Second: if I try to use a copy/paste of detect_nan, the NaNs magically go away. Everything else remaining the same, without detect_nan in my theano function (that trains a model iteratively), I get NaNs at iteration 5:

epoch 1, valid 28.582677 %, train 27.723320 % 0.546633
epoch 2, valid 27.814961 %, train 25.681751 % 0.500522
epoch 3, valid 27.263780 %, train 24.262972 % 0.478799
epoch 4, valid 26.938976 %, train 23.209021 % 0.463017
epoch 5, valid 50.000000 %, train 50.000000 % nan

(the last figure is the cost value)

When I do add

mode=theano.compile.MonitorMode(post_func=detect_nan)

to the function, no NaNs appear up to at least iteration 100 (and probably more).

epoch 1, valid 28.582677 %, train 27.723320 % 0.546633
epoch 2, valid 27.814961 %, train 25.681751 % 0.500522
epoch 3, valid 27.263780 %, train 24.262972 % 0.478799
epoch 4, valid 26.938976 %, train 23.209021 % 0.463017
epoch 5, valid 26.289370 %, train 22.320902 % 0.450454
... etc ...

What's going on here???

Julien
  • What is your theano version? – P. Camilleri Sep 15 '15 at 12:36
  • dunno what happened but after restarting the console, everything is back in shape... – Julien Sep 16 '15 at 05:41
  • So you no longer need answers for the updated portion of your question? If so, perhaps you could add your own answer with what you found helped you? – Daniel Renshaw Sep 16 '15 at 06:40
  • Nothing really 'helped me' beside restarting the console. The bug was 'real' in a sense that it was reproducible before restarting the console. But after restarting, it seems gone for good. I assume the computer was in some strange internal state... – Julien Sep 16 '15 at 06:55
  • I guess the missing import numpy in monitormode.py should still be reported, but I don't know where / to whom... – Julien Sep 16 '15 at 07:03
  • To check a package version you can use `pip show yourPackageName` in command line. Otherwise see http://stackoverflow.com/questions/739993/how-can-i-get-a-list-of-locally-installed-python-modules – P. Camilleri Sep 16 '15 at 08:28

1 Answer

NanGuardMode was moved into Theano's bleeding-edge version (from PyLearn2) on May 1st. That was after the release of version 0.7 on March 26th, so you'll need to upgrade to the bleeding-edge version from GitHub to use NanGuardMode.
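
If you want the same script to run on both 0.7 and the bleeding-edge version, you could test for the module before importing it. This is only a sketch: `module_available` and `make_nan_mode` are names made up for illustration, and the `NanGuardMode` arguments are the ones shown in the NanGuardMode documentation, so check them against the version you end up installing.

```python
import importlib


def module_available(name):
    """Return True if the module `name` can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def make_nan_mode():
    """Return a NanGuardMode if this Theano install ships it, else None.

    Hypothetical helper, not part of Theano.
    """
    if not module_available("theano.compile.nanguardmode"):
        return None
    from theano.compile.nanguardmode import NanGuardMode
    # Raise an error as soon as a node produces NaN, inf, or a very large value.
    return NanGuardMode(nan_is_error=True, inf_is_error=True,
                        big_is_error=True)
```

When `make_nan_mode()` returns something other than `None`, pass it as `mode=` to `theano.function`. Running `print(theano.__version__)` should also tell you which release you actually have.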

Alternatively you could use the detect_nan sample found in the debug FAQ:

import numpy

import theano

# This is the current suggested detect_nan implementation, to
# show you how it works.  That way, you can modify it for your
# needs.  If you want exactly this method, you can use
# ``theano.compile.monitormode.detect_nan``, which will always
# contain the current suggested version.

def detect_nan(i, node, fn):
    for output in fn.outputs:
        if (not isinstance(output[0], numpy.random.RandomState) and
            numpy.isnan(output[0]).any()):
            # print(...) with a single argument works on Python 2 and 3.
            print('*** NaN detected ***')
            theano.printing.debugprint(node)
            print('Inputs : %s' % [input[0] for input in fn.inputs])
            print('Outputs: %s' % [output[0] for output in fn.outputs])
            break

x = theano.tensor.dscalar('x')
f = theano.function([x], [theano.tensor.log(x) * x],
                    mode=theano.compile.MonitorMode(
                        post_func=detect_nan))
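
Calling `f(0)` on the function above should trigger `detect_nan`, because `log(0) * 0` evaluates to `-inf * 0`. A minimal plain-Python sketch of that arithmetic, with no Theano involved (the variable names are only for illustration):

```python
import math

# Stand-in for log(0): Theano and NumPy return -inf here,
# whereas math.log(0) would raise ValueError.
log_of_zero = float("-inf")

# IEEE 754 defines infinity * 0 as NaN, which is what detect_nan looks for.
result = log_of_zero * 0.0

print(math.isnan(result))
```

In a real model the same thing tends to happen when a log or a division receives an exact zero, so it can help to look at the inputs that detect_nan prints for the offending node.
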
Daniel Renshaw