
I keep getting occasional NaN outputs from my neural network in Keras, roughly one NaN per 10,000 results. Originally I had a relu activation layer feeding into the final softmax layer, which produced even more NaN results. I changed the activation function of the last two dense layers in the convolutional network from relu to sigmoid. That reduced the problem, but I still get NaNs. Any advice on how I can eliminate the NaNs completely?

# Keras 1.x API, channel-first (Theano) image ordering
from keras.models import Sequential
from keras.layers import (InputLayer, Convolution2D, MaxPooling2D,
                          Dropout, Flatten, Dense)

model = Sequential()
model.add(InputLayer((1, IMG_H, IMG_W)))

model.add(Convolution2D(32, 3, 3, activation = 'relu'))
model.add(Convolution2D(32, 3, 3, activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Dropout(0.3))

model.add(Convolution2D(64, 3, 3, activation = 'relu'))
model.add(Convolution2D(64, 3, 3, activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Dropout(0.3))

model.add(Flatten())
model.add(Dense(256, activation = 'sigmoid'))
model.add(Dropout(0.3))
model.add(Dense(64, activation = 'sigmoid'))
model.add(Dropout(0.3))
model.add(Dense(categories, activation = 'softmax'))
chasep255
    NaNs in your output/losses are always a very bad sign. Did you preprocess / normalize your input? Is your learning-rate small enough? NaNs should never occur if data is preprocessed correctly. [This](http://cs231n.github.io/neural-networks-2/) might help. – sascha May 10 '16 at 16:52
  • I normalized my input between 0 and 1. I used a small learning rate between 0.01 and 0.001. Right now I am adding weight regularization to see if that helps. – chasep255 May 10 '16 at 16:53
  • Normalizing between 0 and 1 is not necessarily what you want. That sounds like the MinMax scaler in sklearn. You want to normalize the mean and variance, i.e. the StandardScaler in sklearn. This is very important when using SGD-based algorithms! In your case the mean will be >> 0.0, and I think you didn't change the variance. – sascha May 10 '16 at 16:54
  • 7
    I dug through my dataset of 80k images after I realized the NaNs were always coming from the same images. It turns out those images were solid black, which is an error on the part of the data provider. When I tried to normalize them, this made the computation blow up. – chasep255 May 11 '16 at 00:06
  • 1
    Looks like you've solved the problem, but for future reference there is a [nice question](https://stackoverflow.com/a/33980220/1397061) which among other things recommends checking whether you have nan inputs. – 1'' May 15 '16 at 21:59
  • 1
    If you post an answer to your question and accept it, it will stop showing up as unanswered in search. – Viktoriya Malyasova Jul 16 '20 at 18:25
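A minimal sketch combining the two fixes suggested in the comments above: standardize the inputs to zero mean and unit variance, and flag degenerate samples (NaN pixels or zero-variance images such as the solid-black ones the asker found) before training. The function name and array shapes here are illustrative, not from the original code:

```python
import numpy as np

def clean_and_standardize(images, eps=1e-7):
    """Standardize a batch of images and flag degenerate samples.

    images: float array of shape (N, C, H, W), channel-first as in the
    question's InputLayer. Returns the standardized batch and a boolean
    mask of samples that are safe to train on.
    """
    flat = images.reshape(len(images), -1)
    has_nan = np.isnan(flat).any(axis=1)   # corrupt pixel values
    zero_var = flat.std(axis=1) < eps      # e.g. solid-black images
    ok = ~(has_nan | zero_var)

    # Compute statistics only over the good samples, then standardize.
    mean = flat[ok].mean()
    std = flat[ok].std()
    standardized = (images - mean) / (std + eps)
    return standardized, ok

# Usage: drop the bad samples before fitting, e.g.
#   X_std, ok = clean_and_standardize(X)
#   model.fit(X_std[ok], y[ok], ...)
```

Filtering with the mask before `model.fit` avoids the divide-by-zero that occurs when per-image normalization is applied to a constant (all-black) image.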

0 Answers