
I have read a lot on Stack Overflow, but I still can't understand how to avoid the overflow error. I'm building a neural network that uses the sigmoid function, but I can't go on without finding a workaround for these errors.

import numpy as np

def activation(x):
    return 1 / (1 + np.exp(-x))

def dactivation(x):
    return activation(x) * (1 - activation(x))


def propagateb(self, target, lrate=8.1, momentum=0.1):
    deltas = []
    error = target - self.layers[-1]
    delta = error * dactivation(self.layers[-1])
    deltas.append(delta)
    for i in range(len(self.shape) - 2, 0, -1):
        delta = np.dot(deltas[0], self.weights[i].T) * dactivation(self.layers[i])
        deltas.insert(0, delta)
    for i in range(len(self.weights)):
        layer = np.atleast_2d(self.layers[i])
        delta = np.atleast_2d(deltas[i])
        dw = np.dot(layer.T, delta)
        self.weights[i] += lrate * dw + momentum * self.dw[i]
        self.dw[i] = dw

    # Return error
    return (error**2).sum()

The warning raised is:

ann.py:5: RuntimeWarning: overflow encountered in exp
  return  1/(1+np.exp(-x))
Boat

4 Answers


SciPy comes with a function to do that, which won't give you that warning:

scipy.special.expit(x)
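For instance, a minimal sketch of swapping it in (assuming SciPy is installed):

```python
import numpy as np
from scipy.special import expit

# expit is a numerically stable sigmoid: large-magnitude inputs
# saturate to 0 or 1 instead of triggering an overflow warning
x = np.array([-1000.0, 0.0, 1000.0])
y = expit(x)
```

In the question's code, `activation` could simply `return expit(x)`.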
user2357112

The idea is that you should avoid calling exp(something) when the argument is very large in magnitude. So avoid exp(x) when x >> 0, and avoid exp(-x) when x << 0.

To achieve that, you could start by writing one expression that works for x > 0 and another that works for x < 0.

  1. With x > 0 you can safely use your expression: 1/(1+exp(-x))
  2. For x < 0 you rewrite that expression by multiplying the numerator and the denominator by exp(x) which gives exp(x) / (1+exp(x)). As you see, no more exp(-x) here.

You can find an expression that works for both cases:

Given that x is an array, I used np.exp(np.fmin(x, 0)) / (1 + np.exp(-np.abs(x))) in my personal experiments here https://github.com/thirionjl/chains/blob/master/chains/operations/activation_ops.py#L42
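A minimal runnable sketch of that combined expression (the function name `stable_sigmoid` is just illustrative):

```python
import numpy as np

def stable_sigmoid(x):
    # np.fmin(x, 0) keeps the numerator's exp argument non-positive, and
    # -np.abs(x) keeps the denominator's exp argument non-positive,
    # so neither call can overflow.
    # For x > 0 this reduces to 1/(1+exp(-x)); for x < 0 to exp(x)/(1+exp(x)).
    return np.exp(np.fmin(x, 0)) / (1 + np.exp(-np.abs(x)))

y = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
```

Extreme inputs saturate cleanly to 0 and 1 instead of warning.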


You have to be careful when you are using NumPy integers, because they don't have arbitrary precision, as stated here: Can Integer Operations Overflow in Python?

For numpy double, that range is (-1.79769313486e+308, 1.79769313486e+308).

Also have a look at this answer which describes it quite well.

Here is more information on numpy dtypes and their allowed range.
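As a quick sketch of inspecting those limits with NumPy's own introspection helpers:

```python
import numpy as np

# float64 values top out around 1.8e308
fmax = np.finfo(np.float64).max

# np.exp overflows float64 once its argument exceeds ln(fmax), about 709.78,
# which is why moderately large pre-activations already trigger the warning
exp_limit = np.log(fmax)
print(fmax, exp_limit)
```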

Alan Garrido

It seems like the data being passed in must be integers, even though this activation function should return a float. I assume the fix is as simple as

return  1./(1.+np.exp(-x))

I would guess that without this change, the code is doing integer arithmetic, and thereby generating the error.
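An alternative sketch of the same idea is to cast the input explicitly, so the function returns floats regardless of the input dtype (the cast via np.asarray is an assumption, not the answerer's exact fix):

```python
import numpy as np

def activation(x):
    # cast to float64 up front so the arithmetic is never done
    # on fixed-width NumPy integers
    x = np.asarray(x, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-x))
```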

bremen_matt