
I have made a simple implementation of gradient descent in Python. It works fine for most parameters, but for certain combinations of learning rate and number of iterations it gives me runtime warnings:

RuntimeWarning: overflow encountered in double_scalars

RuntimeWarning: invalid value encountered in double_scalars

Now I am assuming this is because it gets to a point where the b and m values become too large to store, since there was an overflow error. Is this assumption correct?

And how do I prevent the program from crashing? Exception handling in the main program doesn't seem to work. Can you think of a way to prevent the error logically, without exception handling?

def compute_error(points,b,m):
    # Mean squared error of the line y = m*x + b over all points.
    error = 0
    for i in range(len(points)):
        y = points[i][1]
        x = points[i][0]
        error += (y - (m*x + b))**2
    return error/len(points)

def gradient_runner(points,LR,num_iter,startB=0,startM=0):
    b = startB
    m = startM
    for i in range(num_iter):
        b,m = step_gradient(points,b,m,LR)
    return [b,m]

def step_gradient(points,b,m,LR):
    b_gradient = 0
    m_gradient = 0
    N = float(len(points))
    for i in range(len(points)):
        x = points[i][0]
        y = points[i][1]
        # Partial derivatives of the mean squared error with respect to b and m.
        b_gradient += (-2/N)*(y - ((m*x)+b))
        m_gradient += (-2/N)*x*(y - ((m*x)+b))
##    print("Value for b_gradient",b_gradient)
##    print("Value for b is ",b)
##    print("Value for learning rate is ",LR)
    new_b = b - (LR * b_gradient)
    new_m = m - (LR * m_gradient)
    return [new_b,new_m]

import numpy as np
a = np.array([[1,1],[4,2],[6,3],[8,4],[11,5],[12,6],[13,7],[16,8]])

b,m=gradient_runner(a,0.0001,1000) # These parameters work
# b,m=gradient_runner(a,0.1,10000) #Program Crashes
yguesses = [m * i + b for i in a[:,0]]


import matplotlib.pyplot as plt

plt.scatter(a[:,0], a[:,1], color="green")
plt.plot(a[:,0], yguesses, color="red")

plt.show()
– pyCharmer
1 Answer


The problem is the learning rate LR (test this by changing only LR -- you will find that if you make it low enough, the algorithm converges). With too high a value of LR you take too big a step each time (imagine 'jumping over' the correct value at every step). There are ways to calculate what the maximum step size should be; search for something like "gradient descent step size".
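For the squared-error cost in the question, that maximum step size can in fact be computed directly. Below is a minimal sketch (not part of the original answer; the helper name max_stable_lr is made up for illustration): the cost is quadratic in (m, b), so its Hessian is constant, and plain gradient descent only converges when LR is smaller than 2 divided by the largest eigenvalue of that Hessian.

import numpy as np

def max_stable_lr(points):
    # Hessian of (1/N) * sum((y - (m*x + b))**2) with respect to (m, b):
    #   H = (2/N) * [[sum(x**2), sum(x)],
    #                [sum(x),    N     ]]
    # Gradient descent on this quadratic cost diverges once LR >= 2 / lambda_max(H).
    x = points[:, 0].astype(float)
    N = len(x)
    H = (2.0 / N) * np.array([[np.sum(x**2), np.sum(x)],
                              [np.sum(x),    N]])
    return 2.0 / np.linalg.eigvalsh(H).max()

a = np.array([[1,1],[4,2],[6,3],[8,4],[11,5],[12,6],[13,7],[16,8]])
print(max_stable_lr(a))

With the data from the question this bound comes out a little below 0.01, which is consistent with what you observed: 0.0001 converges, while 0.1 blows up.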

However, as you note, if you get an overflow the result will most likely be erroneous. In Python you can catch warnings, which you could use to tell the user that the result is wrong.
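A minimal sketch of that (reusing gradient_runner and a from the code in the question): the overflow/invalid warnings come from NumPy, because x and y are NumPy scalars taken from the array, and by default NumPy warns rather than raises. Promoting the warnings to exceptions makes an ordinary try/except work again.

import warnings

with warnings.catch_warnings():
    # Turn RuntimeWarning (overflow, invalid value, ...) into real exceptions.
    warnings.simplefilter("error", RuntimeWarning)
    try:
        b, m = gradient_runner(a, 0.1, 10000)
    except RuntimeWarning as w:
        print("Gradient descent diverged:", w)
        b, m = gradient_runner(a, 0.0001, 1000)  # fall back to a safer learning rate

An equivalent NumPy-specific route is with np.errstate(over='raise', invalid='raise'):, which raises a FloatingPointError instead of warning.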

– pingul
  • What options do we have if the learning rate has to become very low to avoid the overflow? In my case, setting a learning rate of 0.0001 makes the error go away, but my gradient descent does not converge even after 2000 iterations. – otaku Aug 26 '18 at 04:26
  • @AbhyudayaSrinet Assuming your algorithm is correct, it might be that your problem is not suitable for plain gradient descent. You can try searching for something like "improving conditioning gradient descent". – pingul Aug 27 '18 at 00:46
  • Thanks. Turns out my implementation was incorrect. I was doing oneVsAll classification and using labels as my output values, whereas I should've been using a class label vector (e.g. [0,1,0,0]); a small sketch of that encoding is shown below. This [answer](https://stackoverflow.com/a/40229964/1783688) helped me figure it out. – otaku Aug 27 '18 at 06:49
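For reference, here is a minimal sketch of the label encoding mentioned in the last comment (the helper name one_hot is made up; it assumes integer class labels 0..num_classes-1):

import numpy as np

def one_hot(labels, num_classes):
    # Turn integer class labels into one-vs-all target vectors, e.g. 1 -> [0, 1, 0, 0].
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes))
    encoded[np.arange(labels.size), labels] = 1
    return encoded

print(one_hot([1, 0, 3, 2], num_classes=4))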