
I wrote linear regression (in one variable) along with gradient descent. It works fine for a smaller dataset, but for a larger dataset it gives this error:

OverflowError: (34, 'Numerical result out of range') 

The error points to the following part of the code:

def gradient_des(theta0, theta1, x, y):
    # cost function J(theta0, theta1) = (1 / 2m) * sum of squared errors
    sumed = 0
    if len(x) == len(y):
        for i in range(len(x)):
            sumed = sumed + (line(theta0, theta1, x[i]) - y[i]) ** 2  # error shown on this line
        result = sumed / (2 * len(x))
        return result
    else:
        print("x and y are of unequal length")

import random

# x and y for the general case, generated below for testing
x = list(range(10))
print(x)
# x = [1, 2, 3, 4, 5, 6]
y = [random.randint(-100, 100) for _ in range(len(x))]
print(y)
# y = [13, 10, 8.75, 4, 5.5, 2]

Why is this overflow occurring?

Later in the code, changing the learning rate (i.e. alpha) makes a difference: it sometimes runs for alpha = 0.1 but not for alpha = 1 (on the smaller, known dataset).

def linear_reg(x, y):
    if len(x) == len(y):
        theta0 = random.randint(-10, 10)
        theta1 = random.randint(-10, 10)
        alpha = 0.1  # problem: how to decide whether this factor should be small or large

        while gradient_des(theta0, theta1, x, y) != 0:  # probably an error in this convergence condition
            temp0 = theta0 - alpha * summed_lin(theta0, theta1, x, y)
            temp1 = theta1 - alpha * summed_lin_weighted(theta0, theta1, x, y)
            # print(temp0)
            # print(temp1)
            if theta0 != temp0 and theta1 != temp1:
                theta0 = temp0
                theta1 = temp1
            else:
                break
        return [theta0, theta1]
    else:
        print("x and y are of unequal length")

For alpha = 1 it gives the same error as above. Shouldn't the regression be independent of alpha (at least for small enough values)?

The full code is here: https://github.com/Transwert/General_purposes/blob/master/linreg.py
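
As suggested in the comments below, a safer exit condition is to cap the loop at a maximum number of iterations and stop once the cost stops changing by more than a small tolerance, instead of waiting for it to hit exactly zero. A rough sketch (linear_reg_capped, max_iters and tol are placeholder names and defaults, not taken from the linked code):

def linear_reg_capped(x, y, alpha=0.01, max_iters=10000, tol=1e-9):
    # same update rule as linear_reg, but with an iteration cap and a
    # tolerance on the change in cost instead of requiring cost == 0
    theta0 = random.randint(-10, 10)
    theta1 = random.randint(-10, 10)
    prev_cost = gradient_des(theta0, theta1, x, y)
    for _ in range(max_iters):
        temp0 = theta0 - alpha * summed_lin(theta0, theta1, x, y)
        temp1 = theta1 - alpha * summed_lin_weighted(theta0, theta1, x, y)
        theta0, theta1 = temp0, temp1
        cost = gradient_des(theta0, theta1, x, y)
        if abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
    return [theta0, theta1]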

Transwert
  • Check this: https://stackoverflow.com/questions/12666600/overflowerror-numerical-result-out-of-range-when-generating-fibonacci-numbers – Siva Shanmugam May 24 '19 at 09:17
  • You will very rarely **ever** get the error down to exactly zero. Combined with a probable large learning rate `alpha`, this is why your code doesn't converge and hence your variables are overflowing. You should cap this at a maximum number of iterations or stop when the error between successive iterations is less than some threshold. Also, setting `alpha` to be smaller helps mitigate the overflow, but you don't have a proper exit condition in your gradient descent algorithm for this to work properly. – rayryeng May 24 '19 at 09:20
  • @SivaShanmugam That post doesn't help. The problem is the gradient descent algorithm itself. The overflow in data type is simply a by-product of an incorrectly written algorithm. – rayryeng May 24 '19 at 09:24
  • @rayryeng So if I use a termination condition like the gradient descent value going below e^-16 or something like that, it should terminate and return the value? – Transwert May 24 '19 at 09:34
  • @rayryeng Can you suggest the changes that should be made to fix that runtime problem? – Transwert May 25 '19 at 08:43
  • Already told you. However, it would be nice to show us what dataset you're using for the code to overflow. I can perhaps write an answer once I verify that it works with my changes. – rayryeng May 25 '19 at 19:57
  • @rayryeng Apologies for the late reply. I used a .csv file, and with that data the linear regression code produced a correct linear model, but the model doesn't seem to work when I generate random datasets for x and y. The dataset and code are at https://github.com/Transwert/General_purposes (code: linearreg.py, data: testfile.csv). – Transwert Jun 04 '19 at 11:34

0 Answers