Using SGD without using sklearn (LogLoss increasing with every epoch)

Question

def train(X_train,y_train,X_test,y_test,epochs,alpha,eta0):
    w,b = initialize_weights(X_train[0])
    loss_test=[]
    N=len(X_train)
    for i in range(0,epochs):
        print(i)
        for j in range(N-1):
            grad_dw=gradient_dw(X_train[j],y_train[j],w,b,alpha,N)
            grad_db=gradient_db(X_train[j],y_train[j],w,b)
            w=np.array(w)+(alpha*(np.array(grad_dw)))
            b=b+(alpha*(grad_db))                
               predict2 = []
    for m in range(len(y_test)):
        z=np.dot(w[0],X_test[m])+b
        if sigmoid(z) == 0: # sigmoid(w,x,b) returns 1/(1+exp(-(dot(x,w)+b)))
            predict2.append(0.000001)
        elif sigmoid(z) == 1:
            predict2.append(0.99999)
        else:
            predict2.append(sigmoid(z)) 
            
    loss_test.append(logloss(y_test,predict2))       
    return w,b,loss_test

my gradient dw function

def gradient_dw(x,y,w,b,alpha,N):
    dw=[]
    for i in range(len(x)):
        dw.append((x[i]*(y-1/(1+np.exp(abs(w.T[0][i]*x[i]+b)))))+(alpha/N)*(w.T[0][i]))
    return dw

My gradient db function:

 def gradient_db(x,y,w,b):
        db=0
        for i in range(len(x)):
            db=(y-1/(1+np.exp(abs(w.T[0][i]*x[i]+b))))
        return db

My loss function:

def logloss(y_true,y_pred):
    loss=0
    for i in range(len(y_true)):
        loss+=((y_true[i]*math.log10(y_pred[i]))+((1-y_true[i])*math.log10(1-y_pred[i])))
    loss=-1*(1/len(y_true))*loss
    return loss

My problem is after every epoch my loss is increasing. Why?

Any Help will be appreciated

Thank you

I haven't checked your formulas, but I noticed that you add the gradient in each timestep. The update rule for GD is `w_{t+1} = w_t - \gamma \nabla \mathcal{L}(w_t)`. The gradient points per definition in the upwards direction of your function. Therefore you need to go in the opposite direction of the gradient to move in the direction of the minimum. — Tinu, Jul 21 '20 at 07:25
@Tinu how can i do that ?my w is of size 15.and train is of size 5000.how to do this equation.w_{t+1} = w_t - \gamma \nabla \mathcal{L}(w_t) .What is t and its length.I am exactly not understanding this part — raju, Jul 21 '20 at 07:29
All you need to do in your `train` function is to replace the `+` in `w=np.array(w)+(alpha*(np.array(grad_dw)))` and `b=b+(alpha*(grad_db))` with a `-`. If that doesn't work check your formulas, derivations and your code again. — Tinu, Jul 21 '20 at 07:34

score 0 · Accepted Answer · answered Jul 24 '20 at 10:54

0

The problem was of weight function

i was taking weight array as of dim(15,1)

but it should be (15)

So all the changes need to be done according with it in this code

Thank You

answered Jul 24 '20 at 10:54

raju

119
9

Using SGD without using sklearn (LogLoss increasing with every epoch)

1 Answers1

Linked