3
def train(X_train,y_train,X_test,y_test,epochs,alpha,eta0):
    w,b = initialize_weights(X_train[0])
    loss_test=[]
    N=len(X_train)
    for i in range(0,epochs):
        print(i)
        for j in range(N-1):
            grad_dw=gradient_dw(X_train[j],y_train[j],w,b,alpha,N)
            grad_db=gradient_db(X_train[j],y_train[j],w,b)
            w=np.array(w)+(alpha*(np.array(grad_dw)))
            b=b+(alpha*(grad_db))                
               predict2 = []
    for m in range(len(y_test)):
        z=np.dot(w[0],X_test[m])+b
        if sigmoid(z) == 0: # sigmoid(w,x,b) returns 1/(1+exp(-(dot(x,w)+b)))
            predict2.append(0.000001)
        elif sigmoid(z) == 1:
            predict2.append(0.99999)
        else:
            predict2.append(sigmoid(z)) 
            
    loss_test.append(logloss(y_test,predict2))       
    return w,b,loss_test

my gradient dw function

def gradient_dw(x,y,w,b,alpha,N):
    dw=[]
    for i in range(len(x)):
        dw.append((x[i]*(y-1/(1+np.exp(abs(w.T[0][i]*x[i]+b)))))+(alpha/N)*(w.T[0][i]))
    return dw

My gradient db function:

 def gradient_db(x,y,w,b):
        db=0
        for i in range(len(x)):
            db=(y-1/(1+np.exp(abs(w.T[0][i]*x[i]+b))))
        return db

My loss function:

def logloss(y_true,y_pred):
    loss=0
    for i in range(len(y_true)):
        loss+=((y_true[i]*math.log10(y_pred[i]))+((1-y_true[i])*math.log10(1-y_pred[i])))
    loss=-1*(1/len(y_true))*loss
    return loss

My problem is after every epoch my loss is increasing. Why?

Any Help will be appreciated

Thank you

desertnaut
  • 57,590
  • 26
  • 140
  • 166
raju
  • 119
  • 9
  • 1
    I haven't checked your formulas, but I noticed that you add the gradient in each timestep. The update rule for GD is `w_{t+1} = w_t - \gamma \nabla \mathcal{L}(w_t)`. The gradient points per definition in the upwards direction of your function. Therefore you need to go in the opposite direction of the gradient to move in the direction of the minimum. – Tinu Jul 21 '20 at 07:25
  • @Tinu how can i do that ?my w is of size 15.and train is of size 5000.how to do this equation.w_{t+1} = w_t - \gamma \nabla \mathcal{L}(w_t) .What is t and its length.I am exactly not understanding this part – raju Jul 21 '20 at 07:29
  • All you need to do in your `train` function is to replace the `+` in `w=np.array(w)+(alpha*(np.array(grad_dw)))` and `b=b+(alpha*(grad_db))` with a `-`. If that doesn't work check your formulas, derivations and your code again. – Tinu Jul 21 '20 at 07:34
  • It increased even more @Tinu – raju Jul 21 '20 at 07:48
  • No Answers yet. – raju Jul 21 '20 at 16:59

1 Answers1

0
  1. The problem was of weight function

  2. i was taking weight array as of dim(15,1)

  3. but it should be (15)

  4. So all the changes need to be done according with it in this code

  5. Thank You

raju
  • 119
  • 9