I'm trying to implement a vectorized version of the regularised logistic regression. I have found a post that explains the regularised version but I don't understand it.

To make it easy I will copy the code below:

hx = sigmoid(X * theta);
m = length(X);
J = (sum(-y' * log(hx) - (1 - y') * log(1 - hx)) / m) + lambda * sum(theta(2:end).^2) / (2*m);
grad =((hx - y)' * X / m)' + lambda .* theta .* [0; ones(length(theta)-1, 1)] ./ m ;

I understand the first part of the cost equation. If I'm correct, it could be written as:

J = ((-y' * log(hx)) - ((1-y)' * log(1-hx)))/m; 

The problem is the regularization term. In more detail:

Dimensions:

X = (m x (n+1))
theta = ((n+1) x 1)

I don't understand why he leaves the first term of theta (theta_0) out of the equation, when in theory the regularization term is:

and it has to take all the thetas into account.

For the gradient, I think this equation is equivalent:

L = eye(length(theta));
L(1,1) = 0;

grad = (1/m) * X' * (hx - y) + (lambda/m) * (L * theta);
Adriaan
  • Hi and welcome to Stack Overflow! If I have understood your question correctly, you'd like an explanation of an answer, right? If you reach 50 reputation points, you can [comment everywhere](https://stackoverflow.com/help/privileges/comment), including on that author's post. You're having trouble understanding why the `theta_0` term is handled separately, is that right? – Adriaan Sep 23 '20 at 14:34
  • Hi, thank you! Yes, you are right, I don't know why it is left out... But I've also tried to implement the code in MATLAB (the code that I propose) and it doesn't work... @Adriaan – Alex Abades Grimes Sep 23 '20 at 14:44
  • For what it's worth, I've left a comment under the original post to let the answerer know of this question. Hopefully they will see it and respond. – TylerH Sep 23 '20 at 15:19

1 Answer

In MATLAB, indices start at 1, whereas in the mathematical notation they start at 0 (the indices in the formula you mentioned also start at 0). This means that theta_0 in the formula is theta(1) in the code.

So, in theory too, the first term of theta has to be left out of the regularization term.
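
To make the index shift concrete, here is the cost written out under the assumption that the formula in the question is the usual regularised logistic-regression cost, in which the penalty sum starts at j = 1 (the symbols h_θ, x^{(i)} and y^{(i)} are the conventional ones and are assumed here, not taken from the question):

```latex
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y^{(i)} \log h_\theta\big(x^{(i)}\big)
          + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta\big(x^{(i)}\big)\big) \Big]
          + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2}

% Index mapping between the formula (0-based) and MATLAB (1-based):
%   \theta_j  <->  theta(j+1),   j = 0, 1, ..., n
% so the penalty term becomes
\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2}
  \;=\; \texttt{lambda * sum(theta(2:end).\^{}2) / (2*m)}
```

Under that assumption, `theta(2:end)` in the code covers exactly θ_1 … θ_n, and `theta(1)` (that is, θ_0) is the only term left unpenalised.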

As for your second question, you're right! It is an equivalent, cleaner equation.
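
A quick way to convince yourself of the equivalence is to evaluate both gradient expressions on the same inputs and compare them. The snippet below is only a minimal sketch: the data, `theta` and `lambda` values are made up for illustration, and `sigmoid` is defined inline as an anonymous function because the question does not show its definition.

```matlab
% Hypothetical toy data, made up purely for illustration.
X = [1  0.5  1.2;
     1 -1.0  0.3;
     1  2.0 -0.7;
     1  0.1  0.9];            % m x (n+1), first column is the intercept
y = [1; 0; 1; 0];             % m x 1 labels
theta = [0.2; -0.4; 0.7];     % (n+1) x 1 parameters
lambda = 1.5;                 % regularization strength (arbitrary)

sigmoid = @(z) 1 ./ (1 + exp(-z));   % assumed element-wise logistic function
hx = sigmoid(X * theta);
m = size(X, 1);                      % number of examples (rows of X)

% Gradient as written in the code copied from the other post
grad_post = ((hx - y)' * X / m)' ...
    + lambda .* theta .* [0; ones(length(theta)-1, 1)] ./ m;

% Gradient using the masking matrix L from the question
L = eye(length(theta));
L(1,1) = 0;                          % do not penalise theta(1), i.e. theta_0
grad_mask = (1/m) * X' * (hx - y) + (lambda/m) * (L * theta);

% Largest element-wise difference; expected to be zero up to round-off
max_diff = max(abs(grad_post - grad_mask));
disp(max_diff)
```

Note that the sketch uses `size(X, 1)` for `m`. `length(X)` returns the largest dimension of `X`, so it only equals the number of examples when `X` has at least as many rows as columns.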

Avi