8

I'm trying to write a function that calculates the mean squared error from y (true values) and y_pred (predicted values) without using sklearn or other implementations.

I tried the following:

def mserror(y, y_pred):
    i=0
    for i in range (len(y)):
        i+=1
        mse = ((y - y_pred) ** 2).mean(y)   
        return mse

Can you please point out what I'm doing wrong in the calculation and how it can be fixed?

Keithx
  • 1
    That `i+=1` looks wrong to me as you already have an iterator with `i`. – Divakar Aug 21 '16 at 13:24
  • 1
    You are also returning inside your loop, so you are only performing a single iteration and returning. You are always going to replace the value of `mse` in each iteration. Also, I don't know what you are trying to do with that `i`. You are initializing it to 0, and then incrementing it, but then you are using `i` as your iterator in your `for` loop. Revise your code carefully. – idjaw Aug 21 '16 at 13:26
  • This question is a duplicate of: https://stackoverflow.com/questions/17197492/is-there-a-library-function-for-root-mean-square-error-rmse-in-python – Eric Leschinski Dec 24 '19 at 14:44

4 Answers

21

You are modifying the index for no reason; a for loop increments it anyway. You are also not using the index (for example, there is no y[i] - y_pred[i]), so you don't need the loop at all.

Operate on the arrays directly (assuming NumPy is imported as np and both inputs are NumPy arrays):

mse = np.mean((y - y_pred)**2)
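
As a minimal sketch of how that one-liner could be wrapped into the function from the question: the name `mserror` is taken from the question, while the `np.asarray` conversion is an added assumption so that plain Python lists work too.

```python
import numpy as np

def mserror(y, y_pred):
    """Mean squared error between true and predicted values (sketch)."""
    # Assumption: inputs are equal-length sequences of numbers;
    # converting to float arrays lets plain lists be passed as well.
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Element-wise difference, squared, then averaged over all elements.
    return np.mean((y - y_pred) ** 2)

# Errors are 0, 0 and -2, so the squared errors are 0, 0 and 4.
result = mserror([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(result)
```

No loop and no index are needed because the subtraction and the squaring are applied to every element at once.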
nbro
percusse
2

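I would write:

def get_mse(y, y_pred):
    # y and y_pred must be NumPy arrays of the same length
    d1 = y - y_pred
    N = len(y)  # number of samples
    mse = d1.dot(d1) / N  # d1.dot(d1) is the sum of squared errors
    return mse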

This only works if y and y_pred are NumPy arrays, but if you have decided not to use other libraries you will want NumPy arrays anyway, so that you can do math operations on them.

NumPy's dot() function computes the dot product of two NumPy arrays (you can also write np.dot(d1, d1)).
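
To illustrate why the dot product works here, the following sketch checks that the dot product of the error vector with itself, divided by the number of samples, matches the mean of the squared errors. The sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
y = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

d1 = y - y_pred                # error vector: [0.5, -0.5, 0.0, -1.0]
via_dot = d1.dot(d1) / len(y)  # sum of squared errors divided by N
via_mean = np.mean(d1 ** 2)    # mean of the squared errors

print(via_dot, via_mean)
```

Both expressions compute the same quantity, so choosing between them is mostly a matter of readability.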

rotem
-2

First, you increment i yourself inside the loop, but range already advances it to the next number on each iteration, so don't modify i again. Second, you are passing y to mean(); instead, take the mean of ((y - y_pred) ** 2) without any argument. I hope that makes the point clear.

Ramzan Shahid
-2

Here's how to implement MSE in Python:

def mse_metric(actual, predicted):
    sum_error = 0.0
    # loop over all values
    for i in range(len(actual)):
        # accumulate the squared error (actual - predicted)^2
        prediction_error =  actual[i] - predicted[i]
        sum_error += (prediction_error ** 2)
    # now normalize
    mean_error = sum_error / float(len(actual))
    return mean_error
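
For comparison, the same loop can be written more compactly with `zip` and `sum`, still without NumPy. This is only a sketch: the name `mse_plain` and the length check are additions for illustration, not part of the answer above.

```python
def mse_plain(actual, predicted):
    """Mean squared error using only built-ins (hypothetical helper)."""
    # Assumption: both inputs are non-empty sequences of numbers.
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Pair up the values, square each difference, and sum them.
    squared_errors = ((a - p) ** 2 for a, p in zip(actual, predicted))
    return sum(squared_errors) / len(actual)

# Squared errors are 0.25, 0.25, 0.0 and 1.0, which average to 0.375.
print(mse_plain([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0]))
```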
m02ph3u5
Ramzan Shahid