As part of a project I'm working on, I need to calculate the mean squared error between 2m vectors. Basically, I have two matrices x and xhat, both of size m by n, and the vectors I'm interested in are the rows of these matrices.
I calculate the MSE with this code:

import numpy as np

def cost(x, xhat):  # mean squared error between x (the data) and xhat (the output of the machine)
    m = x.shape[0]  # number of row vectors
    return (1.0 / (2 * m)) * np.trace(np.dot(x - xhat, (x - xhat).T))
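For context, the trace term is just a roundabout way of summing the squared entries of x - xhat: the (i, i) entry of np.dot(x - xhat, (x - xhat).T) is the squared norm of row i, so the trace is the total sum of squares. A throwaway sanity check on small random matrices (not part of my project) confirms the equivalence:

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))
xhat = rng.standard_normal((5, 3))
d = x - xhat
# trace of the outer product equals the plain sum of squared entries
assert np.isclose(np.trace(np.dot(d, d.T)), np.sum(d ** 2))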
It works correctly; the formula itself is right.
The problem is that in my specific case, m and n are very large: specifically, m = 60000 and n = 785. So when I run my code and it enters this function, I get a memory error. That makes sense in hindsight: np.dot(x - xhat, (x - xhat).T) materializes a full m-by-m matrix, and at m = 60000 that is 60000 × 60000 float64 values, roughly 28.8 GB, even though np.trace only ever reads the diagonal.
Is there a better way to calculate the MSE? I'd rather avoid for loops, and I lean heavily towards matrix multiplication, but full matrix multiplication seems extremely wasteful here. Maybe there's something in numpy I'm not aware of?
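For what it's worth, here is a sketch of the direction I'm hinting at: since the trace equals the sum of squared entries, the cost can be computed elementwise without ever forming the m-by-m product (cost_lowmem is just a name I made up for illustration):

import numpy as np

def cost_lowmem(x, xhat):
    # same value as cost() above, but the largest temporary is the
    # m-by-n difference matrix, never an m-by-m product
    m = x.shape[0]
    d = x - xhat
    return np.sum(d ** 2) / (2.0 * m)
    # equivalently, without the squared temporary:
    # return np.einsum('ij,ij->', d, d) / (2.0 * m)

At m = 60000 and n = 785, that temporary is under 0.4 GB in float64, which should fit comfortably in memory. But I'm not sure this is the idiomatic way to do it.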