
I have two pandas.Series objects with an equal number of elements (predictions and target values), and I need to compute the (R)MSE of these two series.

I can use

targets.sub(predictions).pow(2).mean()

for the MSE, but I feel that there is a lot of copying [1] involved (first for the subtraction result, then for the exponentiation result). Is there an elegant way to avoid these two copies?


[1] Maybe memory allocation is a better term.

zegkljan
  • Check the answers to the related question: http://stackoverflow.com/questions/17197492/root-mean-square-error-in-python?rq=1 – EdChum Jan 05 '15 at 16:47
  • @EdChum I checked this question prior to posting this one. The reason I posted a new question is that I'm looking for a solution in the context of the pandas library. But thank you nevertheless. – zegkljan Jan 05 '15 at 16:50

1 Answer


If you are only concerned with the overall memory footprint because the Series are huge, the following might help, since it does not require temporary storage for intermediate results. However, it is much slower, because the loop runs element by element in Python instead of in vectorized code.

sum(((t - p) ** 2 for t, p in zip(targets, predictions)), 0.0) / len(targets)
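
To make the trade-off concrete, here is a minimal sketch rather than a definitive implementation. The function names `mse_streaming` and `mse_one_temporary` and the sample data are made up for illustration. The second function is not part of the expression above: it is a possible middle ground that assumes both Series hold numeric data, keeping vectorized speed while allocating only one temporary array (the difference), which is then squared in place. Note that both functions pair elements by position, whereas `targets.sub(predictions)` aligns on the index, so they agree only when the two indexes match.

```python
import math

import numpy as np
import pandas as pd


def mse_streaming(targets, predictions):
    """MSE via a generator: no intermediate Series, but a slow Python-level loop."""
    total = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    return total / len(targets)


def mse_one_temporary(targets, predictions):
    """MSE with a single temporary array (the difference), squared in place."""
    diff = targets.to_numpy(dtype=float) - predictions.to_numpy(dtype=float)
    np.square(diff, out=diff)  # reuse the buffer instead of allocating a second array
    return diff.mean()


# Illustrative data only.
targets = pd.Series([3.0, -0.5, 2.0, 7.0])
predictions = pd.Series([2.5, 0.0, 2.0, 8.0])

mse = mse_streaming(targets, predictions)
rmse = math.sqrt(mse)  # the RMSE is just the square root of the MSE
```

For Series that fit comfortably in memory, the original `targets.sub(predictions).pow(2).mean()` is likely the simplest choice; the variants above only matter when the temporaries themselves are a problem.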
Jichao
  • Is it necessary to pass the start value, as in `sum(..., 0.0)`? I only use `sum((t-p) ** 2 for t,p in zip(targets, predictions))/len(targets)`. – Chien Nguyen May 27 '23 at 18:20