How to find a percent value that represents how much two arrays differ?

Question

I have two arrays.

I want a percent value that describes how different their values are. I tried using MSE and RMSE:

/**
* Mean Squared Error
*    MSE = (1/n) * Σ[(r - p)^2]
*/
export function computeMse(a, b) {
  const size = a.length
  let error = 0
  for (let i = 0; i < size; i++) {
    error += Math.pow(b[i] - a[i], 2)
  }
  return (1 / size) * error
}

/**
* Root Mean Squared Error
*    RMSE = √MSE
*/
export function computeRmse(a, b) {
  return Math.sqrt(computeMse(a, b))
}

and:

const a = [2354493, 2615706, 1594281, 1570894, 1930709, 2086681]
const b = [2354493, 2224360.55, 1906806.9, 1408769.93, 1609053.96, 2200698.72]

const mse = computeMse(a, b)
const rmse = computeRmse(a, b)

The result is:

mse:  65594986451.87959
rmse:  256115.18200192583

I don't think this result is correct. First, mse and rmse are not in the range [0, 100]; second, the values are very large even though the two arrays are not that different.

What am I doing wrong?

I also tried MAPE:

export function computeMape(actual, forecast) {
  const size = actual.length
  let error = 0
  for (let i = 0; i < size; i++) {
    error += Math.abs((actual[i] - forecast[i]) / actual[i])
  }
  return (100 * error) / size
}

with:

const a = [77, 50, 38, 30, 26, 18] 
const b = [77, 81.13, 92.77, 101.98, 119.76, 121.26]

And I get mape: 230.10116059379217...

Another example:

const a = [1.15, 1.09, 1.08, 0.78, 0.51, 0.44]
const b = [1.15, 1.61, 1.88, 2.13, 2.3, 2.47]

const mape = computeMape(a, b) // result: 184.53357461497413

Suppose you have these three datasets:

The red line represents the real data, the dotted green line represents the forecast data created by the user (test 1), and the dotted gray line represents the forecast data created by the user (test 2). The user can try several times to hit the real data (it's like a game).

Now I would like to give the user feedback telling them, as a percentage, how wrong their guess of the data trend was.

The user can make numerous predictions. I would like a percentage that tells me how wrong the user was, so that I can compare the attempts.

Is something like that possible?

Also, in this case I get NaN as the result:

const a = [132.6, 114.1, 134.5, 124.5, 144.4, 162.4]
const b = [132.6, 134.15, 134.15, 134.15, 139.19]

Why?
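
A likely explanation for the NaN: `b` has only five elements while `a` has six, so `b[5]` is `undefined`, and any arithmetic involving `undefined` produces `NaN`, which then propagates through the sum. Below is a minimal sketch of a guarded version, assuming you would rather fail fast than get `NaN` back. The name `computeMapeChecked` is hypothetical and not part of the original code.

```javascript
// Hypothetical guarded variant of computeMape: rejects inputs that would
// otherwise silently produce NaN or Infinity.
function computeMapeChecked(actual, forecast) {
  if (actual.length !== forecast.length) {
    throw new RangeError(
      `length mismatch: actual has ${actual.length}, forecast has ${forecast.length}`
    )
  }
  let error = 0
  for (let i = 0; i < actual.length; i++) {
    if (actual[i] === 0) {
      // MAPE divides by the actual value, so it is undefined when that is 0
      throw new RangeError(`actual[${i}] is 0; MAPE is undefined`)
    }
    error += Math.abs((actual[i] - forecast[i]) / actual[i])
  }
  return (100 * error) / actual.length
}

const a = [132.6, 114.1, 134.5, 124.5, 144.4, 162.4]
const b = [132.6, 134.15, 134.15, 134.15, 139.19] // only 5 values

console.log(162.4 - b[5]) // NaN, because b[5] is undefined
```

With these inputs, `computeMapeChecked(a, b)` throws a `RangeError` instead of returning `NaN`.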

Take a look here: http://www.spiderfinancial.com/support/documentation/numxl/reference-manual/forecasting-performance/mape – SimpleOne, May 10 '19 at 14:11
Something went wrong with my last comment, so repeating: take a look [here](http://www.spiderfinancial.com/support/documentation/numxl/reference-manual/forecasting-performance/mape). MAPE can be more than 100 if the values are significantly different. You can use SMAPE too, which may be closer to what you are looking for; see [here](https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error) – SimpleOne, May 10 '19 at 14:17
Also, this is not really a coding issue, as your code is fine. This is more of a statistics question, and you may find it better to ask on https://stats.stackexchange.com which statistic is most appropriate. – SimpleOne, May 10 '19 at 15:44
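
As a rough illustration of the SMAPE suggestion: the sketch below uses the SMAPE variant whose denominator is |a| + |f| (without the division by 2), which keeps the result between 0 and 100. The function name `computeSmape` and the choice of this particular variant are assumptions for illustration. The other common definition divides the denominator by 2 and ranges up to 200.

```javascript
// Sketch of a SMAPE variant bounded to [0, 100]:
//   SMAPE = (100 / n) * Σ |f - a| / (|a| + |f|)
function computeSmape(actual, forecast) {
  if (actual.length !== forecast.length) {
    throw new RangeError('actual and forecast must have the same length')
  }
  const size = actual.length
  let error = 0
  for (let i = 0; i < size; i++) {
    const denom = Math.abs(actual[i]) + Math.abs(forecast[i])
    // If both values are 0 the point is a perfect match; skip it to avoid 0/0
    if (denom !== 0) {
      error += Math.abs(forecast[i] - actual[i]) / denom
    }
  }
  return (100 * error) / size
}

const actual = [77, 50, 38, 30, 26, 18]
const forecast = [77, 81.13, 92.77, 101.98, 119.76, 121.26]
console.log('smape: ' + computeSmape(actual, forecast).toFixed(1) + '%')
```

Each term is at most 1, so the average cannot exceed 100%. It reaches 100% only when every point has one value at zero and the other non-zero, or the two values have opposite signs.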

score 2 · Answer 1 · answered May 08 '19 at 14:08


I guess the measure you're looking for is actually MPE and not MSE.

function mpe(a, f) {
    let size = a.length, sum = 0;
    for (let i = 0; i < size; i++) {
        sum += (a[i] - f[i]) / a[i];
    }
    return 100 * sum / size;
}


// small demo: start from a perfect forecast, then grow one actual value per step

const forecast = [10, 20, 30]
const actual   = [10, 20, 30]

for (let i = 0; i < 20; i++) {
    console.log(actual.join() + ' mpe ' + mpe(actual, forecast).toFixed(1) + '%')
    actual[i % 3] += 10;
}


georg


@beth: add `Math.abs` then: https://en.wikipedia.org/wiki/Mean_absolute_percentage_error. Also follow the links therein to find out exactly which measure you're after. – georg May 08 '19 at 14:20
I edited the code in my question, could you help me? I feel a bit stupid – May 08 '19 at 14:48
@beth: your code appears to be correct. MAPE can be greater than 100% if the forecast is more than twice the actual value – georg May 08 '19 at 15:08
Mmm ok. How can I transform it to a percent value? – May 08 '19 at 15:26
@beth: assume you transform it and it returns say 87%. How would you interpret that? 87% of what? – georg May 08 '19 at 18:06
Are you telling me that it makes no sense to use a percentage value to represent the error? I need an easy way to explain to the user how wrong their response is compared to the correct one. Absolute numbers are not intuitive – May 08 '19 at 19:53
@beth: I understand what you're trying to do, and MAPE and other error measures provide exactly that. However, if you want the response to be scaled to some range, like no more than 100%, that scale needs to be defined somehow. For example, limit the maximum possible error to 1,000 points and take a percentage of that. – georg May 09 '19 at 07:24
Thanks Georg for your patience. Can you explain better what you mean? Because I don't understand – May 09 '19 at 07:29
@beth: you wrote "I would like a percentage number to tell me... etc". But a percentage is always related to some base value, it's not just "87%", it's "87% of something". What I'm trying to say is that you have to define what your base value would be. – georg May 09 '19 at 07:33
Ok. What I have are the ranges X `[x_min, x_max]` and Y `[y_min, y_max]`, and the result should be in the range `[0, 100]`. What now? Sorry, but I really don't understand. – May 09 '19 at 07:42

vrintle · Accepted Answer · 2019-05-15T03:11:39.680

Well, it depends on what 100% means. If 100% (in your case) is the maximum possible deviation from the actual data, then you have to define some limits on your values.

For example, try:

function computeError(obj) {
  let size = obj.actual.length;
  let maxErr = obj.limits[1] - obj.limits[0];
  let error = 0;
  let i;
  
  for (i = 0; i < size; i++) {
    error += Math.abs((obj.actual[i] - obj.forecast[i]) / maxErr);
  }
  
  console.log( ((100 * error) / size).toFixed(3), '%' );
}

const testCases = [
  {
    actual: [2354493, 2615706, 1594281, 1570894, 1930709, 2086681],
    forecast: [2354493, 2224360.55, 1906806.9, 1408769.93, 1609053.96, 2200698.72],
    limits: [0, 5e6] // [0, 5000000]
  },
  {
    actual: [77, 50, 38, 30, 26, 18],
    forecast: [77, 81.13, 92.77, 101.98, 119.76, 121.26],
    limits: [0, 2e2] // [0, 200]
  },
  {
    actual: [1.15, 1.09, 1.08, 0.78, 0.51, 0.44],
    forecast: [1.15, 1.61, 1.88, 2.13, 2.3, 2.47],
    limits: [0, 3e0] // [0, 3]
  },
  
  // extra cases
  {
    actual: [0, 0, 0],
    forecast: [1, 1, 1],
    limits: [0, 1e0]
  },
  {
    actual: [1, 1, 1],
    forecast: [1, 1, 1],
    limits: [0, 1e9]
  }
]; 

testCases.forEach(computeError); // calls computeError function on each object
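
If a value range such as `[y_min, y_max]` is already known, the same idea can be written so that it returns the number rather than logging it. The following is only a sketch under that assumption. The name `computeRangeError` and its parameters are hypothetical, and clamping out-of-range values is a design choice added here so that the result cannot exceed 100.

```javascript
// Hypothetical variant of computeError: returns the percentage instead of
// logging it, using a known value range [yMin, yMax] as the base for 100%.
function computeRangeError(actual, forecast, yMin, yMax) {
  const range = yMax - yMin
  if (!(range > 0)) {
    throw new RangeError('yMax must be greater than yMin')
  }
  if (actual.length !== forecast.length) {
    throw new RangeError('actual and forecast must have the same length')
  }
  // Clamp into the range so one point never contributes more than 100%
  const clamp = (v) => Math.min(yMax, Math.max(yMin, v))
  let error = 0
  for (let i = 0; i < actual.length; i++) {
    error += Math.abs(clamp(actual[i]) - clamp(forecast[i])) / range
  }
  return (100 * error) / actual.length
}

const pct = computeRangeError(
  [77, 50, 38, 30, 26, 18],
  [77, 81.13, 92.77, 101.98, 119.76, 121.26],
  0,
  200
)
console.log(pct.toFixed(3) + ' %')
```

Because every attempt is measured against the same fixed range, the returned values can be compared directly between attempts.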

score 0 · Answer 3 · answered May 13 '19 at 18:16

A dumb but practical approach is to compute the error using any distance you like between your sample and your prediction (L1 norm, L2 norm, whatever),

then map the resulting value to the interval [0, 1] with a function of your choice (one satisfying f: [0, ∞) -> [0, 1]),

e.g.:

f(err) = e^(-err^2/a^2), with a scale a of your choice. Note that this particular f gives 1 for zero error and tends to 0 as the error grows, so use 1 - f if you want a value that increases with the error.

in your code that could be like

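var err = computeMse(a, b)
function toPercent(err) {
    const a = 1; // scale: pick it on the order of the errors you expect
    // parentheses matter: -err*err/a*a would evaluate as (-err*err/a)*a
    return Math.exp(-(err * err) / (a * a));
}
var percent = toPercent(err)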
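
One practical catch with this mapping is the scale: for data in the millions, an MSE around 6.5e10 combined with a scale of 1 makes the exponential underflow to 0. The sketch below feeds in the RMSE (which has the same units as the data) and uses the mean of the actual values as the scale. Both choices, and the names `toScore` and `mean`, are assumptions for illustration rather than part of the answer above.

```javascript
function computeRmse(a, b) {
  let sum = 0
  for (let i = 0; i < a.length; i++) {
    sum += (b[i] - a[i]) ** 2
  }
  return Math.sqrt(sum / a.length)
}

// Maps an error in [0, ∞) to a score in [0, 100]:
// 100 for no error, approaching 0 once the error is much larger than scale.
function toScore(err, scale) {
  return 100 * Math.exp(-(err * err) / (scale * scale))
}

const mean = (xs) => xs.reduce((s, x) => s + x, 0) / xs.length

const actual = [2354493, 2615706, 1594281, 1570894, 1930709, 2086681]
const forecast = [2354493, 2224360.55, 1906806.9, 1408769.93, 1609053.96, 2200698.72]

// Assumption: the mean of the actual data is a reasonable scale for "100% off"
const scale = mean(actual)
const score = toScore(computeRmse(actual, forecast), scale)
console.log('score: ' + score.toFixed(1) + ' / 100')
```

Using `100 - score` instead gives a value that grows with the error, which may be closer to the "how wrong was I" reading the question asks for.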
