0

I have a view with the following informations:

date, n1, n2, n3, n4, day, month, year, some_id, unix_timesamp, mean, std
'29/5/2007', '28', '5', '0', '0.1786', '29', '5', '2007', '5', '1180411981', null, null
...
...

And I'd like to compute the mean and standard deviation of n4 for the past y days. The results of the view are already grouped by days.

If I were using a standard programming language, my algorithm would look like this:

//For each line
for(int i = 0; i < rows.length; i++){
  //We have advanced enough days to compute average
  //and std
  if(i > y){
     //Stores the values of the past y days
     float[] values = new float[y];
     int z = 0;
     //Loop over the past y days
     for(int j = i - y; j < i; j++){
        values[z] = rows[j].n4;
        z++;
     }
     //compute mean and std
     rows[i].mean = mean(values);
     rows[i].std = std(values);
   }
}

Now, I can't find how I could do this while staying within mysql (yes, I really have to). I started playing around with cursor and temp table as described here but I don't know how to access a given row which is needed for the inner loop. I also tried using a subquery for the inner loop but the view contains ~100k lines and it was taking ~0.3 sec per line in order to requery the view (using a function to select only the y days and compute the mean and std for a given line).

The content provided by the view is constantly updated so the mean and std have to be computed at users' request. I can't compute it once and store it in a dedicated table.

Any advice?

Mathieu Nls
  • 2,285
  • 2
  • 17
  • 32

2 Answers2

0

Why would you use a cursor for this? It is more easily expressed using a set-based query:

select avg(n4), stddev(n4)
from t
where date >= curdate() - interval by day;

For optimal performance you would want an index on (date, n4).

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

The behavior you're describing requires window functions, which are being added in MySQL 8.0 and MariaDB 10.2. As of January 2018, neither of these versions are available as stable releases yet.

Once these features are available in a stable version of the database, I believe this syntax will work. (I'm guessing based on the documentation, as I don't have a MySQL 8.0 alpha installed to test against.)

SELECT
    date,
    AVG(n4) OVER w,
    STDDEV(n4) OVER w
FROM table_name
WINDOW w AS (
    ORDER BY date
    ROWS BETWEEN y PRECEDING AND CURRENT ROW
)