Limit groupings to the first x rows for each result

Question

I have a time-series table like this:

Key | Value | Epoch

It could store many values for each Key, all marked with when then item was generated (epoch time). Let's say I have 100 values per key, I want to average the value of the last 50, but only if has a statistically significant amount of data, let's say 20. Currently, I have:

select key, avg(value)
from t
where
epoch > (select epoch from t order by epoch desc limit 50, 1)
group by key
having count(*) > 20

This performs pretty slowly and it uses a MySQL extension on the limit keyword. Is there a better way to do this?

What is your version of MySql? – forpas Aug 07 '20 at 18:59 — forpas, Aug 07 '20 at 18:59

score 0 · Answer 1 · answered Aug 07 '20 at 19:03

Use window functions:

select t.key, avg(t.value)
from (select t.*,
             row_number() over (partition by key order by epoch desc) as seqnum,
             count(*) over (partition by key) as cnt
      from t
     ) t
where seqnum <= 50 and cnt >= 20
group by t.key;

Both this and your version should be able to take advantage of an index on (key, epoch desc).

score 0 · Answer 2 · answered Aug 07 '20 at 19:03

Use COUNT() and ROW_NUMBER() window functions:

select t.key, avg(t.value) avg_value
from (
 select *,
    count(*) over (partition by key) counter,
    row_number() over (partition by key order by epoch desc) rn
  from tablename
) t
where t.counter > 20 and t.rn <= 50
group by t.key

Limit groupings to the first x rows for each result

2 Answers2