-1

I have a time-series table like this:

Key | Value | Epoch

It could store many values for each Key, all marked with when then item was generated (epoch time). Let's say I have 100 values per key, I want to average the value of the last 50, but only if has a statistically significant amount of data, let's say 20. Currently, I have:

select key, avg(value)
from t
where
epoch > (select epoch from t order by epoch desc limit 50, 1)
group by key
having count(*) > 20

This performs pretty slowly and it uses a MySQL extension on the limit keyword. Is there a better way to do this?

Jeff Wang
  • 1,837
  • 1
  • 15
  • 29

2 Answers2

0

Use window functions:

select t.key, avg(t.value)
from (select t.*,
             row_number() over (partition by key order by epoch desc) as seqnum,
             count(*) over (partition by key) as cnt
      from t
     ) t
where seqnum <= 50 and cnt >= 20
group by t.key;

Both this and your version should be able to take advantage of an index on (key, epoch desc).

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Use COUNT() and ROW_NUMBER() window functions:

select t.key, avg(t.value) avg_value
from (
 select *,
    count(*) over (partition by key) counter,
    row_number() over (partition by key order by epoch desc) rn
  from tablename
) t
where t.counter > 20 and t.rn <= 50
group by t.key
forpas
  • 160,666
  • 10
  • 38
  • 76