1

How can I replace avg with a median calculation in this?

select *
, coalesce(val, avg(val) over (order by t rows between 3 preceding and 1 preceding)) as fixed
from (
    values
    (1, 10),
    (2, NULL),
    (3, 10),
    (4, 15),
    (5, 11),
    (6, NULL),
    (7, NULL),
    (8, NULL),
    (9, NULL)
) as test(t, val)
;

Is there a legal version of this?

percentile_cont(0.5) within group(order by val) over (order by t rows between 3 preceding and 1 preceding)
GMB
  • 216,147
  • 25
  • 84
  • 135
  • This is the closest answer I can find but it's not the same: https://stackoverflow.com/questions/39683330/percentile-calculation-with-a-window-function – Michael Grazebrook Feb 13 '20 at 19:19

1 Answers1

0

Unfortunately percentile_cont() is an aggregate function, for which there is no equivalent window function.

One workaround is to use an inline subquery to do the aggregate computation.

If ids are always increasing, then you can do:

select 
    t.*,
    coalesce(
        t.val, 
        (
            select percentile_cont(0.5) within group(order by t1.val)
            from test t1
            where t1.id between t.id - 3 and t.id - 1
        )
    ) fixed
from test t

Otherwise, you need an additional level of nesting:

select 
    t.*,
    coalesce(
        t.val, 
        (
            select percentile_cont(0.5) within group(order by t1.val)
            from (select val from test t1 where t1.id < t.id order by t1.id desc limit 3) t1
        )
    ) fixed
from test t

Demo on DB Fiddle - both queries yield:

id |  val | fixed
-: | ---: | :----
 1 |   10 | 10   
 2 | null | 10   
 3 |   10 | 10   
 4 |   15 | 15   
 5 |   11 | 11   
 6 | null | 11   
 7 | null | 13   
 8 | null | 11   
 9 | null | null 
GMB
  • 216,147
  • 25
  • 84
  • 135
  • That looks good. Do you expect there's an efficiency difference between a subquery and a home-baked windowing function? The sub-query is clearly best if missing data is rare. – Michael Grazebrook Feb 13 '20 at 21:14