4

enter image description here

On the left panel data without IGNORE NULLS.
On the right panel data with IGNORE NULLS.

So I need to get right variant in PostgreSQL

Need to emulate Oracle IGNORE NULLS in window functions (LEAD and LAG) in PostgreSQL.

SELECT empno,
   ename,
   orig_salary,
   LAG(orig_salary, 1, 0) IGNORE NULLS OVER (ORDER BY orig_salary) AS sal_prev
FROM   tbl_lead;

If there are NULL, it should return the latest not null value.

I've tried it via PostgreSQL user defined aggregate functions, but it's rather hard to understand methodology of it https://www.postgresql.org/docs/9.6/static/sql-createaggregate.html

The solution can't be realized via WITH clause or sub-query since it it used in complex query.

2 Answers2

3

I have updated @klin 's answer. Below functions allows to pass anyelement, has offset and default parameters.

LAG ( expression [, offset [, default] ] )

create or replace function swf_lag_trans(anyarray, anyelement, integer, 
anyelement)
returns anyarray language plpgsql as $$
begin
if $1 is null then
    $1:= array_fill($4, array[$3+1]);
end if;
if $1[$3+1] is not null then 
for i in 1..$3 loop
        $1[i]:= $1[i+1];
        i := i+1;
    end loop;
    $1[$3+1]:= $2;
end if;
return $1;
end $$;
create or replace function swf_lag_final(anyarray)
returns anyelement language sql as $$
select $1[1];
$$;
create aggregate swf_lag(anyelement, integer, anyelement) (
sfunc = swf_lag_trans,
stype = anyarray,
finalfunc = swf_lag_final
);

And usage:

with my_table(name, salary) as (
values
    ('A', 100),
    ('B', 200),
    ('C', 300),
    ('D', null),
    ('E', null),
    ('F', null)
)

select 
    name, salary, 
    lag(salary, 2, 123) over (order by salary) prev_salary,
    swf_lag(salary, 2, 123)  over (order by salary) my_prev_salary
from my_table;

enter image description here

It works for me. Please, correct, if required.

2

The aggregate is a bit complicated, because you have to store two previous values. It can be done using an array as a state-data and a final function:

create or replace function my_lag_trans_fun(numeric[], numeric)
returns numeric[] language plpgsql as $$
begin
    if $1[2] is not null then 
        $1[1]:= $1[2];
        $1[2]:= $2;
    end if;
    return $1;
end $$;

create or replace function my_lag_final_fun(numeric[])
returns numeric language sql as $$
    select $1[1];
$$;

create aggregate my_lag(numeric) (
    sfunc = my_lag_trans_fun,
    stype = numeric[],
    initcond = '{0,0}',
    finalfunc = my_lag_final_fun
);

Usage:

with my_table(name, salary) as (
values
    ('A', 100),
    ('B', 200),
    ('C', 300),
    ('D', null),
    ('E', null),
    ('F', null)
)

select 
    name, salary, 
    lag(salary, 1, 0) over (order by salary) prev_salary,
    my_lag(salary) over (order by salary) my_prev_salary
from my_table;

 name | salary | prev_salary | my_prev_salary 
------+--------+-------------+----------------
 A    |    100 |           0 |              0
 B    |    200 |         100 |            100
 C    |    300 |         200 |            200
 D    |        |         300 |            300
 E    |        |             |            300
 F    |        |             |            300
(6 rows)
klin
  • 112,967
  • 15
  • 204
  • 232
  • You don't need a global variable, because the previous state is passed into the transition function. The manual illustrates this in pseudo code as `sfunc( internal-state, next-data-values ) ---> next-internal-state` so if you were naming your variables, you might declare your function as `my_lag_trans_fun(previous_state numeric, next_input numeric)`. You also don't need to check for `NULL` explicitly, you can just declare your function with the `STRICT` keyword, and Postgres will skip calling it for null inputs. – IMSoP Dec 11 '17 at 09:50
  • @IMSoP - `internal-state` is insufficient in this case, you must have two previous ones. – klin Dec 11 '17 at 10:01
  • Ah, I see. I think the appropriate trick in that case is for the state type to not be the same as the final type. So you could define an aggregate type with `CREATE TYPE lag_state_type(a numeric, b numeric)`, then declare `my_lag_trans_fun(previous_state lag_state_type, next_input numeric) returns lag_state_type` and `my_lag_finalise_fun(lag_state_type) returns numeric`. Or you could use `numeric[]` as the state type, and pass around the lag pairs as arrays. – IMSoP Dec 11 '17 at 10:15
  • No, it's completely transparent, because `stype` is the type of the *internal state* of the aggregate, not its return type. That's what the "finalise function" is for: it takes the last state, and converts it to the final result. The manual outlines it as `ffunc( internal-state ) ---> aggregate-value`. It's a complex system to get your head around, but it's very powerful. – IMSoP Dec 11 '17 at 10:24
  • @IMSoP - you're right. I thought the solution with GUC will be simpler, but it has an important drawback. Thanks for your stimulatory notes! – klin Dec 11 '17 at 10:47
  • @klin - Oh, great! It really works for me. I've also tried to implement in the same way LEAD function, but I do not know how I can view the next row for current one. I suppose that there should be some minvfunc function, but I have not found any examples or detailed documentation about it. And one more question: How I can pass variable start value for LEAD and LAG functions? I mean how set parameters values to initcond parameter of AGGREGATE function? Need to emulate default value for these functions – Walentyna Juszkiewicz Dec 12 '17 at 09:59
  • 1
    I'm afraid that `lead()` cannot be simulated by a custom aggregate and there is no sql/plpgsql API for creating a custom window function. See also [Creating custom window function.](https://stackoverflow.com/questions/19580448/creating-custom-window-function) Adding a default value to a custom aggregate is relatively easy, see [dbfiddle.](http://dbfiddle.uk/?rdbms=postgres_10&fiddle=1037adc7540354a2461eb6d7321e251c) – klin Dec 12 '17 at 11:10
  • @klin thank you for your answers. They ate really helpful for me. I've updated your solutions for using offset and default for LAG. And also now it can takes anyelement. See described below solution – Walentyna Juszkiewicz Dec 12 '17 at 14:36