0

I have a table name p_Details with (id,eff_dt,amt,change,month). For each ID and change combination there are 12 records for 12 months. Data looks like below.

id eff_dt amt change month
100 1-Jan 300 1 1
100 1-Jan 300 1 2
100 1-Jan 300 1 3
100 1-Jan 300 1 4
.. .. .. .. ..
100 1-Mar Null 2 1
100 1-Mar Null 2 2
100 1-Mar 400 2 3
100 1-Mar 400 2 4
.. .. .. .. ..
100 1-Apr Null 3 1
100 1-Apr Null 3 2
100 1-Apr Null 3 3
100 1-Apr 500 3 4

If any combination of Id and change is having Null value for a particular month, then amt from same id and same month but from previous change number will be copy over to the present row. Final table should look like below after update.

id eff_dt amt change month
100 1-Jan 300 1 1
100 1-Jan 300 1 2
100 1-Jan 300 1 3
100 1-Jan 300 1 4
.. .. .. .. ..
100 1-Mar 300 2 1
100 1-Mar 300 2 2
100 1-Mar 400 2 3
100 1-Mar 400 2 4
.. .. .. .. ..
100 1-Apr 300 3 1
100 1-Apr 300 3 2
100 1-Apr 400 3 3
100 1-Apr 500 3 4

Same Id and month combination can have N number of change numbers. I are trying to use lag() function but it is not working.

Below is my query which is not giving desired result.

select
  case when amt is null then amt_new else amt end as amt,
  id,
  eff_dt,
  change,
  month
from (
  select
    id,
    eff_dt,
    change,
    month,
    amt,
    LAG(amt) OVER(partition id,month order by change) as amt_new
  from p_Details
) e
Bergi
  • 630,263
  • 148
  • 957
  • 1,375
  • "*I are trying to use lag() function but it is not working.*" - please [edit] your question to show us the query you've tried – Bergi Apr 07 '23 at 18:01
  • @Bergi I have added the query – SK ASIF ALI Apr 08 '23 at 05:28
  • What exactly is not working with that query? – Bergi Apr 08 '23 at 14:55
  • To get the *last non-null* value in the window, you might want to use `LAG (amt) FILTER (WHERE amt IS NOT NULL) OVER (PARTITION BY id, month ORDER BY change)` - see https://stackoverflow.com/questions/41796929/conditional-lead-lag-function-postgresql – Bergi Apr 08 '23 at 14:59
  • @Bergi, in PostgreSQL (at least through 15.2), `FILTER` is not implemented for non-aggregate window functions such as `lead` and `lag`. – JohnH Apr 08 '23 at 17:39
  • @JohnH Ah, I misunderstood Erwin's answer to the question I linked, as if it was implemented in Postgres 13. But your answer here has a nice workaround :-) – Bergi Apr 08 '23 at 19:15

1 Answers1

2

The intuitive approach to addressing the specified requirement is to use LAG to get the most recent non-NULL value. Unfortunately, that approach depends on features that aren't supported in PostgreSQL. One might next consider using FIRST_VALUE or LAST_VALUE, but the same limitations apply. Subqueries are an option, but the resulting code can seem complex and might perform poorly. Let's try a different approach.

FILTER is not implemented for non-aggregate window functions such as LEAD and LAG; however, it is available for aggregate window functions such as ARRAY_AGG. For data sets where the windowed partitions are large, this approach might be impractical. For the use case the OP presented, the partitions are sufficiently small for the following query to work efficiently:

WITH p_details (
  id,
  eff_dt,
  amt,
  change,
  month
) AS (
  VALUES (100, '1-Jan', 300, 1, 1),
    (100, '1-Jan', 300, 1, 2),
    (100, '1-Jan', 300, 1, 3),
    (100, '1-Jan', 300, 1, 4),
    (100, '1-Mar', NULL, 2, 1),
    (100, '1-Mar', NULL, 2, 2),
    (100, '1-Mar', 400, 2, 3),
    (100, '1-Mar', 400, 2, 4),
    (100, '1-Apr', NULL, 3, 1),
    (100, '1-Apr', NULL, 3, 2),
    (100, '1-Apr', NULL, 3, 3),
    (100, '1-Apr', 500, 3, 4))
SELECT
  id,
  eff_dt,
  COALESCE(amt, (ARRAY_AGG(amt) FILTER (WHERE amt IS NOT NULL)
                                OVER (PARTITION BY id, month ORDER BY change DESC
                                      ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING))[1]) AS amt,
  change,
  month
FROM
  p_details
ORDER BY
  id,
  change,
  month;

This query finds the substitution value by taking the first element of an array consisting of prior non-NULL values sorted in descending change order. Because the values are sorted in descending change order, the concepts of past and present are inverted; therefore, the window definition references FOLLOWING instead of PRECEDING to access prior changes.

JohnH
  • 2,001
  • 1
  • 2
  • 13
  • Thanks a lot @JohnH, it really works for me. If you could explain the query in more details it will be really helpful. – SK ASIF ALI Apr 10 '23 at 09:26
  • Play around with the query to investigate how it works. A good place to start would be to add `(ARRAY_AGG(amt) FILTER (WHERE amt IS NOT NULL) OVER (PARTITION BY id, month ORDER BY change DESC ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)) AS past_changes` to the select list and then vary the filter, ordering, and window options to see their effects. That kind of experimentation will teach you much more than additional exposition on the workings for this particular case. – JohnH Apr 10 '23 at 21:55