I'm trying to calculate the number of null values between dates.

My table looks like this:

transaction_date    transaction_sale
10/1/2018           NULL
11/1/2018           33
12/1/2018           NULL
1/1/2019            NULL
2/1/2019            NULL
3/1/2019            2
4/1/2019            NULL
5/1/2019            NULL
6/1/2019            10

I'm looking to get the following output:

transaction_date    transaction_sale   count
10/1/2018           NULL               NULL
11/1/2018           33                 1
12/1/2018           NULL               NULL
1/1/2019            NULL               NULL
2/1/2019            NULL               NULL
3/1/2019            2                  3
4/1/2019            NULL               NULL
5/1/2019            NULL               NULL
6/1/2019            10                 2
Erwin Brandstetter
smileyface

4 Answers

This doesn't make any assumptions about consecutive dates, etc.

with data as (
    select transaction_date, transaction_sale,
        count(transaction_sale)
            over (order by transaction_date desc) as grp
    from T /* replace with your table */
)
select transaction_date, transaction_sale,
    case when transaction_sale is null then null else
        count(case when transaction_sale is null then 1 end)
            over (partition by grp) end as "count"
from data
order by transaction_date;

See a demo here: https://rextester.com/GVR65084. Although the demo runs on SQL Server, it should work identically on your platform.

Also see for PostgreSQL: http://sqlfiddle.com/#!15/07c85f/1

shawnt00

count(expression) does not count NULL values, be it as an aggregate function or as a window function. The manual says:

number of input rows for which the value of expression is not null

This is the key element for a simple and fast query.
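The difference between count(*) and count(expression) is easy to check on a few rows. The following is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for PostgreSQL (an assumption made only to keep the example self-contained; both engines skip NULLs in count(expression)). The table name tbl and the ISO-formatted dates are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (transaction_date TEXT, transaction_sale INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", [
    ("2018-10-01", None),
    ("2018-11-01", 33),
    ("2018-12-01", None),
])

# count(*) counts rows; count(transaction_sale) counts only non-NULL values.
total_rows, non_null_sales = conn.execute(
    "SELECT count(*), count(transaction_sale) FROM tbl"
).fetchone()
print(total_rows, non_null_sales)  # 3 1
```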

This assumes transaction_date is UNIQUE, as your example suggests. Otherwise you'll have to define how to break ties between duplicate values. (An actual table definition would clarify.)

SELECT transaction_date, transaction_sale
     , CASE WHEN transaction_sale IS NOT NULL
            THEN count(*) OVER (PARTITION BY grp) - 1
       END AS count 
FROM  (
   SELECT *
        , count(transaction_sale) OVER (ORDER BY transaction_date DESC) AS grp
   FROM   tbl
   ) sub
ORDER  BY transaction_date;

Form groups in the subquery. Since every nonnull value starts a new group according to your definition, just count actual values in descending order in a window function to effectively assign a group number to every row. The rest is trivial.

In the outer SELECT, count the rows per group and display the count only where transaction_sale IS NOT NULL. Subtract 1 to fix the off-by-one, since each group includes its own non-null row. Voilà.
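To see how the group numbers come out, the two steps can be run on the sample data from the question. This is a sketch, not a PostgreSQL session: it uses Python's sqlite3 module (assuming SQLite 3.25 or later, which added window functions) and stores the dates as ISO strings so that they sort chronologically.

```python
import sqlite3

# Sample rows from the question, dates rewritten as ISO strings.
rows = [
    ("2018-10-01", None), ("2018-11-01", 33), ("2018-12-01", None),
    ("2019-01-01", None), ("2019-02-01", None), ("2019-03-01", 2),
    ("2019-04-01", None), ("2019-05-01", None), ("2019-06-01", 10),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (transaction_date TEXT, transaction_sale INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

# Step 1: running count of non-null sales, newest row first.
# Every non-null sale bumps the counter, so it starts a new group.
grps = [r[0] for r in conn.execute("""
    SELECT count(transaction_sale) OVER (ORDER BY transaction_date DESC) AS grp
    FROM   tbl
    ORDER  BY transaction_date
""")]
print(grps)  # [3, 3, 2, 2, 2, 2, 1, 1, 1]

# Step 2: rows per group minus the non-null row itself.
counts = [r[0] for r in conn.execute("""
    SELECT CASE WHEN transaction_sale IS NOT NULL
                THEN count(*) OVER (PARTITION BY grp) - 1
           END
    FROM  (
       SELECT *, count(transaction_sale) OVER (ORDER BY transaction_date DESC) AS grp
       FROM   tbl
       ) sub
    ORDER  BY transaction_date
""")]
print(counts)  # [None, 1, None, None, None, 3, None, None, 2]
```

The second list matches the count column requested in the question.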

Alternatively, count with FILTER (WHERE transaction_sale IS NULL) - useful for related cases where we cannot simply subtract 1:

SELECT transaction_date, transaction_sale
     , CASE WHEN transaction_sale IS NOT NULL
            THEN count(*) FILTER (WHERE transaction_sale IS NULL)
                          OVER (PARTITION BY grp)
       END AS count 
FROM  (
   SELECT *
        , count(transaction_sale) OVER (ORDER BY transaction_date DESC) AS grp
   FROM   tbl
   ) sub
ORDER  BY transaction_date;
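On the sample data both variants give the same numbers, which can be checked side by side. As a sketch under assumptions: this runs in Python's sqlite3 module (SQLite 3.25 or later, which accepts FILTER on aggregate window functions) rather than PostgreSQL, with the dates stored as ISO strings.

```python
import sqlite3

rows = [
    ("2018-10-01", None), ("2018-11-01", 33), ("2018-12-01", None),
    ("2019-01-01", None), ("2019-02-01", None), ("2019-03-01", 2),
    ("2019-04-01", None), ("2019-05-01", None), ("2019-06-01", 10),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (transaction_date TEXT, transaction_sale INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

result = conn.execute("""
    SELECT transaction_sale,
           count(*) OVER (PARTITION BY grp) - 1          AS by_subtraction,
           count(*) FILTER (WHERE transaction_sale IS NULL)
                    OVER (PARTITION BY grp)              AS by_filter
    FROM  (
       SELECT *, count(transaction_sale) OVER (ORDER BY transaction_date DESC) AS grp
       FROM   tbl
       ) sub
    ORDER  BY transaction_date
""").fetchall()

# Keep only the rows that carry a sale, as the CASE expression does.
sale_rows = [(a, b) for sale, a, b in result if sale is not None]
print(sale_rows)  # [(1, 1), (3, 3), (2, 2)]
```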

Erwin Brandstetter

If the dates are consecutive, you can use the following to get the previous date:

select t.*,
       max(transaction_date) filter (where transaction_sale is not null)
           over (order by transaction_date
                 rows between unbounded preceding and 1 preceding)
from t;
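What the window frame returns per row can be checked on the sample data. This is a sketch using the FILTER (WHERE ...) form of the clause, run through Python's sqlite3 module (assuming SQLite 3.25 or later) instead of PostgreSQL, with ISO date strings standing in for real dates.

```python
import sqlite3

rows = [
    ("2018-10-01", None), ("2018-11-01", 33), ("2018-12-01", None),
    ("2019-01-01", None), ("2019-02-01", None), ("2019-03-01", 2),
    ("2019-04-01", None), ("2019-05-01", None), ("2019-06-01", 10),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (transaction_date TEXT, transaction_sale INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# For each row: the latest earlier date that has a non-null sale.
# The frame stops at "1 preceding", so the current row is never included.
prev_dates = [r[0] for r in conn.execute("""
    SELECT max(transaction_date)
               FILTER (WHERE transaction_sale IS NOT NULL)
               OVER (ORDER BY transaction_date
                     ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
    FROM   t
    ORDER  BY transaction_date
""")]
print(prev_dates)
# [None, None, '2018-11-01', '2018-11-01', '2018-11-01', '2018-11-01',
#  '2019-03-01', '2019-03-01', '2019-03-01']
```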

If the difference is less than 12 months, you can use age() and extract():

select t.*,
       extract(month from
               age(max(transaction_date) filter (where transaction_sale is not null)
                       over (order by transaction_date
                             rows between unbounded preceding and 1 preceding
                            ), transaction_date
                   )
               ) as diff
from t;
Gordon Linoff

If transaction_date is a date column, you can simply use:

select count(*)
from Counter
where transaction_date > date_lower   -- date_lower, date_higher: bounds of the range
  and transaction_date < date_higher
  and transaction_sale is null;
ShareLock