dense_rank()
gives the dense ranking of the the current record. When you run that with ASC
sort order first, you get the current record's dense rank (unique value rank) from the first element. When you run with DESC
order, then you get the current record's dense rank from the last record. Then you remove 1 because the dense ranking of the current record is counted twice. This gives the total unique values in the whole partition (and repeated for every row).
Since, dense_rank
does not support frames
, you can't use this solution directly. You need to generate the frame
by other means. One way could be JOIN
ing the same table with proper unique id
comparisons. Then, you can use dense_rank
on the combined version.
Please check out the following solution proposal. The assumption there is you have a unique record key (record_id
) available in your table. If you don't have a unique key, add another CTE before the first CTE and generate a unique key for each record (using new_id()
function OR combining multiple columns using concat()
with delimiter in between to account for NULLs
)
; WITH cte AS (
SELECT
record_id
, record_id_6_record_earlier = LEAD(machine_id, 6, NULL) OVER (PARTITION BY model ORDER BY _timestamp)
, .... other columns
FROM mainTable
)
, cte2 AS (
SELECT
c.*
, DistinctCntWithin6PriorRec = dense_rank() OVER (PARTITION BY c.model, c.record_id ORDER BY t._timestamp)
+ dense_rank() OVER (PARTITION BY c.model, c.record_id ORDER BY t._timestamp DESC)
- 1
, RN = ROW_NUMBER() OVER (PARTITION BY c.record_id ORDER BY t._timestamp )
FROM cte c
LEFT JOIN mainTable t ON t.record_id BETWEEN c.record_id_6_record_earlier and c.record_id
)
SELECT *
FROM cte2
WHERE RN = 1
There are 2 LIMITATIONS of this solution:
If the frame has less than 6 records, then the LAG()
function will be NULL
and thus this solution will not work. This can be handled in different ways: One quick way I can think of is to generate 6 LEAD columns (1 record prior, 2 records prior, etc.) and then change the BETWEEN
clause to something like this BETWEEN COALESCE(c.record_id_6_record_earlier, c.record_id_5_record_earlier, ...., c.record_id_1_record_earlier, c.record_id) and c.record_id
COUNT()
does not count NULL
. But DENSE_RANK
does. You need account for that too if it applies to your data