Having a de-normalized structure in Redshift and plan is to keep creating records and while retrieving only consider most recent attributes against users.
Following is the table:
user_id state created_at
1 A 15-10-2015 02:00:00 AM
2 A 15-10-2015 02:00:01 AM
3 A 15-10-2015 02:00:02 AM
1 B 15-10-2015 02:00:03 AM
4 A 15-10-2015 02:00:04 AM
5 B 15-10-2015 02:00:05 AM
And required result set is:
user_id state created_at
2 A 15-10-2015 02:00:01 AM
3 A 15-10-2015 02:00:02 AM
4 A 15-10-2015 02:00:04 AM
I have the query which retrieve the said result:
select user_id, first_value AS state
from (
select user_id, first_value(state) OVER (
PARTITION BY user_id
ORDER BY created_at desc
ROWS between UNBOUNDED PRECEDING and CURRENT ROW)
from customer_properties
order by created_at) t
where first_value = 'A'
Is this the best way to retrieve or can the query be improved?