How to perform cohort analysis using Snowflake-SQL?

Question

I have two straightforward tables on Snowflake:

registration table record the month when the user registered the account, for example:

registered_month	user_id
2018-01-01	352
2018-01-01	421
2018-02-01	534
2018-03-01	678

...

activity table records the month when the users are active, for example:

active_month	user_id
2018-01-01	381
2019-03-01	427
2021-10-01	1085
2020-10-01	4973

...

A user registered in '2018-01-01' can be active in '2018-01-01' month, '2018-02-01' month, skip '2018-03-01' month(Not active this month), and come back in '2018-04-01' month. The user_id will be counted as remaining_size in '2018-01-01', '2018-02-01', and '2018-04-01'.

The result table I want is something like this, for example:

acctCreated	Evaluation_Month	months_passed	original_size	remaining	retention
2018-01-01	2018-01-01	0	500	500	1
2018-01-01	2018-02-01	1	500	480	0.96
2018-01-01	2018-03-01	2	500	455	0.91
2018-01-01	2018-04-01	3	500	425	0.85
2018-01-01	2018-05-01	4	500	400	0.8

...... Month over month until the current month, the 2021-09-01 month

I really want to provide my solution, but I am really a novice on this and felt a bit lost on solving this issue

With new_user as (
SELECT registered_month, user_id
FROM registration
ORDER BY 1, 2
),
active_user as (
SELECT active_month, user_id
FROM activity
ORDER BY 1, 2
)
SELECT a.active_month, 
       DATEDIFF(n.registered_month, a.active_month) as months_passed,
       COUNT(a.user_id) as counter
FROM active_user a LEFT JOIN new_user n 
                          ON a.user_id = n.user_id 
                         AND a.active_month >= n.registered_month
WHERE n.user_id is NOT NULL
GROUP BY 1,2
ORDER BY 1,2,3

I am thinking about something like this. Could anyone please help me with this kind of cohort analysis/retention rate analysis issue with SQL? Thanks a lot!!!

I would start by writing a cte for the date columns and then grouping, filtering, counting in the main query. https://stackoverflow.com/questions/54348801/generate-series-equivalent-in-snowflake — AJcodez, Sep 30 '21 at 03:39

How to perform cohort analysis using Snowflake-SQL?

0 Answers0