0

I have a data set that includes customer id, sign up date and cancel date. I want to create a table that shows active customers at the 1st of every month for the past 12 months.

Here's a screen shot of my data

I have tried using this Stack Overflow question but couldn't work it out.

It would be great if the table had a 'month' column and a column for number of active customers.

Any help welcome, please.

Stocka
  • 3
  • 2
  • 1
    Please tag your question with the database that you are running: mysql, oracle, sql-server...? The answer is likely to be database-specific. – GMB May 26 '20 at 12:14
  • Is that sample table data or the expected result? We want both anyway - as formatted text, not images. – jarlh May 26 '20 at 12:25
  • How can you have customers that have cancelled before the signed up? –  May 26 '20 at 13:55
  • I would say missed in oversight while populating sample data. – VTi May 26 '20 at 14:33

2 Answers2

2

If you are using Postgres, you can use something like this:

select to_char(g.dt::date, 'yyyy-mm'),
       count(c.customer_id) as active_customers
from generate_series(date_trunc('month', current_date) - interval '11 month', date_trunc('month', current_date), interval '1 month') as g(dt)
  left join customer c on daterange(c.signup_date, cancel_date) @> g.dt::date
group by to_char(g.dt::date, 'yyyy-mm')
order by 1

This assumes that customer_id = 6 in your sample data is an error as it has a signup_date that lies after the cancellation date.

Online example

0

Edit* - Working example with Sql Server

  1. you will ideally need here is a date/calendar table in Database with all the possible dates ( using this Get all dates between two dates in SQL Server )
  2. then join with it to find all the possible dates/month between a customer signup date and cancel date (with some conditions based on whether you treat a customer as active in a month if he cancelled on 1st of that particular month OR is a customer considered active in a certain month if he signed up late or in middle of that month)
  3. NULLs means there is no cancellation date yet for the customer which means he is very much active.

    WITH cust (customerid,signupdate,canceldate) 
    AS 
    (SELECT 1, '2019-02-10' , '2020-03-05' UNION
     SELECT 2, '2018-12-01' , '2019-12-29' UNION
     SELECT 3, '2014-04-15' , '2019-09-15' UNION
     SELECT 4, '2019-02-10' , '2020-01-04' UNION
     SELECT 5, '2020-02-10' ,  NULL        UNION
     SELECT 6, '2019-07-10' , '2018-01-01' UNION
     SELECT 7, '2019-09-10' , '2020-05-01' UNION
     SELECT 8, '2019-02-06' ,  NULL        UNION
     SELECT 9, '2018-02-10' , '2020-07-29' UNION
     SELECT 10, '2020-02-10' , '2020-04-01' 
    )
    
    Select CONVERT(CHAR(6),cal.record_date,112) as yyyymm , 
           COUNT(DISTINCT c.customerid) as Active_Custs
    FROM tbl_Calendar cal
    JOIN CUST c ON CONVERT(CHAR(6),cal.record_date,112) BETWEEN CONVERT(VARCHAR(6),cast(c.signupdate as date),112) and CONVERT(CHAR(6),ISNULL(cast(c.canceldate as date),getdate()),112) 
    GROUP BY CONVERT(CHAR(6),cal.record_date,112)
    ORDER BY 1
    
    yyyymm  Active_Custs
    202007  1
    202006  1
    202005  4
    202004  5
    202003  6
    202002  6
    202001  5
    
VTi
  • 1,309
  • 6
  • 14
  • Glad it has helped you. Please consider voting up the answer as well as it matters in SO. – VTi May 30 '20 at 09:45