SQL | Count based on each column's date value

Question

I have 3 tables:

CustomerID | PromotionDate
1111       | 01-01-2020
2222       | 03-06-2021
3333       | 05-07-2021
4444       | 09-10-2020

CustomerID | WorkID   
1111       | aaaa
1111       | bbbb
2222       | cccc
2222       | dddd
2222       | eeee

WorkID | StartDate
aaaa   | 01-01-2020
bbbb   | 01-02-2021
cccc   | 05-07-2020
dddd   | 06-08-2021
eeee   | 03-07-2022

I want to find the count of WorkIDs for each CustomerIDs where StartDate >= PromotionDate for that CustomerID.

So the result for the above sample should be:

CustomerID  | Count
1111        | 2
2222        | 2
3333        | 0
4444        | 0

How to achieve this?

What have you tried? This likes a left join from each of the tables to the next (so 2 joins) and then a simple count group by your customer ID — xQbert, Mar 29 '22 at 21:54
yes, please give the `tableNames` to the community so that they can help you faster. — Utmost Creator, Mar 29 '22 at 21:55
Redshift or Postgres? Those are two very different database products — , Mar 30 '22 at 05:59

Utmost Creator · Answer 1 · 2022-03-29T23:40:29.797

0

Replace the table names with the real names and that should work as you except.

SELECT table1.CustomerID, COALESCE(a.totalCount,0) counts FROM table1
LEFT JOIN (SELECT table1.CustomerID, COUNT(table3.WorkID) as totalCount
FROM table3
LEFT JOIN table2 ON table3.WorkID = table2.WorkID 
LEFT JOIN table1 ON table2.CustomerID = table1.CustomerID 
WHERE table3.StartDate >= table1.PromotionDate 
GROUP BY table1.CustomerID ORDER BY table1.CustomerID ASC)a ON table1.CustomerID = a.CustomerID GROUP BY table1.CustomerID

I am using aggregation function called COUNT() to calculate the number of similar occurrences of the same Customer. GROUP BY to span similar records into one and in order to use the COUNT() function. ORDER BY is used to sort it by the CustomerID

My data:

Results:

edited Mar 29 '22 at 23:40

answered Mar 29 '22 at 21:54

Utmost Creator

814
1
9
21

1

Don't think you'd group by count... perhaps CustomerID. The from has a `,` and inner join... Shouldn't mix standards. (I'd be surprised if that works) Also by using inner joins if a workID isn't in the 3rd table, it would be omitted and not get a count. – xQbert Mar 29 '22 at 21:54
could you please fix in the comments(edit my answer) if possible, how I can return 0, in case there is no record was found (of course, if it takes a few lines), or provide me with a link? thanks. – Utmost Creator Mar 29 '22 at 22:01
use LEFT JOin's instead of inner. Change group by to CustomerID instead of count but these are things you should know if you're providing an answer. use dbfiddle.uk or similar to mock up the data in CTE's or create tables; whatever; test your results and add a link (Demo) to add credence to your answer. it has the added benefit of allowing you to verify your SQL syntax and results against expected results. Editing an answer to change the CONTENT is frowned upon. To fix a simple syntax error or grammar is ok; anything more than that and one should write their own answer. – xQbert Mar 29 '22 at 22:04
2

You want to count `workID`s, so you should start `JOIN`in from that table. Question didn't specify what was wanted result if for example `promotionDate` is missing – Vesa Karjalainen Mar 29 '22 at 22:04
@VesaKarjalainen In the result set, we only want the customers from Table1. And all customers from Table1 will have a promotion date – user10096621 Mar 29 '22 at 22:12
Yeah, thanks, Indeed I have misspelled the column name in `GROUP BY` statement. I know that I must group by column name that gets spanned. I have repleced seems like most of your remarks – Utmost Creator Mar 29 '22 at 22:14

score 0 · Answer 2 · answered Mar 30 '22 at 00:04

Disclaimer: I don't use PostgreSQL so there may be a better option. However a general approach is to use conditional SUM() to count the number of records where StartDate >= PromotionDate:

SELECT t1.CustomerId 
       , SUM( CASE WHEN t3.StartDate >= t1.PromotionDate THEN 1 ELSE 0 END ) AS "Count"
FROM   Table1 t1 
          LEFT JOIN Table2 t2 ON t1.CustomerID = t2.CustomerID 
          LEFT JOIN Table3 t3 ON t2.WorkId = t3.WorkId 
GROUP BY t1.CustomerId

Results:

customerid	totalcount
1111	2
2222	2
3333	0
4444	0

db<>fiddle here

score 0 · Answer 3 · answered Mar 30 '22 at 01:36

This should work in Redshift:

SELECT p.customerid
     , count(s.startdate >= p.promotiondate OR NULL) AS total_count
FROM   cust_promo      p
LEFT   JOIN cust_work  w ON w.customerid = p.customerid
LEFT   JOIN work_start s ON s.workid = w.workid
GROUP  BY p.customerid;

db<>fiddle here

Assuming workid is UNIQUE in table work_start (as seems reasonable).

The aggregate FILTER clause performs better in modern Postgres. But Redshift doesn't keep up since its fork long ago:

SELECT p.customerid
     , count(*) FILTER (WHERE s.startdate >= p.promotiondate) AS total_count
FROM   cust_promo      p
LEFT   JOIN cust_work  w ON w.customerid = p.customerid
LEFT   JOIN work_start s ON s.workid = w.workid
GROUP  BY p.customerid;

See:

To my knowledge Redshift was never a "fork" - Amazon only took the SQL _parser_ (and some command line clients), everything else is completely different. — , Mar 30 '22 at 05:59

SQL | Count based on each column's date value

3 Answers3