-2

I made this SQL code which works fine but takes 2 minutes to load. I want to become better at getting faster code. This code builds a bar chart/counts for counts of user activity and invites in each quarter of 2019. You can see I had to define each quarter for that year, which isnt ideal as i'd have to do a lot of manual stuff when putting more quarters! How can I improve this code please? Thank you!


WITH teams AS (SELECT cc.id , cc.name as company_name,clg.id as team_id, 
(select count(*) from companies_learnergroup clg where clg.company_id = cc.id) as number_of_teams_in_company, clg.name as team_name, clg.location_address,
(select count(*) from auth_user u where u.company_id = cc.id) as company_user_count,
(select count(*) from auth_user u where u.team_id = clg.id) as team_user_count,
(select count(*) from auth_user u where u.team_id = clg.id AND  ((u.last_activity BETWEEN 'January 01, 2019, 00:00 AM' AND 'March 31, 2019, 11:59 PM') or (u.last_login BETWEEN 'January 01, 2019, 00:00 AM' AND 'March 31, 2019, 11:59 PM'))) as active_users_q1_2019,
(select count(*) from auth_user u, companies_invitation ci where ci.learner_group_id = clg.id  and u.id = ci.inviter_id and ci.learner_group_id = u.team_id and ci.created_at BETWEEN 'January 01, 2019, 00:00 AM' AND 'March 31, 2019, 11:59 PM') as number_of_users_invited_to_team_by_user_in_same_team_in_q1_2019, 
(select count(*) from auth_user u where u.team_id = clg.id and (u.last_activity BETWEEN 'April 01, 2019, 00:00 AM' AND 'June 30, 2019, 11:59 PM') or (u.last_login BETWEEN 'April 01, 2019, 00:00 AM' AND 'June 30, 2019, 11:59 PM'))) as active_users_q2_2019,
(select count(*) from auth_user u, companies_invitation ci where ci.learner_group_id = clg.id  and u.id = ci.inviter_id and ci.learner_group_id = u.team_id  and ci.created_at BETWEEN 'April 01, 2019, 00:00 AM' AND 'June 30, 2019, 11:59 PM') as number_of_users_invited_to_team_by_user_in_same_team_in_q2_2019, 
(select count(*) from auth_user u where u.team_id = clg.id and (u.last_activity BETWEEN 'July 01, 2019, 00:00 AM' AND 'September 30, 2019, 11:59 PM') or (u.last_login BETWEEN 'July 01, 2019, 00:00 AM' AND 'September 30, 2019, 11:59 PM'))) as active_users_q3_2019,
(select count(*) from auth_user u, companies_invitation ci where ci.learner_group_id = clg.id  and u.id = ci.inviter_id and ci.learner_group_id = u.team_id and ci.created_at BETWEEN 'July 01, 2019, 00:00 AM' AND 'September 30, 2019, 11:59 PM') as number_of_users_invited_to_team_by_user_in_same_team_in_q3_2019, 
(select count(*) from auth_user u where u.team_id = clg.id  AND  ((u.last_activity BETWEEN 'October 01, 2019, 00:00 AM' AND 'December 31, 2019, 11:59 PM') or (u.last_login BETWEEN 'October 01, 2019, 00:00 AM' AND 'December 31, 2019, 11:59 PM'))) as active_users_q4_2019,
(select count(*) from auth_user u, companies_invitation ci where ci.learner_group_id = clg.id  and u.id = ci.inviter_id and ci.learner_group_id = u.team_id and ci.created_at BETWEEN 'October 01, 2019, 00:00 AM' AND 'December 31, 2019, 11:59 PM') as number_of_users_invited_to_team_by_user_in_same_team_in_q4_2019, 
cc.created_at as company_account_created, 
clg.created as team_account_created, cc.office_location, cc.region_of_responsibility, cc.company_type,
cc.company_url,  string_agg(ctag.name, ', ') as retailer_tags FROM companies_company cc
JOIN companies_learnergroup clg on clg.company_id = cc.id
JOIN company_companytag cctag ON cc.id = cctag.company_id
JOIN companies_tag ctag ON ctag.id = cctag.tag_id
WHERE cc.company_type = 'retailer'
and cc.deactivated is null
GROUP BY cc.id, cc.name, clg.id
ORDER BY cc.id) 
     
    SELECT COUNT(*) as "Amount", '2019 Q1 Retailer teams with at least 1 active user and at least 1 invite sent from user in same team' as "Filter" FROM teams 
    WHERE active_users_q1_2019 > 0 AND number_of_users_invited_to_team_by_user_in_same_team_in_q1_2019 > 0
   
    
    UNION
    SELECT COUNT(*) as "Amount", '2019 Q2 Retailer teams with at least 1 active user and at least 1 invite sent from user in same team' FROM teams 
    WHERE active_users_q2_2019 > 0 AND number_of_users_invited_to_team_by_user_in_same_team_in_q2_2019 > 0
    

    
    UNION
    SELECT COUNT(*) as "Amount", '2019 Q3 Retailer teams with at least 1 active user and at least 1 invite sent from user in same team' FROM teams 
    WHERE active_users_q3_2019 > 0 AND number_of_users_invited_to_team_by_user_in_same_team_in_q3_2019 > 0
   
    
    UNION
    SELECT COUNT(*) as "Amount", '2019 Q4 Retailer teams with at least 1 active user and at least 1 invite sent from user in same team in Q4 2019' FROM teams 
    WHERE active_users_q4_2019 > 0 AND number_of_users_invited_to_team_by_user_in_same_team_in_q4_2019 > 0
    

    
) funnel
ORDER BY "Filter"`

   
aquamad96
  • 63
  • 5
  • If this request is frequent but updates to tables are rather not maybe a [materialised view](https://www.postgresql.org/docs/14/rules-materializedviews.html) is of interest – possibly with update of getting triggered with update of involved tables. – Aconcagua Oct 14 '21 at 08:19
  • What's the biggest table here? How many rows in `auth_user` ? What indexes exist? – Nick.Mc Oct 14 '21 at 09:31
  • I see some implicit joins in the subqueries. Given the complexity of the query, using an explicit join may change the execution plan and, therefore, the performance. At least, that is what the answers to [this](https://stackoverflow.com/questions/1129923/is-a-join-faster-than-a-where) old question suggest. – Daniel Dearlove Oct 14 '21 at 10:44
  • Where is the time being taken, running the query, moving the data to the front end, building the charts? Run [EXPLAIN ANALYZE](https://www.postgresql.org/docs/current/sql-explain.html) on the query to see the time for the query itself. Also add the output to your question. – Adrian Klaver Oct 14 '21 at 15:11
  • Short comment: OMG. – wildplasser Oct 15 '21 at 22:42

1 Answers1

0

Not directly related to performance, except that as posted the query not run. It contains missing open parentheses on the 7th and 9th Selects. The segment and (u.last_activity ... should be and ((u.last_activity .... As it is in the 5th and 11th selects, Also ) funnel generates an error. But I will just consider these and in building the post.
The big issue is your BETWEEN statements will not work, they will run but will not produce correct results. This is because you are doing a TEXT compare for MONTH Names. Unfortunately as text compare Postgres knows nothing about the calendar. Assuming u.last_activity actually contains a date you can get the quarter directly by just extract('quarter' from u.last_activity). If it is not a date (timestamp) then maybe it is not to late to change your data model. See demo for which quarters get which months by text compare. Also see Postgres extract

Belayer
  • 13,578
  • 2
  • 11
  • 22