1

I'm trying to accomplish the following - I want to see the session on my website per customer grouped by calendar week. Here is the query i have so far to accomplish this:

SELECT o.name
     , s.organization_id
     , count(s.id) as num_of_sessions
     , CONCAT(s.created_at, ' - ', s.created_at + INTERVAL 6 DAY) AS week
  FROM triton.sessions s
     , triton.organizations o
 where o.id=s.organization_id
   and s.organization_id in (17,19,20,21,24,25,26,27,29,31,32,33,34,25,36,37,38,39,40,41,43,44,45,46,47,48,49,50,51,52,53,56,57,58,59,60,62,69,70,72)
 GROUP BY s.organization_id, WEEK(s.created_at)
 ORDER BY o.name, WEEK(s.created_at);

The issue with this is that weeks with where a customer did not have a website session aren't reported with a 0 - instead that week is not reported. This is an issue as I can't take the data into excel easily and create graphs for each customer's sessions.

To attempt to resolve this issue, I created a temp weeks table with values from 1-52 for each week number and attempted the approach suggested in this link: Summarise by week, even for empty rows

The challenge is when I do a left outer join, I lose the group by for organizations.

Here is a working SQL used to just group by weeks (before attempting to group by organization):

select w.weeknum
     , sess.club
     , sess.organization_id
     , count(sess.club) from weeks w 
  left outer
  join ( select o.name as club
              , s.organization_id
              , s.created_at
           from sessions s
              , organizations o
          where s.organization_id in (17,19,20,21,24,25,26,27,29,31,32,33,34,25,36,37,38,39,40,41,43,44,45,46,47,48,49,50,51,52,53,56,57,58,59,60,62,69,70,72)
            and o.id=s.organization_id
       ) sess
    on (w.weeknum = extract(week from sess.created_at))
 group by w.weeknum

The above code just returns 52 rows (1 for each week), with the count giving me how many sessions in each week.

I now want to extend the above code to do the above but per organization. I should get back 52 * N rows where N is the number of organizations. I thought this would be as easy as adding the organization to the groupby statement, but it then only returned the weeks that had sesssions (resulting in the problem I had from the start). Here is the query:

select w.weeknum
     , sess.club
     , sess.organization_id
     , count(sess.club)
  from weeks w
  left outer
  join ( select o.name as club
              , s.organization_id
              , s.created_at
           from sessions s
              , organizations o
          where s.organization_id in (17,19,20,21,24,25,26,27,29,31,32,33,34,25,36,37,38,39,40,41,43,44,45,46,47,48,49,50,51,52,53,56,57,58,59,60,62,69,70,72)
           and o.id=s.organization_id
       ) sess
    on (w.weeknum = extract(week from sess.created_at))
 group by sess.club, w.weeknum
 order by sess.club

Does anybody have any other suggestions where I can accomplish my goals? Essentially for every one of my customers, I want to see a list of sessions by week (even if they had no session in a particular week).

Community
  • 1
  • 1
darudude
  • 541
  • 5
  • 9
  • 21

2 Answers2

0

Use a cross join to get the rows and then a left join:

select w.weeknum, s.club, s.organization_id, count(s.club)
from weeks w cross join
     organizations o left outer join
     sessions s
     on w.weeknum = extract(week from s.created_at) and
        o.id = s.organization_id
where o.id in (17,19,20,21,24,25,26,27,29,31,32,33,34,25,36,37,38,39,40,41,43,44,45,46,47,48,49,50,51,52,53,56,57,58,59,60,62,69,70,72)
group by w.weeknum, s.club, s.organization_id;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Just take your original query, and wrap in parens and use it as an inline view.

And it looks like you figured out you want a row source that returns all of the rows you want returned, and then a left join to your inline view.

You got this part:

 from weeks w

just do a cross join to all of the organization_id you want to return.

It looks like organization_id may be the primary key of the organizations table. If that's the case, then this query would return the set you want:

SELECT v.name
     , v.organization_id 
  FROM triton.organizations v
 WHERE v.organization_id (17,19,20,21,24,25,26,27,29,31,32,33,34,25,36,37,38,39,40,41,43,44,45,46,47,48,49,50,51,52,53,56,57,58,59,60,62,69,70,72)
 ORDER BY v.organization_id

so just do a cross join of that set with the weeks rowsource:

  SELECT v.name
       , v.organization_id
       , w.weeknum
    FROM triton.organizations v
   CROSS
    JOIN weeks w
   WHERE v.organization_id (17,19,20,21,24,25,26,27,29,31,32,33,34,25,36,37,38,39,40,41,43,44,45,46,47,48,49,50,51,52,53,56,57,58,59,60,62,69,70,72)
   ORDER
      BY v.organization_id
       , w.weeknum

If you only want a subset from weeks table, just add a predicate to the WHERE clause, e.g.

 AND w.weeknum BETWEEN 0 AND 104

That should return you every weeknum for every organization_id.

Once you get that working, now just add an "outer join" to your original query, adding an expression to the SELECT list that gets you a value to match to weeknum.

I'm confused about the expressions involving created_at. The value returned for created_at is indeterminate, due to the GROUP BY. If you want the "earliest" and/or "latest" values, use MIN and MAX aggregates. (Assuming created_at is a DATE, DATETIME or TIMESTAMP.)

  SELECT v.name
       , v.organization_id
       , w.weeknum
       , IFNULL(t.num_of_sessions,0)
    FROM triton.organizations v
   CROSS
    JOIN weeks w
    LEFT
    JOIN (
           -- query to get session counts goes here 
         ) t
      ON t.organization_id = v.organization_id
     AND t.weeknum         =  w.weeknum
   WHERE v.organization_id (17,19,20,21,24,25,26,27,29,31,32,33,34,25,36,37,38,39,40,41,43,44,45,46,47,48,49,50,51,52,53,56,57,58,59,60,62,69,70,72)
   ORDER
      BY v.organization_id
       , w.weeknum

In the outer query, reference the num_of_sessions column returned by the view. The IFNULL function is a convenient way to replace a "missing" count with a zero.

The query to get the "counts" could be something like:

         SELECT s.organization_id    AS organization_id 
              , WEEK(s.created_at)   AS weeknum
              , COUNT(s.id)          AS num_of_sessions
              , MIN(s.created_at)    AS min_created_at
              , MAX(s.created_at)    AS max_created_at
           FROM triton.sessions s
          WHERE s.organization_id IN (17,19,20,21,24,25,26,27,29,31,32,33,34,25,36,37,38,39,40,41,43,44,45,46,47,48,49,50,51,52,53,56,57,58,59,60,62,69,70,72)
          GROUP
             BY s.organization_id
              , WEEK(s.created_at)

If the intent of this part of the original query:

  CONCAT(s.created_at, ' - ', s.created_at + INTERVAL 6 DAY) AS week

is to display the begin date and end date of the week, then generate that from the values from the weeks table. (If there's an organization where there's only one transaction in a given week, and it's on a thursday, that expression is going to generate a "thursday to following wednesday". Nothing wrong with doing that, but I strongly suspect that isn't what you really want.

If you want a "sunday to saturday" for each weeknum, it would be better to return that from the weeknum table.

If you want actual dates from organization sessions, then use MIN() and MAX() values of created_at, and concatenate those. Those won't necessarily be "sunday to saturday", but any dates returned would be "within" the week.

spencer7593
  • 106,611
  • 15
  • 112
  • 140