2

I have a table orders

  id  |  price  |  items_count  |         created_at
------+---------+---------------+------------------------
   1  |  16.50  |      3        | 2015-07-21 12:52:00.824
   2  |  50.00  |      1        | 2015-07-21 12:54:42.658
   3  |  12.00  |     10        | 2015-07-22 07:21:47.808
   .  |   .     |      .        |             .
   .  |   .     |      .        |             .
   .  |   .     |      .        |             .

and so on...

Now I want to get number of orders and sum of their prices from last month grouped by weekdays (as numbers from 0 to 6) and hours (as numbers from 0 to 23).

What I managed to do is:

SELECT EXTRACT('dow' FROM to_timezone(created_at, 'CEST', 0)) AS DAY,
(EXTRACT('hour' FROM to_timezone(created_at, 'CEST', 0)) + 0)::INT % 24 AS HOUR,
COUNT(DISTINCT id) AS orders_count,
SUM(price) AS price_total
FROM orders
WHERE created_at BETWEEN '2015-06-30 22:00:00.000000'
AND '2015-07-31 21:59:59.999999'
GROUP BY extract('dow' from to_timezone(created_at, 'CEST', 0)),
EXTRACT('hour' from to_timezone(created_at, 'CEST', 0))
ORDER BY 1 ASC, EXTRACT('hour' from to_timezone(created_at, 'CEST', 0));

And here is what I get - current result:

 day | hour | orders_count | price_total
-----+------+--------------+-------------
   0 |    7 |           11 |    298.00
   0 |    9 |            8 |     64.00
   1 |    8 |            1 |     12.50
   1 |   12 |            3 |     69.00 
   2 |   10 |            2 |    112.00
   2 |   13 |            1 |    100.00
   2 |   14 |           13 |   2163.70
   2 |   21 |            4 |    357.00

and so on...

Now I want to include all hours in a day and put 0 in orders_count and price_total columns if there were no orders in given hour. So my desired result should look like this:

 day | hour | orders_count | price_total
-----+------+--------------+-------------
   0 |    0 |            0 |      0.00
   0 |    1 |            0 |      0.00
   0 |    2 |            0 |      0.00
   0 |    3 |            0 |      0.00
   0 |    4 |            0 |      0.00
   0 |    5 |            0 |      0.00
   0 |    6 |            0 |      0.00
   0 |    7 |           11 |    298.00
   0 |    8 |            0 |      0.00
   0 |    9 |            8 |     64.00
   0 |   10 |            0 |      0.00

and so on - I need 7 days [0,6] and for each day 24 hours [0,23] with 0 in aggregate columns.


I came up with an idea to use generate_series to get all hours in a day:

SELECT EXTRACT(hour from generate_series)
FROM generate_series('2015-07-01 00:00'::timestamp, '2015-07-01 23:00', '1 hour');

My problem is that I don't know how to combine my current result with all hours. What kind of JOIN should I use? And how to insert 0s in previously mentioned columns when there are no orders for a given row?

Walerian Sobczak
  • 817
  • 2
  • 10
  • 23

1 Answers1

1
  1. Form a Cartesian product of days and hours with a CROSS JOIN.
  2. LEFT JOIN to the aggregated sums per (day, hour) - conveniently with the USING clause to fold columns into one instance.

Joins in the manual.

SELECT day, hour
     , COALESCE(orders_count, 0)  AS orders_count
     , COALESCE(price_total, 0.0) AS price_total
FROM         generate_series(0,6)  day
CROSS  JOIN  generate_series(0,23) hour
LEFT   JOIN (
   SELECT EXTRACT('dow'  FROM created_at AT TIME ZONE 'CEST')::int AS day
        , EXTRACT('hour' FROM created_at AT TIME ZONE 'CEST')::int AS hour
        , count(DISTINCT id) AS orders_count
        , sum(price)         AS price_total
   FROM   orders
   WHERE  created_at BETWEEN '2015-06-30 22:00:00.000000'
                         AND '2015-07-31 21:59:59.999999'
   GROUP  BY 1, 2
   ) o USING (day, hour)
ORDER  BY 1, 2;

Also cleaned up your query a bit.
Using the standard SQL construct AT TIME ZONE instead of the Postgres function to_timezone(). Details:

Removed the redundant modulo operator % 24.

Use COALESCE() to replace NULL with 0 or 0.0.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228