I am trying to generate a series of hourly timestamps (YYYY-MM-DD HH) from the first to the last date in a timestamp field. I have the generate_series() call I need, but I am running into an issue when trying to grab the start and end dates from a table. The following gives a rough idea:

with date1 as
(
SELECT start_timestamp as first_date
FROM header_table
ORDER BY start_timestamp DESC
LIMIT 1
),
date2 as
(
SELECT start_timestamp as first_date
FROM header_table
ORDER BY start_timestamp ASC    
LIMIT 1
)
    select generate_series(date1.first_date, date2.first_date
                         , '1 hour'::interval)::timestamp as date_hour

from
(   select * from date1
    union
    select * from date2) as foo

Postgres 9.3

SiriusBits

3 Answers

You don't need a CTE for this; that would be more expensive than necessary.
You also don't need to cast to timestamp: the result of generate_series() already is of data type timestamp when you feed it timestamp arguments.

In Postgres 9.3 or later you can use a LATERAL join:

SELECT to_char(ts, 'YYYY-MM-DD HH24') AS formatted_ts
FROM  (
   SELECT min(start_timestamp) as first_date
        , max(start_timestamp) as last_date
   FROM   header_table
   ) h
  , generate_series(h.first_date, h.last_date, interval '1 hour') g(ts);

The to_char() call is optional; it returns the result as text in the format you mentioned.
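
For readers who prefer the join spelled out, here is a sketch of the same query with an explicit CROSS JOIN LATERAL. The keyword LATERAL is optional for a function call in the FROM list, so this should behave the same as the comma form; it only makes the dependency on the subquery h visible. Table and column names are taken from the question.

```sql
-- Same query as the comma form, with the lateral join written explicitly.
-- The subquery h yields exactly one row holding both boundary timestamps.
SELECT to_char(g.ts, 'YYYY-MM-DD HH24') AS formatted_ts
FROM  (
   SELECT min(start_timestamp) AS first_date
        , max(start_timestamp) AS last_date
   FROM   header_table
   ) h
CROSS  JOIN LATERAL generate_series(h.first_date, h.last_date, interval '1 hour') AS g(ts);
```

Here g is the table alias and ts the column alias for the function result.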

This works in any Postgres version:

SELECT generate_series(min(start_timestamp)
                     , max(start_timestamp)
                     , interval '1 hour') AS ts
FROM   header_table;

Typically a bit faster.
Calling set-returning functions in the SELECT list is a non-standard SQL feature and frowned upon by some. Also, there were behavioral oddities (though not for this simple case) that were eventually fixed in Postgres 10.

Note a subtle difference in NULL handling:

The equivalent of

max(start_timestamp)

is obtained with

ORDER BY start_timestamp DESC NULLS LAST
LIMIT 1

Without NULLS LAST, NULL values sort first in descending order (this matters only if start_timestamp can contain NULL values). You would get NULL for last_date, and your query would come up empty.
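
A small self-contained sketch of that difference. The inline VALUES list is hypothetical sample data standing in for header_table, so the query can be run without any table:

```sql
-- Hypothetical sample data: two timestamps and one NULL.
WITH t(start_timestamp) AS (
   VALUES (timestamp '2015-03-13 10:00')
        , (timestamp '2015-03-13 12:00')
        , (NULL)
   )
SELECT (SELECT max(start_timestamp) FROM t)   AS via_max              -- 2015-03-13 12:00:00
     , (SELECT start_timestamp FROM t
        ORDER  BY start_timestamp DESC
        LIMIT  1)                             AS via_desc             -- NULL (NULLs sort first)
     , (SELECT start_timestamp FROM t
        ORDER  BY start_timestamp DESC NULLS LAST
        LIMIT  1)                             AS via_desc_nulls_last; -- 2015-03-13 12:00:00
```

The aggregate max() ignores NULL values, which is why only the NULLS LAST variant matches it.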

Erwin Brandstetter
  • Thanks for the thorough help, definitely appreciate it. – SiriusBits Mar 13 '15 at 14:46
  • I accepted this as the right answer after evaluating the answers from Gordon Linoff and Used_By_Already. Their answers were great and work nicely. However, this one shaved about 20-30 ms off the time. In a huge system running almost 24x7 on hundreds of tables, this will make a difference (however small). – SiriusBits Mar 13 '15 at 15:19
  • In the first code example what is the `g(ts);` referring to? – Ryder Brooks Jan 17 '18 at 19:43
  • @RyderBrooks: `g(ts)` are table and column aliases (omitting the optional keyword `AS`). See: https://www.postgresql.org/docs/current/static/queries-table-expressions.html#QUERIES-TABLE-ALIASES – Erwin Brandstetter Jan 18 '18 at 13:53

How about using aggregation functions instead?

with dates as (
      SELECT min(start_timestamp) as first_date, max(start_timestamp) as last_date
      FROM header_table
     )
select generate_series(first_date, last_date, '1 hour'::interval)::timestamp as date_hour
from dates;

Or even:

select generate_series(min(start_timestamp),
                       max(start_timestamp),
                       '1 hour'::interval
                      )::timestamp as date_hour
from header_table;
Gordon Linoff

Try this:

with dateRange as
  (
  SELECT min(start_timestamp) as first_date, max(start_timestamp) as last_date
  FROM header_table
  )
select 
    generate_series(first_date, last_date, '1 hour'::interval)::timestamp as date_hour
from dateRange

NB: You want the two dates in a single row, not in separate rows (which is what the UNION in your query produces).
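
If you would rather keep two separate CTEs as in the question, a rough sketch of how to bring both values into one row is to cross join them instead of using UNION. Each CTE returns exactly one row, so the join also returns exactly one row. The CTE names follow the question; the column name last_date is my own choice for illustration.

```sql
-- Each CTE yields one row; CROSS JOIN places both values side by side.
WITH date1 AS (
   SELECT min(start_timestamp) AS first_date
   FROM   header_table
   ),
date2 AS (
   SELECT max(start_timestamp) AS last_date  -- hypothetical column name
   FROM   header_table
   )
SELECT generate_series(date1.first_date, date2.last_date, interval '1 hour') AS date_hour
FROM   date1
CROSS  JOIN date2;
```

This scans header_table twice, so the single-CTE version above is the simpler choice.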

See this sqlfiddle demo.

Paul Maxwell