1

I've been working on this for a few hours with no luck and have hit a wall. My data looks like this:

Date1          Date2
2012-05-06     2012-05-05
2012-03-20     2012-01-05

What I'm trying to do is add 1 to the count for every month between two dates. So my output would ideally look like this:

Year    Month    Sum
2012    2        1

In other words, it should check for "empty" months between two dates and add 1 to them.

This is the code I've worked out so far. It will basically count the number of months between the two dates and group them into months and years.

SELECT 
    EXTRACT(YEAR FROM Date2::date) as "Year", 
    EXTRACT(MONTH FROM Date2::date) as "Month",
    SUM(DATE_PART('year', Date1::date) - DATE_PART('year', Date2::date)) * 12 +
    (DATE_PART('month', Date1::date) - DATE_PART('month', Date2::date))
FROM
    test
GROUP BY 
    "Year", 
    "Month",
ORDER BY
    "Year" DESC, 
    "Month" DESC;

This is where I'm stuck - I don't know how to actually add 1 for each of the "empty" months.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Philip
  • 75
  • 1
  • 7

3 Answers3

3

Test setup

With some sample rows (should be provided in the question):

CREATE TABLE test (
   test_id serial PRIMARY KEY
 , date1   date NOT NULL
 , date2   date NOT NULL
);

INSERT INTO test(date1, date2)
VALUES
   ('2012-03-20', '2012-01-05')  -- 2012-02 lies in between
 , ('2012-01-20', '2012-03-05')  -- 2012-02 (reversed)
 , ('2012-05-06', '2012-05-05')  -- nothing
 , ('2012-05-01', '2012-06-30')  -- still nothing
 , ('2012-08-20', '2012-11-05')  -- 2012-09 - 2012-10
 , ('2012-11-20', '2013-03-05')  -- 2012-12 - 2013-02
;

Postgres 9.3 or newer

Use a LATERAL join:

SELECT to_char(mon, 'YYYY') AS year
     , to_char(mon, 'MM')   AS month
     , count(*) AS ct
FROM  (
   SELECT date_trunc('mon',    least(date1, date2)::timestamp) + interval '1 mon' AS d1
        , date_trunc('mon', greatest(date1, date2)::timestamp) - interval '1 mon' AS d2
   FROM   test
   ) sub1
 , generate_series(d1, d2, interval '1 month') mon  -- implicit CROSS JOIN LATERAL
WHERE  d2 >= d1 -- exclude ranges without gap right away
GROUP  BY mon
ORDER  BY mon;

Postgres 9.2 or older

No LATERAL, yet. Use a subquery instead:

SELECT to_char(mon, 'YYYY') AS year
     , to_char(mon, 'MM')   AS month
     , count(*) AS ct
FROM  (
   SELECT generate_series(d1, d2, interval '1 month') AS mon
   FROM  (
      SELECT date_trunc('mon',    least(date1, date2)::timestamp) + interval '1 mon' AS d1
           , date_trunc('mon', greatest(date1, date2)::timestamp) - interval '1 mon' AS d2
      FROM   test
      ) sub1
   WHERE  d2 >= d1 -- exclude ranges without gap right away
   ) sub2
GROUP  BY mon
ORDER  BY mon;

Result

 year | month | ct
------+-------+----
 2012 |     2 |  2
 2012 |     9 |  1
 2012 |    10 |  1
 2012 |    12 |  1
 2013 |     1 |  1
 2013 |     2 |  1

db<>fiddle here
SQL Fiddle.

Explanation

You are looking for complete calendar months between the two dates.

These queries work with any dates or timestamps in ascending or descending order and should perform well.

The WHERE clause is optional, since generate_series() returns no row if start > end. But it should be a bit faster to exclude empty ranges a priori.

The cast to timestamp makes it a bit cleaner and faster. Rationale:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thank you that works amazingly! Sorry for not being clear earlier. Is there any way of not having to input all the dates into the query? I have thousands of rows with these dates so ideally I would need the query to read them directly from the table. Thank you again! – Philip Jul 27 '12 at 07:37
  • My list of dates is also not in order so I guess that makes it more complicated? I've tried adapting your query so that it can just take two column names as input values but can't get it to work. – Philip Jul 27 '12 at 11:20
  • Ah I finally got it to work :) Thanks for your input, much appreciated! Here's what the query looks like: http://pastebin.com/gHpVtPZe – Philip Jul 27 '12 at 12:41
  • @Philip: I see you figured it out: just remove the first CTE and replace `x` with your table name. – Erwin Brandstetter Jul 27 '12 at 13:00
0

AFAIK you can simply substract/add dates in postgresql

'2001-06-27 14:43:21'::DATETIME - '2001-06-27 14:33:21'::DATETIME = '00:10:00'::INTERVAL

So in your case that request part should look like

DATE_PART('month', Date1::datetime - Date2::datetime) as "MonthInterval"
filimonov
  • 1,666
  • 1
  • 9
  • 20
  • Sure, I tried that but I have dates in several years so subtracting a date in December from a date in January will return a negative number. My main problem here is adding a number to each "emtpy" month :) – Philip Jul 26 '12 at 16:17
  • @Philip: I honestly do not understand what you are trying to achieve. If you cast the date values to timestamp, you will get the difference in days; if the left date is earlier then the right date, result is always positive. – martin Jul 26 '12 at 17:06
  • Sorry about that Martin, it made sense when I wrote it. Erwin (above) understood what I meant after some initial trouble it seems. Thanks anyways! – Philip Jul 27 '12 at 11:21
0

age(timestamp1, timestamp2) => returns interval

the we try to extract year and month out of the interval and add them accordingly.

select extract(year from age(timestamp1, timestamp2))*12 + extract(month from age(timestamp1, timestamp2))

Angelin Nadar
  • 8,944
  • 10
  • 43
  • 53