-1

I have a problem in postgresql. I have one cohorte (gathering of people) and i would like counting the persons in this cohorte.

Begin date : "2014-09-01", End date : "2014-11-30".

  • I have 5 persons between 09/01 and 09/22
  • I have 5 persons between 09/20 and 09/25
  • I have 5 persons between 09/26 and 10/05
  • I have 5 persons between 10/01 ans 11/30

I want to have the max of accommodation for each month between the begin date and the end date in SQL (or PHP). Expected max person count:

  • September(09) => 10
  • October(10) => 10
  • November(11) => 5
Stephen Ostermiller
  • 23,933
  • 14
  • 88
  • 109

2 Answers2

1

Find the maximum of simultaneously present persons on a single day for every month in a given period.

I suggest generate_series() to produce the series of days in your period. Then aggregate twice:

  • First to get a count for each day. A single day can be dealt with plain BETWEEN. Your ranges are obviously meant to be with include borders.

  • Second to get the maximum per month.

SELECT date_trunc('month', day)::date AS month, max(ct) AS max_ct
FROM  (
   SELECT g.day, count(*) AS ct
   FROM   cohorte
         ,generate_series('2014-09-01'::date  -- first of Sept.
                         ,'2014-11-30'::date  -- last of Nov.
                         ,'1 day'::interval) g(day)
   WHERE  g.day BETWEEN t_begin AND t_end
   GROUP  BY 1
   ) sub
GROUP  BY 1
ORDER  BY 1;

Returns:

month      |  max_ct
-----------+--------
2014-09-01 | 10
2014-10-01 | 10
2014-11-01 | 5

Use to_char() to prettify the month output.

SQL Fiddle .. is down ATM. Here is my test case (that you should have provided):

CREATE TEMP TABLE cohorte (
   cohorte_id serial PRIMARY KEY
  ,person_id  int  NOT NULL
  ,t_begin    date NOT NULL  -- inclusive
  ,t_end      date NOT NULL  -- inclusive
);

INSERT INTO cohorte(person_id, t_begin, t_end)
SELECT g, '2014-09-01'::date, '2014-09-22'::date
FROM   generate_series (1,5) g
UNION ALL
SELECT g+5, '2014-09-20', '2014-09-25'
FROM   generate_series (1,5) g
UNION ALL
SELECT g+10, '2014-09-26', '2014-10-05'
FROM   generate_series (1,5) g
UNION ALL
SELECT g+15, '2014-10-01', '2014-11-30'
FROM   generate_series (1,5) g;

For more complex checks I'd suggest the OVERLAPS operator:
Find overlapping date ranges in PostgreSQL

For more complex scenarios I'd also consider range types:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

can't you use window function? I'd try something like this (I've not tested this code, just exposed my thoughts)

SELECT max(count) FROM (
  SELECT count(*) OVER (PARTITION BY ???) as count
  FROM contract
  WHERE daterange(dateStart, dateEnd, '[]') && daterange('2014-09-01', '2014-10-01', '[)')
) as max

Here, my problem remains that I can't find a way to partition for each day of the interval. Maybe this is a wrong approach, but I would be interested by a solution based on windows.


edit: with this request, you have the max of simultaneous present, but over all the time, not only a given month

with presence as (
  SELECT id, generate_series(begin_date, end_date, '1 day'::interval) AS date
  FROM test
),
presents as (
  SELECT count(*) OVER (PARTITION BY date) AS count
  FROM presence
)
SELECT max(count) from presents;

Here we come, I think

Imagine your person table has 3 columns :

  • id
  • entrance_date
  • leaving_date

the request would look like

WITH presents as (
  SELECT id, 
         daterange(entrance_date, leaving_date, '[]') * daterange('2014-09-01', '2014-11-30', '[]') as range
  FROM person
  WHERE daterange(entrance_date, leaving_date, '[]') && daterange('2014-09-01', '2014-11-30', '[]')
),
present_per_day as (
  SELECT id,
         generate_series(lower(range), upper(range), '1 day'::interval) AS date
  FROM presents
),
count_per_day as (
  SELECT count(*) OVER (PARTITION BY date) AS count,
         date
  FROM present_per_day
),
SELECT max(count) OVER (PARTITION BY date_part('year', date), date_part('month', date)) as max,
       date_part('year', date),
       date_part('month', date)
FROM count_per_day;

(I have to leave, I hope I'll have time to test it later)

In fact, @erwin solution is much much more easy and efficient than this one.

CircleCode
  • 107
  • 8
  • Thanks, I thought about this solution but i never used the window function so i don't know which partition we have to use... – user3775877 Jun 25 '14 at 16:38
  • @circlecode OP has overlaps as it is not a single date column but 2. It would work if all dates were in a single column and a flag was used to indicate arrival/departure. A head on approach would, perhaps, involve union. [Unpivoting](http://stackoverflow.com/q/1128737/673826) with unnest might help. – mlt Jun 25 '14 at 16:52
  • And one usually uses EXTRACT (or PG specific equivalent) to fetch month from date. – mlt Jun 25 '14 at 16:58