postgresql max(count(*)) - php

Question

I have a problem in postgresql. I have one cohorte (gathering of people) and i would like counting the persons in this cohorte.

Begin date : "2014-09-01", End date : "2014-11-30".

I have 5 persons between 09/01 and 09/22
I have 5 persons between 09/20 and 09/25
I have 5 persons between 09/26 and 10/05
I have 5 persons between 10/01 ans 11/30

I want to have the max of accommodation for each month between the begin date and the end date in SQL (or PHP). Expected max person count:

September(09) => 10
October(10) => 10
November(11) => 5

tables : person, contrat and one person = one contrat with one daterange (arrival and departure date) per contrat — user3775877, Jun 25 '14 at 15:58
Question title is not representative. Counting based on several columns? — mlt, Jun 25 '14 at 16:57
@ErwinBrandstetter I think 10 is still correct as at any given day in September it does not exceed 10. — mlt, Jun 25 '14 at 18:39
@mlt: Yes, I later realized the intent of the OP and put it on top of my answer to clarify. Removing my previous comment. — Erwin Brandstetter, Jun 25 '14 at 19:16

score 1 · Answer 1 · edited May 23 '17 at 12:20

Find the maximum of simultaneously present persons on a single day for every month in a given period.

I suggest generate_series() to produce the series of days in your period. Then aggregate twice:

First to get a count for each day. A single day can be dealt with plain BETWEEN. Your ranges are obviously meant to be with include borders.
Second to get the maximum per month.

SELECT date_trunc('month', day)::date AS month, max(ct) AS max_ct
FROM  (
   SELECT g.day, count(*) AS ct
   FROM   cohorte
         ,generate_series('2014-09-01'::date  -- first of Sept.
                         ,'2014-11-30'::date  -- last of Nov.
                         ,'1 day'::interval) g(day)
   WHERE  g.day BETWEEN t_begin AND t_end
   GROUP  BY 1
   ) sub
GROUP  BY 1
ORDER  BY 1;

Returns:

month      |  max_ct
-----------+--------
2014-09-01 | 10
2014-10-01 | 10
2014-11-01 | 5

Use to_char() to prettify the month output.

SQL Fiddle .. is down ATM. Here is my test case (that you should have provided):

CREATE TEMP TABLE cohorte (
   cohorte_id serial PRIMARY KEY
  ,person_id  int  NOT NULL
  ,t_begin    date NOT NULL  -- inclusive
  ,t_end      date NOT NULL  -- inclusive
);

INSERT INTO cohorte(person_id, t_begin, t_end)
SELECT g, '2014-09-01'::date, '2014-09-22'::date
FROM   generate_series (1,5) g
UNION ALL
SELECT g+5, '2014-09-20', '2014-09-25'
FROM   generate_series (1,5) g
UNION ALL
SELECT g+10, '2014-09-26', '2014-10-05'
FROM   generate_series (1,5) g
UNION ALL
SELECT g+15, '2014-10-01', '2014-11-30'
FROM   generate_series (1,5) g;

For more complex checks I'd suggest the OVERLAPS operator:
Find overlapping date ranges in PostgreSQL

For more complex scenarios I'd also consider range types:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL

CircleCode · Answer 2 · 2014-06-26T11:58:05.673

can't you use window function? I'd try something like this (I've not tested this code, just exposed my thoughts)

SELECT max(count) FROM (
  SELECT count(*) OVER (PARTITION BY ???) as count
  FROM contract
  WHERE daterange(dateStart, dateEnd, '[]') && daterange('2014-09-01', '2014-10-01', '[)')
) as max

Here, my problem remains that I can't find a way to partition for each day of the interval. Maybe this is a wrong approach, but I would be interested by a solution based on windows.

edit: with this request, you have the max of simultaneous present, but over all the time, not only a given month

with presence as (
  SELECT id, generate_series(begin_date, end_date, '1 day'::interval) AS date
  FROM test
),
presents as (
  SELECT count(*) OVER (PARTITION BY date) AS count
  FROM presence
)
SELECT max(count) from presents;

Here we come, I think

Imagine your person table has 3 columns :

id
entrance_date
leaving_date

the request would look like

WITH presents as (
  SELECT id, 
         daterange(entrance_date, leaving_date, '[]') * daterange('2014-09-01', '2014-11-30', '[]') as range
  FROM person
  WHERE daterange(entrance_date, leaving_date, '[]') && daterange('2014-09-01', '2014-11-30', '[]')
),
present_per_day as (
  SELECT id,
         generate_series(lower(range), upper(range), '1 day'::interval) AS date
  FROM presents
),
count_per_day as (
  SELECT count(*) OVER (PARTITION BY date) AS count,
         date
  FROM present_per_day
),
SELECT max(count) OVER (PARTITION BY date_part('year', date), date_part('month', date)) as max,
       date_part('year', date),
       date_part('month', date)
FROM count_per_day;

(I have to leave, I hope I'll have time to test it later)

In fact, @erwin solution is much much more easy and efficient than this one.

Thanks, I thought about this solution but i never used the window function so i don't know which partition we have to use... — user3775877, Jun 25 '14 at 16:38
@circlecode OP has overlaps as it is not a single date column but 2. It would work if all dates were in a single column and a flag was used to indicate arrival/departure. A head on approach would, perhaps, involve union. [Unpivoting](http://stackoverflow.com/q/1128737/673826) with unnest might help. — mlt, Jun 25 '14 at 16:52
And one usually uses EXTRACT (or PG specific equivalent) to fetch month from date. — mlt, Jun 25 '14 at 16:58

postgresql max(count(*)) - php

2 Answers2