0

I have records with a start and end date, like so:

id  start_date   end_date
1   2016-01-01   2016-10-31
2   2016-06-01   2016-12-31
3   2016-06-01   2016-07-31

I have to know the number of records that were active per month (or better put: on the first day of all months in a given period). The counts would look like this when calculated for 2016:

jan: 1
feb: 1
mar: 1
apr: 1
may: 1
jun: 3
jul: 3
aug: 2
sep: 2
oct: 2
nov: 1
dec: 1

The solution I came up with, is to create a TEMP TABLE with all applicable dates for the given period:

date
2016-01-01
2016-02-01
...

Which makes the query very easy:

SELECT
  COUNT(*),
  m.date
FROM
  months m
INNER JOIN table t
  ON m.date BETWEEN t.start_date AND t.end_date
GROUP BY
  m.date

This produces exactly the results I'm looking for. However; I do feel as if this could be done easier. I just don't know how.

Any suggestions?

Sherlock
  • 7,525
  • 6
  • 38
  • 79
  • Have a physical calendar table. Your current approach looks best to me though it will not include the missing months. If you need missing month in result then change it to `left join` and count `table t` – Pரதீப் Nov 02 '16 at 17:14
  • There won't be missing months, as all months for the given period will be inserted in the temp table. I don't think a physical calendar table would really make things better than they are in the current approach, but thanks for your comment. – Sherlock Nov 02 '16 at 17:23
  • additionally, yeah creating the months manually in a temp table might work ***now***, but what happens the next time you need to do the same thing? You're going to copy paste the temporary table creation code into the new function? You'll likely want a calendar table for a more maintainable solution. – Kritner Nov 02 '16 at 18:11
  • Sorry, but I still don't see any missing months. What is missing in which table? – Sherlock Nov 02 '16 at 18:14
  • Yeah ok, you mean a table containing all dates from now until whenever, so I can join on that existing table and simply select all dates where DAY(date) = 1. Maybe that's a good idea. – Sherlock Nov 02 '16 at 18:26
  • @Sherlock exactly, there are already scripts out there like http://stackoverflow.com/a/5635628/2312877 as an example that will create them for you, even with additional information like is the date a holiday and the like. Some already have the individual date parts separated out for easier querying (and not having to use a function on your where clause) among other things. – Kritner Nov 02 '16 at 18:29
  • @Sherlock You can also make a static table containing 1-12 - and replace the `select / union` part in my query with a select from that static table. – dognose Nov 02 '16 at 18:29

2 Answers2

0

You can do it the following way, even if it looks ugly:

Assuming you want to run a report and you are only interested in "months of a certain year", the following query might work:

select m,Count(id) FROM (
SELECT 1 as m UNION 
SELECT 2 as m UNION 
SELECT 3 as m UNION 
SELECT 4 as m UNION 
SELECT 5 as m UNION 
SELECT 6 as m UNION 
SELECT 7 as m UNION 
SELECT 8 as m UNION 
SELECT 9 as m UNION 
SELECT 10 as m UNION 
SELECT 11 as m UNION 
SELECT 12 as m) AS tabseq
CROSS JOIN x WHERE 
  (year (start_date) = 2016 AND year (end_date) = 2016 AND m >= month(start_date) AND m <= month(end_date)) -- starts abd ends this year
  or
  (year (start_date) < 2016 AND year (end_date) = 2016 AND m <= month(end_date)) -- ends this year, consider months until end of contract
  or
  (year (start_date) < 2016 AND year (end_date) > 2016) -- spans the year, ignore month,
  or
  (year (start_date) = 2016 AND year (end_date) > 2016 AND m >= month(start_date)) -- starts this year, consider months until end of year
GROUP BY m;

result:

m   count(id)
1   1
2   1
3   1
4   1
5   1
6   3
7   3
8   2
9   2
10  2
11  1
12  1
dognose
  • 20,360
  • 9
  • 61
  • 107
0

As suggested in the comments, I replaced the temp table with a permanent table called 'calendar'.

CREATE TABLE `calendar` (
  `date` date NOT NULL,
  PRIMARY KEY (`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

I filled this table with all dates from 2000-01-01 until 2100-12-31. I rewrote my query to this:

SELECT
  COUNT(*),
  c.date
FROM
  calendar c
INNER JOIN table t
  ON c.date BETWEEN t.start_date AND t.end_date
WHERE
  DAYOFMONTH(c.date) = 1
AND
  c.date BETWEEN '2016-01-01' AND '2016-12-31'
GROUP BY
  c.date
Sherlock
  • 7,525
  • 6
  • 38
  • 79