0

I am looking for a solution to count days in a daterange per year. My table looks like this:

+----+-----------+------------+------------+
| id | source_id | start_date | end_date   |
+----+-----------+------------+------------+
|  1 |         1 | 2015-11-01 | 2017-01-31 |
+----+-----------+------------+------------+

Now I want to count the days in between. Its easy with DATEDIFF() in complete, but how to do it per year?

I tried a kind of temp. transformation into single rows to perform count and group actions:

+----+-----------+------------+------------+
| id | source_id | start_date | end_date   |
+----+-----------+------------+------------+
|  1 |         1 | 2015-11-01 | 2015-12-31 |
+----+-----------+------------+------------+
|  1 |         1 | 2016-01-01 | 2016-12-31 |
+----+-----------+------------+------------+
|  1 |         1 | 2017-01-01 | 2017-01-31 |
+----+-----------+------------+------------+

EDIT: The desired output should like that:

+-----------+------+------+
| source_id | year | days |
+-----------+------+------+
|         1 | 2015 |   60 |
+-----------+------+------+
|         1 | 2016 |  365 |
+-----------+------+------+
|         1 | 2017 |   30 |
+-----------+------+------+

So it become possible to summarize all days grouped by source_id and year.

Is there an easy way to do it in MySQL?

Daniel
  • 539
  • 4
  • 25

2 Answers2

1

Create another table that lists all the years:

CREATE TABLE years (
    year_start DATE,
    year_end DATE
);
INSERT INTO years VALUES 
    ('2015-01-01', '2015-12-31'), 
    ('2016-01-01', '2016-12-31'),
    ('2017-01-01', '2017-12-31');

Then you can join with this table

SELECT t.source_id, YEAR(y.year_start) AS year, DATEDIFF(LEAST(year_end, end_date), GREATEST(year_start, start_date)) AS day_count
FROM yourTable AS t
JOIN years AS y 
    ON y.year_start BETWEEN t.start_date AND t.end_date
    OR y.year_end BETWEEN t.start_date AND t.end_date

DEMO

If you don't want to create a real table, you can use a subquery that creates it on the fly:

SELECT t.source_id, YEAR(y.year_start) AS year, DATEDIFF(LEAST(year_end, end_date), GREATEST(year_start, start_date)) AS day_count
FROM yourTable AS t
JOIN (SELECT CAST('2015-01-01' AS DATE) AS year_start, CAST('2015-12-31' AS DATE) AS year_end
      UNION
      SELECT CAST('2016-01-01' AS DATE) AS year_start, CAST('2016-12-31' AS DATE) AS year_end
      UNION
      SELECT CAST('2017-01-01' AS DATE) AS year_start, CAST('2017-12-31' AS DATE) AS year_end
    ) AS y
    ON y.year_start BETWEEN t.start_date AND t.end_date
    OR y.year_end BETWEEN t.start_date AND t.end_date

DEMO

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Thats one solution I found on Google. Is it possible to fill a temp. table years and use it directly in that join? – Daniel Nov 04 '15 at 20:23
  • You can create it on the fly in a subquery, I've added that to my answer. – Barmar Nov 04 '15 at 20:29
  • Thank you. Do I need an union for every year or is it also possible to do it dynamic for x years? – Daniel Nov 04 '15 at 20:32
  • Sadly I have to leave my seat and couldnt finish my answer. Here is the link to how create the year table Similar to http://stackoverflow.com/questions/2157282/generate-days-from-date-range. And instead using `between` here you will get how know if data range overlap http://stackoverflow.com/questions/325933/determine-whether-two-date-ranges-overlap – Juan Carlos Oropeza Nov 04 '15 at 20:32
  • You need an entry for every year. You can't create rows in a result if there is no corresponding row in one of the input tables. – Barmar Nov 04 '15 at 20:38
0

I found some other snippet and I combined both. Its more a working hack than a solution, but it works good enough for my purpose.

SELECT r.source_id, 
   YEAR(y.year_start) AS year,
   DATEDIFF(LEAST(year_end, end_date), GREATEST(year_start, start_date)) AS day_count,
   r.start_date,
   r.end_date
  FROM ranges AS r
  JOIN (
      SELECT @i:= @i + 1 AS YEAR,
             CAST(CONCAT(@i, '-01-01') AS DATE) AS year_start,
             CAST(CONCAT(@i, '-12-31') AS DATE) AS year_end
        FROM INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY,
             (SELECT @i:= 1899) AS i
     ) AS y
    ON r.start_date >= y.year_start AND r.start_date <= y.year_end
    OR r.end_date >= y.year_start AND r.end_date <= y.year_end;

I think, the table INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY is just a workaround to do the iteration. Not nice, but maybe someone needs something like that.

Daniel
  • 539
  • 4
  • 25