How to calculate time outside of work hours

Question

This seemed pretty straight forward initially, but has proved to be a real headache. Below is my table, data, expected output and SQL Fiddle of where I have got to in solving my problem.

Schema & Data:

CREATE TABLE IF NOT EXISTS `meetings` (
  `id` int(6) unsigned NOT NULL,
  `user_id` int(6) NOT NULL,
  `start_time` DATETIME,
  `end_time` DATETIME,
  PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
INSERT INTO `meetings` (`id`, `user_id`, `start_time`, `end_time`) VALUES
  ('0', '1', '2018-05-09 04:30:00', '2018-05-09 17:30:00'),
  ('1', '1', '2018-05-10 06:30:00', '2018-05-10 17:30:00'),
  ('2', '1', '2018-05-10 12:30:00', '2018-05-10 16:00:00'),
  ('3', '1', '2018-05-11 17:00:00', '2018-05-12 11:00:00'),
  ('4', '2', '2018-05-11 07:00:00', '2018-05-12 11:00:00'),
  ('5', '2', '2018-05-11 04:30:00', '2018-05-11 15:00:00');

What I would like to get from the above is total time worked outside of 09:00 to 17:00, grouped by day and user_id. So the result from the above data would look like:

  date        | user_id | overtime_hours
  ---------------------------------------
  2018-05-09  | 1       | 05:00:00
  2018-05-10  | 1       | 03:00:00
  2018-05-11  | 1       | 07:00:00
  2018-05-12  | 1       | 09:00:00
  2018-05-11  | 2       | 13:30:00
  2018-05-12  | 2       | 09:00:00

As you can see the expected results are only summing overtime for each day and user for those hours outside of 9 to 5.

Below is the query and SQL Fiddle of where I am. The main issue comes when the start and ends straddle midnight (or multiple midnight's)

SELECT
    SEC_TO_TIME(SUM(TIME_TO_SEC(TIME(end_time)) - TIME_TO_SEC(TIME(start_time)))), user_id, DATE(start_time)
FROM
(SELECT 
    start_time, CASE WHEN TIME(end_time) > '09:00:00' THEN DATE_ADD(DATE(end_time), INTERVAL 9 HOUR) ELSE end_time END AS end_time, user_id
FROM
    meetings
WHERE
    TIME(start_time) < '09:00:00'

UNION

SELECT 
    CASE WHEN TIME(start_time) < '17:00:00' THEN DATE_ADD(DATE(start_time), INTERVAL 17 HOUR) ELSE start_time END AS start_time, end_time, user_id
FROM
    meetings
WHERE
    TIME(end_time) > '17:00:00') AS clamped_times
GROUP BY user_id, DATE(start_time)

http://sqlfiddle.com/#!9/77bc85/1

Pastebin for when the fiddle decides to flake: https://pastebin.com/1YvLaKbT

As you can see the query grabs the easy overtime with start and ends on the same day, but does not work with the multiple day ones.

Sounds like this could be accomplished by sorting on both day and user_id — Dom, Nov 19 '18 at 16:30
The fiddle isn't loading at the moment, so maybe you already have this, but if not, here's where I would start... `SELECT id,TIMEDIFF(DATE_FORMAT(start_time, '%Y-%m-%d 09:00:00'),start_time) n FROM meetings;` — Strawberry, Nov 19 '18 at 16:34
@Strawberry Yeah sqlfiddle has been playing up today something chronic. Try and let it time out and then reload the page. I don't want to bloat the question with the query. I'll add a paste bin link to it. — superphonic, Nov 19 '18 at 16:38
I think you just need to think about your CASE statements a little more carefully. — Strawberry, Nov 19 '18 at 16:46
@superphonic - sqlfiddle was messing up the other day as well when I tried it. Maybe their database has gotten corrupted ;). — dcp, Nov 19 '18 at 16:46

Madhur Bhaiya · Accepted Answer · 2018-11-24T14:08:17.893

If the meeting is going to span across n days, and you are looking to compute "work hours" daywise within a particular meeting; it rings a bell, that we can use a number generator table.

(SELECT 0 AS gap UNION ALL SELECT 1 UNION ALL SELECT 2) AS ngen

We will use the number generator table to consider separate rows for the individual dates ranging from the start_time to end_time. For this case, I have assumed that it is unlikely that meeting will span across more than 2 days. If it happens to span more number of days, you can easily extend the range by adding more UNION ALL SELECT 3 .. to the ngen Derived Table.

Based on this, we will determine "start time" and "end time" to consider for a specific "work date" in an ongoing meeting. This calculation is being done in a Derived Table, for a grouping of user_id and "work date".

Afterwards, we can SUM() up "working hours" per day of a user using some maths. Please find the query below. I have added extensive comments to it; do let me know if anything is still unclear.

Demo on DB Fiddle

Query #1

SELECT 
  dt.user_id, 
  dt.wd AS date, 

  SEC_TO_TIME(SUM(

      CASE 
        /*When both start & end times are less than 9am OR more than 5pm*/
        WHEN (st < TIME_TO_SEC('09:00:00') AND et < TIME_TO_SEC('09:00:00')) OR 
             (st > TIME_TO_SEC('17:00:00') AND et > TIME_TO_SEC('17:00:00'))
        THEN et - st  /* straightforward difference between the two times */

        /* atleast one of the times is in 9am-5pm block, OR, 
           start < 9 am and end > 5pm.
           Math of this can be worked out based on signum function */
        ELSE GREATEST(0, TIME_TO_SEC('09:00:00') - st) + 
             GREATEST(0, et - TIME_TO_SEC('17:00:00'))

      END
  )) AS working_hours  

FROM 
(

 SELECT 
   m.user_id, 

   /* Specific work date */
   DATE(m.start_time) + INTERVAL ngen.gap DAY AS wd, 

   /* Start time to consider for this work date */
   /* If the work date is on the same date as the actual start time
      we consider this time */
   CASE WHEN DATE(m.start_time) + INTERVAL ngen.gap DAY = DATE(m.start_time) 
             THEN TIME_TO_SEC(TIME(m.start_time))

        /* We are on the days after the start day */
        ELSE 0  /* 0 seconds (start of the day) */
   END AS st, 

   /* End time to consider for this work date */
   /* If the work date is on the same date as the actual end time
      we consider this time */
   CASE WHEN DATE(m.start_time) + INTERVAL ngen.gap DAY = DATE(m.end_time) 
             THEN TIME_TO_SEC(TIME(m.end_time)) 

        /* More days to come still for this meeting, 
           we consider the end of this day as end time */
        ELSE 86400  /* 24 hours * 3600 seconds (end of the day) */
   END AS et

 FROM meetings AS m 
 JOIN (SELECT 0 AS gap UNION ALL SELECT 1 UNION ALL SELECT 2) AS ngen
   ON DATE(start_time) + INTERVAL ngen.gap DAY <= DATE(end_time)

) AS dt 
GROUP BY dt.user_id, dt.wd;

Result

| user_id | date       | working_hours |
| ------- | ---------- | ------------- |
| 1       | 2018-05-09 | 05:00:00      |
| 1       | 2018-05-10 | 03:00:00      |
| 1       | 2018-05-11 | 07:00:00      |
| 1       | 2018-05-12 | 09:00:00      |
| 2       | 2018-05-11 | 13:30:00      |
| 2       | 2018-05-12 | 09:00:00      |

Further Optimization Possibilities:

This query can do away with the usage of subquery (Derived Table) very easily. I just wrote it in this way, to convey the mathematics and process in a followable manner. However, you can easily merge the two SELECT blocks to a single query.
Maybe, more optimization possible in usage of Date/Time functions, as well as further simplification of mathematics in it. Function details available at: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html
Some date calculations are done multiple times, e.g., DATE(m.start_time) + INTERVAL ngen.gap DAY. To avoid recalculation, we can utilize User-defined variables, which will also make the query less verbose.
Make this JOIN condition sargable: JOIN .. ON DATE(start_time) + INTERVAL ngen.gap DAY <= DATE(end_time)

Perfect answer. I also appreciate the extra effort of the of the commenting and explanation. Thanks — superphonic, Nov 26 '18 at 11:39
@superphonic thanks. This query can be condensed further, as mentioned by me in "Further Optimization Possibilities" at the end. I have left that part to you; if you still need assistance, do let me know. — Madhur Bhaiya, Nov 26 '18 at 11:40

How to calculate time outside of work hours

1 Answers1

Linked