0

I have a table with userID, clockin(1)/Clockout(0), dateTime for few employees. in/out shows when someone is on (1) or off(0) clock.

  1. Job shift can span across midnight, as in punch in before midnight, and punchout in the morning. (Eg: Date 21st in table)
  2. Shift can last more than 24 hours (hypothetically) (Eg : Date 24)
  3. Punchin and punchout can happen multiple times within 24 hrs as well(Eg : Date 22).

I would like to get the sum of hours worked per day for any given user_id but within midnight to midnight, even though the shift might span across midnight. Timestamps are shown all with :30:00 for clarity. Only one user_id is shown, but this table can have info from multiple users, so user_id will be used in the where clause.

    [id]   [User_id]           [Date_time]                 [in_out]
        1       1          2022-08-20 09:30:00                  1
        2       1          2022-08-20 21:30:00                  0
        3       1          2022-08-21 20:30:00                  1
        4       1          2022-08-22 08:30:00                  0
        5       1          2022-08-22 09:30:00                  1
        6       1          2022-08-22 14:30:00                  0
        7       1          2022-08-23 12:30:00                  1
        8       1          2022-08-25 09:30:00                  0
        9       1          2022-08-25 12:30:00                  1

So The desired query result would be something like below. The formatting does not matter. Total time per day in seconds or minutes or anything will work.

 [Day]                 [hours_worked]
2022-08-20                12:00:00
2022-08-21                03:30:00
2022-08-22                13:00:00
2022-08-23                11:30:00
2022-08-24                24:00:00
2022-08-25                09:30:00

I started with the code from Get total hours worked in a day mysql This works well when punch-in happens before punch outs in a day, and does not handle midnights. Just trying to adapt to the specific case. Any help much appreciated.

aVC
  • 2,254
  • 2
  • 24
  • 46
  • Needs further clarification. Are `id`'s unique? You have two rows with `id=8`. Is `Date_time` guaranteed to be monotonic when rows are sorted by `id`? Is the situation possible where the user has two punch ins or punch outs in a row when records are sorted by `Date_time`? If yes, how should this be taken care of? – bazzilic Aug 26 '22 at 06:09
  • @bazzilic Sorry, it was a typo. id is unique. Yes it is possible to have multiple punch-ins, outs in a given day. – aVC Aug 26 '22 at 15:28

1 Answers1

0

To do this in MySQL 5.6, I can only think of a not so nice query, but let's create the data first

CREATE TABLE events
    (`id` int, `User_id` int, `Date_time` datetime, `in_out` int);

INSERT INTO events
    (`id`, `User_id`, `Date_time`, `in_out`)
VALUES
    (1, 1, '2022-08-20 09:30:00', 1),
    (2, 1, '2022-08-20 21:30:00', 0),
    (3, 1, '2022-08-21 20:30:00', 1),
    (4, 1, '2022-08-22 08:30:00', 0),
    (5, 1, '2022-08-22 09:30:00', 1),
    (6, 1, '2022-08-22 14:30:00', 0),
    (7, 1, '2022-08-23 12:30:00', 1),
    (8, 1, '2022-08-25 09:30:00', 0),
    (9, 1, '2022-08-25 12:30:00', 1);

Based on https://stackoverflow.com/a/60173743/19657183, one can get the dates for every single day between the first and last event date. Then, you can JOIN the result with the events to figure out the ones which overlap. From that you can calculate the time differences and sum them up grouped by day:

SELECT User_id, start_of_day, 
       sec_to_time(sum(timestampdiff(SECOND, CAST(GREATEST(cast(start_of_day as datetime), prev_date_time) AS datetime),
           CAST(LEAST(start_of_next_day, Date_time) AS datetime)))) AS diff
  FROM (
    SELECT * FROM (
        SELECT id, User_id, 
               CASE WHEN @puid IS NULL or @puid <> User_id THEN NULL ELSE @pdt END AS prev_date_time, @pdt := Date_time AS Date_time, 
               CASE WHEN @puid IS NULL or @puid <> User_id THEN NULL ELSE @pio END AS prev_in_out, @pio := in_out in_out,
               @puid := User_id
          FROM (SELECT * FROM events ORDER BY User_id, Date_time) e, 
               (SELECT @pdt := '1970-01-01 00:00:00', @pio := NULL, @puid := NULL) init ) tr
     WHERE prev_in_out = 1 and in_out = 0) event_ranges
    JOIN (
    SELECT @d start_of_day, 
           @d := date_add(@d, interval 1 day) start_of_next_day
      FROM (SELECT @d := date(min(Date_time)) FROM events) min_d,
           (SELECT x1.N + x10.N*10 + x100.N*100 + x1000.N*1000 
              FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) x1,
                   (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) x10,
                   (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) x100,
                   (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) x1000
             WHERE x1.N + x10.N*10 + x100.N*100 + x1000.N*1000 <= (SELECT date(max(Date_time)) - date(min(Date_time)) FROM events)) days_off) day_ranges
    ON prev_date_time < start_of_next_day AND Date_time >= start_of_day
 GROUP BY User_id,start_of_day;

I encountered a problem using sqlfiddle.com: it returned 00:00:00 if e.g. the time difference was exactly 24 hours (didn't matter if I used timediff or sec_to_time). I haven't seen this problem neither on MySQL 8 nor in db-fiddle.com (using MySQL 5.6). So, it might be, that you've to work around this problem.

EDIT: rewrote completely to solve the problem in MySQL 5.6 as requested by the OP.

EDIT #2: Updated the query to take sorting and grouping into account.

EDIT #3: changed initial assignment of the variables.

Mihe
  • 2,270
  • 2
  • 4
  • 14
  • Thanks much. Is there a version of this that can be used for 5.6? Also, the fiddle is grouping by userID, whereas Ideally I need a day based sum table. – aVC Aug 25 '22 at 20:46
  • Oh, I've to apologize that I didn't take the correct grouping into account. I've updated my answer. – Mihe Aug 26 '22 at 06:06
  • Thanks a ton for looking into this, I really appreciate the thought and time being put into this. I will try this out and update you. – aVC Aug 26 '22 at 15:26
  • That would be nice. One more thing: if you need the total for all users, just use this query as a subquery, `select start_of_day, sum(diff) from (... the query ...) x group by start_of_day`. – Mihe Aug 26 '22 at 15:40
  • Not sure If I am doing something wrong, threw a fiddle but nothing comes out. https://dbfiddle.uk/?rdbms=mysql_5.6&fiddle=a05de066cc288423b7411c604e642d73 – aVC Aug 27 '22 at 14:57
  • Not sure, what happened, perhaps I've pasted the wrong statement. I've updated the fiddle but don't know if you can see the changes, therefore I've updated the statement in my answer, too. I hope, it's correct, now. – Mihe Aug 27 '22 at 16:57