MySQL - Average difference between timestamps, excluding weekends and out of business hours

Question

I'm looking for the ability to average the difference between timestamps, excluding weekends and excluding out of business hours (only between 08:00:00 - 17:00:00).

I'm trying to get this working using just a query but can fallback to a PHP function if it's not possible using MySQL

Below is the current function I am using to get the average timestamp differences.

Eg. The below query will return the difference between Friday 8am to Monday 5pm as 81 hours, it needs to return 18 hours instead, because it should exclude the weekends and weekdays out of office hours.

SQLFIDDLE LINK

SELECT 
    clients.name, 
    avg(TIMESTAMPDIFF(HOUR, jobs.time_created, jobs.time_updated)) AS average_response, 
    avg(TIMESTAMPDIFF(HOUR, jobs.time_created, jobs.time_closed)) AS average_closure, 
    count(jobs.id) AS ticket_count, 
    SUM(time_total) AS time_spent 
FROM 
    jobs
LEFT JOIN 
    clients ON jobs.client = clients.id 
WHERE 
    jobs.status = 'closed' 
GROUP BY 
    jobs.client

I've looked at other questions but they don't seem to work with hours in timestamps, only dates.

Result

I am now using the below stored function to achieve my desired result. It will ignore time outside of business hours (08:00:00 - 17:00:00) and ignore weekends. It will essentially only calculate the business hour difference between two timestamps.

DROP FUNCTION IF EXISTS BUSINESSHOURSDIFF;
DELIMITER $$
CREATE FUNCTION BUSINESSHOURSDIFF(start_time TIMESTAMP, end_time TIMESTAMP)
RETURNS INT UNSIGNED
BEGIN
IF HOUR(start_time) > 17 THEN SET start_time = CONCAT_WS(' ', DATE(start_time), '17:00:00');
END IF;
IF HOUR(start_time) < 8 THEN SET start_time = CONCAT_WS(' ', DATE(start_time), '08:00:00');
END IF;
IF HOUR(end_time) > 17 THEN SET end_time = CONCAT_WS(' ', DATE(end_time), '17:00:00');
END IF;
IF HOUR(end_time) < 8 THEN SET end_time = CONCAT_WS(' ', DATE(end_time), '08:00:00');
END IF;
RETURN 45 * (DATEDIFF(end_time, start_time) DIV 7) + 
          9 * MID('0123455501234445012333450122234501101234000123450', 
                  7 * WEEKDAY(start_time) + WEEKDAY(end_time) + 1, 1) + 
          TIMESTAMPDIFF(HOUR, DATE(end_time), end_time) - 
          TIMESTAMPDIFF(HOUR, DATE(start_time), start_time);
END $$
DELIMITER ;

`WHERE HOUR(jobs.time_created) >8 AND HOUR(jobs.time_created) <17` etc ?? — , Mar 03 '16 at 00:23
@Dagon This won't work, this will only look for timestamps greater than 8 am and before 5pm. I have edited the question with an example of current outcome and desired outcome. — Matt, Mar 03 '16 at 00:55
@Strawberry I have added an sqlfiddle, with the schema, the query, and the current result. The expected result is in the bold in my question. — Matt, Mar 07 '16 at 00:34
@Matt Dragon probably mean this .. you have filter time by valid business time and exclude (WHERE time not is SAT or SUN ... and then you can sum the hours — daremachine, Mar 07 '16 at 00:48
@Strawberry sorry .. It is hard to know. I thought so every record is daily not start and end. Let me think about this. It's interesting problem. — daremachine, Mar 07 '16 at 01:00
@Matt, So you want to know three things: 1. The number of valid hours on the first day. 2. The number of valid hours on the last day. 3. The number of valid days in between. Right? From there, it's a simple calculation. (You may also have to allow for the fact that the first day and the last day are actually the same day!) — Strawberry, Mar 07 '16 at 01:11

score 4 · Accepted Answer · edited Mar 07 '16 at 04:05

Its possible, but very very ugly using sql only. However, if you can use stored functions, then its quite pretty to look at as well.

From the SO question you linked in your question, we know the following expression calculates the number of weekdays between two dates:

5 * (DATEDIFF(@E, @S) DIV 7) + 
    MID('0123455501234445012333450122234501101234000123450', 
        7 * WEEKDAY(@S) + WEEKDAY(@E) + 1, 1)

If we multiply this expression by 9, i.e. # working hours per working day, we get the business hours diff. Adding the hours adjustment between the two timestamps gives us the final expression which we can then average

45 * (DATEDIFF(@E, @S) DIV 7) + 
      9 * MID('0123455501234445012333450122234501101234000123450', 
              7 * WEEKDAY(@S) + WEEKDAY(@E) + 1, 1) + 
      TIMESTAMPDIFF(HOUR, DATE(@E), @E) - 
      TIMESTAMPDIFF(HOUR, DATE(@S), @S)

So, the ugly but working query is:

SELECT 
  clients.name
, AVG(45 * (DATEDIFF(jobs.time_updated, jobs.time_created) DIV 7) + 
          9 * MID('0123455501234445012333450122234501101234000123450', 
                  7 * WEEKDAY(jobs.time_created) + WEEKDAY(jobs.time_updated) + 1, 1) + 
          TIMESTAMPDIFF(HOUR, DATE(jobs.time_updated), jobs.time_updated) - 
          TIMESTAMPDIFF(HOUR, DATE(jobs.time_created), jobs.time_created)) AS average_response
, AVG(45 * (DATEDIFF(jobs.time_closed, jobs.time_created) DIV 7) + 
          9 * MID('0123455501234445012333450122234501101234000123450', 
                  7 * WEEKDAY(jobs.time_created) + WEEKDAY(jobs.time_closed) + 1, 1) + 
          TIMESTAMPDIFF(HOUR, DATE(jobs.time_closed), jobs.time_closed) - 
          TIMESTAMPDIFF(HOUR, DATE(jobs.time_created), jobs.time_created)) AS average_closure
, COUNT(jobs.id) AS ticket_count 
, SUM(time_total) AS time_spent 
FROM jobs
LEFT JOIN clients ON jobs.client = clients.id 
WHERE jobs.status = 'closed' 
GROUP BY jobs.client

A better alternative would be to create a stored function that handles the business hours diff logic.

DROP FUNCTION IF EXISTS BUSINESSHOURSDIFF;
DELIMITER $$    
CREATE FUNCTION BUSINESSHOURSDIFF(start_time TIMESTAMP, end_time TIMESTAMP) 
RETURNS INT UNSIGNED
BEGIN
RETURN 45 * (DATEDIFF(end_time, start_time) DIV 7) + 
          9 * MID('0123455501234445012333450122234501101234000123450', 
                  7 * WEEKDAY(start_time) + WEEKDAY(end_time) + 1, 1) + 
          TIMESTAMPDIFF(HOUR, DATE(end_time), end_time) - 
          TIMESTAMPDIFF(HOUR, DATE(start_time), start_time);
END $$
DELIMITER ;

And then invoke this as required.

SELECT 
    clients.name
  , avg(BUSINESSHOURSDIFF(jobs.time_created, jobs.time_updated)) AS average_response
  , avg(BUSINESSHOURSDIFF(jobs.time_created, jobs.time_closed)) AS average_closure
  , count(jobs.id) AS ticket_count
  , SUM(time_total) AS time_spent 
FROM jobs
LEFT JOIN clients ON jobs.client = clients.id 
WHERE jobs.status = 'closed' 
GROUP BY jobs.client;

You've omitted the business hours part for the first and last day (hint: greatest(), least()) And I think the correct string is: 0123455401234434012332340122123401101234000123450 — Strawberry, Mar 07 '16 at 01:17
Sorry I didn't follow what that meant @Strawberry, could you please elaborate. thanks. Are you implying that `time_created`, `time_updated` & `time_closed` can be outside business hours, and in those cases only the business hour difference is required? — Haleemur Ali, Mar 07 '16 at 01:19
@HaleemurAli Fantastic work, I am using the stored function in MySQL. It is indeed possible that `time_created`, `time_updated` & `time_closed` can be outside business hours, is it possible the query can only calculate the business hour difference? — Matt, Mar 07 '16 at 02:39
@Matt, The base of the solution is already there, and the final correction for times outside of business hours is fairly trivial, try it, and if you encounter any problems, please post on SO — Haleemur Ali, Mar 07 '16 at 03:19
@HaleemurAli I've actually modified the stored function to achieve what I wanted, I just wanted you to take a look at it (rather new with stored functions, not sure if I can improve on the syntax) - http://pastebin.com/SCXTn50j — Matt, Mar 07 '16 at 03:20
@HaleemurAli By the way, I will award you the bounty in 20 hours. — Matt, Mar 07 '16 at 03:20
@Matt, as far as the function goes, it seems alright. But please tell me what the adjustment should be in the following cases: start = saturday @ 7 am, start = tuesday @ 6 am, start = wednesday @ 8 pm, end = sunday @ 4 pm, end = tuesday @ 6 am, end = wednesday @ 8 pm. — Haleemur Ali, Mar 07 '16 at 03:24
@HaleemurAli I've just tested those examples and everything is working as intended. They will result in 0 hours being returned. — Matt, Mar 07 '16 at 03:37

DRapp · Answer 2 · 2016-03-07T04:55:39.250

Ok, This might really hurt your head some using MySQL @variables. They work like an inline program statement in that when you assign via :=, they can be used in the next sql column being queried thus simplifying your logic without heavy date math all the time.

First, here is the entire query. Then I'll break it down...

select
        pq.id,
        pq.client,
        c.name,
        sum( pq.UpdHours ) as ResponseHours,
        sum( pq.dayHours ) as TotHours,
        sum( pq.TimeOnlyOnce ) as TotalTime
    from
(select
        j.id,
        j.client,
        j.time_created, 
        j.time_updated,
        if( jdays.DaySeq = 0, time_total, 0 ) as TimeOnlyOnce,
        @justDay := date_add( date( j.time_created ), interval jdays.DaySeq day ) as JustTheDay,
        @dtS := date_add( @justDay, interval 8 hour ) as StoreOpen,
        @dtE := date_add( @justDay, interval 17 hour ) as StoreClosed,
        @isWkDay := IF( DAYOFWEEK(@justDay) in ( 1, 7 ), 0, 1 ) as IsWeekDay,
        @dtST := greatest( j.time_created, @dtS ) as StartTime,
        @dtUpd := least( j.time_updated, @dtE ) as TimeUpdate,
        @dtET := least( j.time_closed, @dtE ) as EndTime,
        if( @isWkDay, TIMESTAMPDIFF( HOUR, @dtST, @dtUpd ), null ) as UpdHours,
        if( @isWkDay, TIMESTAMPDIFF( HOUR, @dtST, @dtET ), null ) as dayHours,
        jdays.DaySeq
    from
        jobs j
          JOIN ( select @dayLoop := @dayLoop +1 as DaySeq
                   from jobs js,
                      ( select @dayLoop := -1 ) sqlvars
                   limit 10 ) jdays
            ON jdays.DaySeq <= TIMESTAMPDIFF( DAY, j.time_created, j.time_closed),
        ( select 
                @justDay := '2016-01-01',
                @dtS := @justDay,
                @dtE := @justDay,
                @dtST := @justDay,
                @dtET := @justDay,
                @dtUpd := @justDay,
                @isWkDay := 0) sqlvars2
    order by
        j.id,
        j.client,
        jdays.DaySeq) pq
           LEFT JOIN clients c 
                  ON pq.client = c.id 
    group by
        pq.id

First, I am starting with the inner-most query with

JOIN ( select @dayLoop := @dayLoop +1 as DaySeq
          from jobs js,
               ( select @dayLoop := -1 ) sqlvars
          limit 10 ) jdays

This builds a sub-table alias "jdays" to represent just a day sequence from 0 to 10 days (if you need more than 10 days for any single activity, just extend the limit). I am starting the @dayLoop with -1, so when joining to your jobs table (assuming in reality it will have more than 10 records), it will grab 10 rows with values respectively 0, 1, 2, --- 9. This prevents needing some bogus table on how many records to represent the total time a given job may run, do it on the fly.

Next is the join between the JOBS table, the above subquery representing multiple days which IS INTENTIONAL to create a Cartesian result, and the next part

( select 
        @justDay := '2016-01-01',
        @dtS := @justDay,
        @dtE := @justDay,
        @dtST := @justDay,
        @dtET := @justDay,
        @dtUpd := @justDay,
        @isWkDay := 0) sqlvars2

Which does nothing but create some variables representing just the day in question, the date/time the store hours open/end (8am/5pm), the specific ticket id start/end time if continuation between days, also the update time (response time), and a flag column if the day in question is a week day or not. This just declares the variables within the sql statement without having to external declares.

Now, the next level query where I am using all the @variables. Think of it as analyzing each individual row one at a time and getting Cartesian result against the jDays alias result.

I want to ONLY look at your SECOND ticket ID

ID    Client  time_created         time_updated         time_closed          time_total 
6412  106     2016-03-04 08:00:00  2016-03-07 08:00:00  2016-03-07 17:00:00   .25

If you run this INNER QUERY ALONE, for this SINGLE ID, the join to the jDays table is based on the total days from created to closed is GREATER than the jDays values (such as 0, 1, 2, 3,....). Why create multiple rows? Because each day needs to be assessed on its own merits. So, getting to one data element at a time, I am computing the total_time record only ONCE so that is based on the IF() for the daySeq = 0, so it doesn't get accounted for multiple times when split against different rows. (Mar 4, 5, 6 and 7th)

if( jdays.DaySeq = 0, time_total, 0 ) as TimeOnlyOnce,

Now the dates. Just for grins, lets assume our time_created is actually some mid-afternoon value, such as 2016-03-04 13:15:00 (1:15pm afternoon). I want JUST THE DAY stripping the time portion. Date( j.time_created ) returns only the date portion.

@justDay := date_add( date( j.time_created ), interval jdays.DaySeq day ) as JustTheDay,

results in '2016-03-04'. Now, I add 8 hours and 17 hours respectively for when the store opens and closes and would result with the following, and a flag if weekend or not.

@dtS := date_add( @justDay, interval 8 hour ) as StoreOpen,
@dtE := date_add( @justDay, interval 17 hour ) as StoreClosed,
@isWkDay := IF( DAYOFWEEK(@justDay) in ( 1, 7 ), 0, 1 ) as IsWeekDay,

JustTheDay  StoreOpen (8am)     StoreClosed  (5pm)
2016-03-04  2016-03-04 8:00:00  2016-03-04 17:00:00

From these baseline values for a given day (and would be repeated for Mar 5, 6 and 7), we now want to know when the ticket time STARTS, was UPDATED and ENDED (closed). So the START time is the GREATER of the time created OR the day start. In the MODIFIED start time per my example, the START time for the ticket would actually be the 1:15pm time and not the 8am of original data, just to give context. The updated and end time are based on the LEAST time. So, since the updated and closed are on Monday following the weekend, we want to stop the clock as of 5pm of the one day (Mar 4th). Similar for the time closing.

So now for the per-row being processed, I can use these START, UPDATE, and END times for TIMESTAMPDIFF() for THE SINGLE DAY. But if it is a weekend, use Null as no time is applicable to the computation.

@dtST := greatest( j.time_created, @dtS ) as StartTime,
@dtUpd := least( j.time_updated, @dtE ) as TimeUpdate,
@dtET := least( j.time_closed, @dtE ) as EndTime,
if( @isWkDay, TIMESTAMPDIFF( HOUR, @dtST, @dtUpd ), null ) as UpdHours,
if( @isWkDay, TIMESTAMPDIFF( HOUR, @dtST, @dtET ), null ) as dayHours,

Now, since the one ticket spans multiple dates, I would have 4 records as below

I have extra columns so you can see the logical flow of the records. Now that we have each ticket (without a where clause), the inner query creates multiple rows each representing one day of the ticket. Now you just SUM the total hours, the hours before the customer was notified and total time (of which only first entry exists per breakdown) and group by the ticket. So this gives each ticket's total response, closure, time.

I know you already checked one answer that works, but hope you like this option too :) Might even be easier to follow and dissect.

For adjusting the respective day of week start/end time, just update the date_add components for # hours based on a given day of week instead of the fixed 8 and 17 respectively. This also considers spanning multiple days inclusive of weekends.
so now the entire thing wrapped-up per client's ticket ID

select
      QryPerID.client,
      QryPerID.name,
      avg( QryPerID.ResponseHours ) as AvgResponseHours,
      avg( QryPerID.TotHours ) as AvgTotHours,
      sum( QryPerID.TotalTime ) as TotalTime,
      count(*) as ClientTickets
   from
      ( entire previous query ) QryPerID
   group by
      QryPerID.client,
      QryPerID.name

score 0 · Answer 3 · answered Mar 07 '16 at 23:55

Build and populate table

CREATE TABLE BusinessDays (
    day DATE NOT NULL,
    PRIMARY KEY (day)
) ENGINE=InnoDB;

It will contain the dates of all future work days. You can remove any national holidays, etc, as needed. (This may be a bonus feature of this solution.)

Your table has start_dt and end_dt as DATETIME and you want to calculate the amount of time between them by your rules.

The following is for readability; it can be combined into a single query for speed/compactness:

-- Worry about intervening days:
SELECT @days := COUNT(*) - 2
    FROM YourTable yt
    JOIN BusinessDays a  ON a.day >= DATE(yt.start_dt)
    JOIN BusinessDays z  ON z.day <= DATE(yt.end_dt);

-- Get hours in first and last days:
SELECT @secs := TIME_TO_SEC(TIMEDIFF(TIME(start_dt), '08:00:00')) +
                TIME_TO_SEC(TIMEDIFF('17:00:00', TIME(end_dt)))
    FROM YourTable;

-- Generate answer:
SELECT @days * 9 + @secs/3600 AS 'Hours';

Simple enough?

I did not want to generate a time-type ouput such as 123:30:00 because this overflows at 840 hours.

MySQL - Average difference between timestamps, excluding weekends and out of business hours

Result

3 Answers3

Linked