0

I have tried to use datediff however it only gives the hour difference between two dates. I would like to receive only the hours that are during the working days and working hours. Example provided below:

Start Date: 19/08/2022 09:42:13

End Date: 22/08/2022 09:54:22

Outcome I receive from Datediff("HH", [Start Date], [End Date]) = 72.20

The actual outcome I would like would be approx. = 8.13

  • It looks like you need a script. Is TERR available to you? As a general comment, there is no unique definition of working hours (and not necessarily or working days either). You need to provide one and consider holidays too (which complicates it a lot) – Gaia Paolini Oct 25 '22 at 15:15
  • Hi! I do have TERR and the excel function "NETWORKDAYS" I just cannot figure this one out since the functions in spotfire do not take dates in median calculations. If you have any it would be really helpful. – Spotfire_new Oct 26 '22 at 08:35

2 Answers2

1

OK so you mean something like NetWorkHours rather than days. (I don't understand your comment about medians.)

Assuming you have a data table with a column for the start date and a column for the end date (and these are of DateTime data type) you could create a TERR expression that takes the two dates as input and produces a number of worked hours as output.

I have used this answer as a start: How to calculate networkdays minus holidays between 2 dates

but that does not cover hours. So here is my suggested solution. The idea is to initially remove the start and end day (which are incomplete days) and calculate the number of whole days minus weekends as in the previous solution. Then simply multiply it by the working hours in a day. Then take the first and last day and calculate the hours worked. Then add the two together.

Create a TERR expression function (from menu Data > Data Function Properties > Expression Function)

#start and end of working hours
startShift=8 
endShift=17
#fill vector with holiday dates if required. Example:
holidayDates <- c(as.Date('18/04/2022',format='%d/%m/%Y'),as.Date('29/08/2022',format='%d/%m/%Y'))

#count complete days excluding holidays and weekends
allDays = seq.Date(from=as.Date(input1)+1, to=as.Date(input2)-1, by=1)
nonHolidays = as.Date(setdiff(allDays, holidayDates), origin="1970-01-01")
weekends =nonHolidays[weekdays(nonHolidays) %in% c("Saturday", "Sunday")]
nonHolidaysWeekends = as.Date(setdiff(nonHolidays, weekends), origin="1970-01-01")
hoursCompleteDays = length(nonHolidaysWeekends) *(endShift-startShift)

#count worked hours for first and last day
beginTime = as.POSIXlt(input1)
beginHour = beginTime$hour + beginTime$min/60
endTime = as.POSIXlt(input2)
endHour = endTime$hour + endTime$min/60
hoursFirstAndLastDay = (endShift-beginHour)+(endHour-startShift)

#add together
output = hoursCompleteDays + hoursFirstAndLastDay   

call the TERR expression function e.g. TERR_netWorkingHours. This will give you the total hours worked.

Use it by creating a calculated column as:

TERR_netWorkingHours([startDate],[endDate])

where [startDate] and [endDate] are your original columns.

Gaia Paolini
  • 1,044
  • 1
  • 5
  • 4
  • Hi and thank you for a very thorough explanation I have to be honest tried EXACTLY this! It just do not seem to become correct. Like when I tried yours it still gives the wrong output. I cannot find anything wrong in the calculations that you either have made. For an example using your method gives the following outcome: Start: 14/10/2022 06:39:16 End; 14/10/2022 07:49:02 Outcome: 10.17 What it should be is 0 in this case. – Spotfire_new Oct 27 '22 at 15:11
  • yes I did not consider the case in which the start and end day are the same. Let me have a look – Gaia Paolini Oct 27 '22 at 16:42
  • I think the issue lays in more than that. Another example is that startDate: 20/06/2022 07:56:48 endDate: 28/06/2022 13:54:05 Gives 14.97 which is way to low. it should be around 60 I believe. I do not understand Spotfire calculations but logically I see no issue with the way you have written it – Spotfire_new Oct 27 '22 at 18:04
1

There are two main reasons why my previous answer did not work. Firstly, I tried to adapt an existing answer that worked for full days, but there are too many boundary cases when expanding it to fractional days (e.g. what if my start or end dates are the same, what if they fall over a weekend etc...). Secondly, a TERR expression function expects to work with vectorized inputs, which does not really work with a scenario with so many exceptions on the values of the inputs.

I think it works now (at least for my examples) if I create a TERR data function instead, which outputs a whole new table. I used the R library data.table to make it a bit more efficient. I heavily modified the algorithm to vectorize the steps into a temporary data table (schedule_df). There may be a more clever way, but I did not find it.

You might be able to just output a column by modifying I/O.

Here it is, hope it helps:

suppressWarnings(suppressPackageStartupMessages(library(data.table)))
setDT(dt)

######## main function
netWorkingHours = function(input1, input2) {
  #Helper function
  extractHour = function(x) {
    x = as.POSIXlt(x)
    return (x$hour + x$min/60)
  }
  #prepare ---
  dotimes=FALSE
  #start and end of working hours
  startShift=8 
  endShift=17
  weekend = c('Saturday','Sunday')
  
  #process
  input1d  = as.Date(input1)
  input2d  = as.Date(input2)
  #list all days including start and end
  allDays = seq.Date(from=input1d, to=input2d, by=1)
  Ndays=length(allDays)
  #flag included days: if they are not weekends
  #can be expanded to holidays
  include=ifelse(weekdays(allDays) %in% c('Saturday','Sunday'),0,1)
  
  #start building schedule
  schedule_df=data.table(day=allDays,include=include)
  schedule_df$index=c(1:Ndays)
  #identify boundary days
  schedule_df[,boundary:=ifelse(index==1 | index==Ndays,index,0)]
  
  #initialize working hours
  schedule_df$start=startShift
  schedule_df$end=endShift
  #modify start and end hours for boundary days
  
  schedule_df[boundary==1 & max(boundary)>1, start :=extractHour(input1)]
  schedule_df[boundary==1 & max(boundary)>1, start :=extractHour(input1)]
  schedule_df[boundary==1 & max(boundary)>1, end :=endShift]
  schedule_df[boundary==1 & max(boundary)==1, start :=extractHour(input1)]
  schedule_df[boundary==1 & max(boundary)==1, end :=extractHour(input2)]
  schedule_df[boundary>1 , start :=startShift]
  schedule_df[boundary>1 , end :=extractHour(input2)]
  
  #shift start and end hours by shift limits
  schedule_df[,start1:=sapply(start,function(x) max(startShift,x))]
  schedule_df[,end1  :=sapply(end,function(x) min(endShift,x))]
  
  #calculate worked hours for each day
  schedule_df$worked_hours=0
  schedule_df[include==1,worked_hours:=ifelse(end1>start1,end1-start1,0)]
  Nincluded = nrow(schedule_df[include==1])
  output = ifelse(Nincluded>0,sum(schedule_df[include==1,'worked_hours']),0)
  return (output)
}

######################## main

dt[,workedHours:= mapply(netWorkingHours,dt[['date1']],dt[['date2']])]
Gaia Paolini
  • 1,044
  • 1
  • 5
  • 4
  • Hi, when I try this expression i receive the error that it cannot convert "dt" to data.table by reference beause binding is locked. Did you receive this for the code as well? – Spotfire_new Mar 08 '23 at 10:51
  • No, I have never seen this before. Are you trying it within Spotfire or in RStudio? – Gaia Paolini Mar 08 '23 at 12:18
  • This is for spotfire I have also downloaded the package data.table. It says that i should "Try copying the object to your currenct environment, ex: var <- copy(var) and then using setDT again. I do really not understand what they mean though. – Spotfire_new Mar 08 '23 at 13:01
  • it might be the way you import the input parameter. I cannot tell without seeing how you define it – Gaia Paolini Mar 08 '23 at 13:41
  • I am using the TERR-function you wrote above and when calling it i am typing; Networkinghours([Start_date], [End_date]) – Spotfire_new Mar 08 '23 at 14:11
  • my first answer was a TERR expression function, which did not work too well. My second answer is a TERR data function, which you call in a different way. Please check for instance Dr Spotfire sessions like https://www.youtube.com/watch?v=IMw4P3INThU – Gaia Paolini Mar 08 '23 at 14:32
  • So for this error i added on top of setDT(dt), "dt=data.table()". This makes the function being executed. I however just receive an empty column when executing this? I am using two filled columns of datetimes and I am taking an column as an output. So the "workedHours" comes out just empty. Is there anything I am missing here? – Spotfire_new Mar 09 '23 at 13:33