2

I'm looking to find the total time spent in a dataset

My data set could look something like this: (Updated dataset after first suggestions)

Starting time Ending time
44224,32869 44224,33603
44224,30975 44224,33616
44224,30965 44224,32824
44223,34859 44223,46875
44223,41349 44223,44875

This would give me a total time of 9000 - 4000,50 - 500 (from no time being worked between 5000 and 5500) = 4499,5

Finding the minimum starting time and the maximum ending time gives me the range of this set, from which I can subtract data if the next starting time is larger than the last ending time (such as in 5000 to 6000, subtract 1000). However, from the last data point it shows that 500 of these 1000 subtracted time units have been used in the total time. Is there any simple way to find the total time spent from such a data set? Programming in VBA Excel. Thanks for your time!

Currently using this code:

    For i = 7 to lastRow

            If timeEnd(i) < timeStart(i - 1) Then

                subtractTime = subtractTime + (timeStart(i - 1) - timeEnd(i))

            End If
            
            If timeStart(i) < firstTime Then

                firstTime = timeStart(i)
                
            End If
            
            
            If timeEnd(i) > lastTime Then
                
                lastTime = timeEnd(i)
                
            End If

            totalTimeSpent(i) = lastTime - firstTime - subtractTime

    Next i

    

Found a solution that works for my dataset: first sort the data so it is in order, then run the code above.

Henck
  • 23
  • 3
  • Please provide a sample of your code so we can help you with it. – Dean Mar 18 '21 at 15:16
  • Edited, thanks for commenting. – Henck Mar 18 '21 at 15:29
  • 1
    Have a look for gap and island analysis. Most of the references are to SQL implementations eg https://medium.com/analytics-vidhya/sql-classic-problem-identifying-gaps-and-islands-across-overlapping-date-ranges-5681b5fcdb8, but I did a formula-based solution a while ago https://stackoverflow.com/questions/53572815/duration-and-idle-time-for-a-server-from-continuous-dates/53579235#53579235 . Anyway, the first step is to sort the data by start time, then start identifying rows where the start time is later than the latest end time so far (gaps). – Tom Sharpe Mar 18 '21 at 16:03
  • Have drafted a gap and island solution - will post tomorrow when I have tidied it up a bit. It seems to work OK on your revised test data. – Tom Sharpe Mar 19 '21 at 22:46

3 Answers3

1

This is the long way around, it litarly loops the time from lowest to highest and counts if accounted for:

    Function mytime(stRng As Range, edRng As Range)
        Dim stArr() As Variant
        stArr = Intersect(stRng.Parent.UsedRange, stRng).Value
        
        Dim edArr() As Variant
        edArr = Intersect(edRng.Parent.UsedRange, edRng).Value
        If UBound(stArr, 1) <> UBound(edArr, 1) Then Exit Function
        
        Dim cnt As Long
        cnt = 0
        
        For i = Application.Min(stArr) To Application.Max(edArr)
            For j = 1 To UBound(stArr, 1)
                If i >= stArr(j, 1) And i < edArr(j, 1) Then
                    cnt = cnt + 1
                    Exit For
                End If
            Next j
        Next i
        
        mytime = cnt
        
    End Function

enter image description here

Scott Craner
  • 148,073
  • 10
  • 49
  • 81
  • Hi Scott, code works fine for this set, but I am working with continuous values (should have added these in my example data set, my bad) such as 48953,53 to 48953,89. I'll add a new non-simplified dataset. Thanks for your help! – Henck Mar 19 '21 at 07:31
1

Here is a Power Query solution (available in Windows Excel 2010+ and Office 365)

It makes a list of all the unique entries in list which span the time spent; and returns the count.

See the M-Code and annotations for more details. Also step through the Applied Steps

M Code

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Starting time", Int64.Type}, {"Ending time", Int64.Type}}),

    //add a list of the times. Since we want duration, exclude the last time in the end time.
    #"Added Custom" = Table.AddColumn(#"Changed Type", "timeList", each {[Starting time]..[Ending time]-1}),

    //remove duplicates and count the remainder
    uniqueTimes = List.Count(
                    List.Distinct(
                        List.Combine(#"Added Custom"[timeList]))),

    //convert to a table for display
    #"Converted to Table" = #table({"Time Spent"}, {{uniqueTimes}})
in
    #"Converted to Table"

enter image description here

And here is a VBA UDF (user defined function) that uses the same algorithm:

Option Explicit
Function timeSpent(starts As Range, ends As Range) As Long
    Dim vStart, vEnd
    Dim I As Long, J As Long
    Dim colTimes As Collection
    
'Transfer ranges to vba arrays for faster processing
vStart = starts
vEnd = ends

'Sanity Check
If UBound(vStart) <> UBound(vEnd) Then
    MsgBox "Start and End time ranges must be the same size"
    Exit Function
End If

'create unique list of times

Set colTimes = New Collection

'A collection object will return an error if you try to add an item with a duplicate key
On Error Resume Next
For I = 1 To UBound(vStart, 1)
    For J = vStart(I, 1) To vEnd(I, 1) - 1
        colTimes.Add Item:=J, Key:=CStr(J)
    Next J
Next I
On Error GoTo 0
        
timeSpent = colTimes.Count

End Function
Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
  • Hi Ron, code works fine for this set, but I am working with continuous values (should have added these in my example data set, my bad) such as 48953,53 to 48953,89. I'll add a new non-simplified dataset. Thanks for your help! – Henck Mar 19 '21 at 07:34
  • @Henck Yes, a much more complex algorithm needs to be used in that situation. – Ron Rosenfeld Mar 19 '21 at 12:23
  • Found a simpler solution: first sort the data so that it is in order, then run the original code. Works like a charm. – Henck Mar 19 '21 at 12:28
  • @Henck post back if you run into problems. Glad you have it working. – Ron Rosenfeld Mar 19 '21 at 13:18
1

The gap and island solution actually turns out to be a very short piece of code - most of it is output. I'm using the quicksort posted by Nigel Heffernan

Option Base 1
Option Explicit


Sub GapAndIsland()

    Dim varData As Variant
    Dim minStart As Variant
    Dim maxEnd As Variant
    Dim i As Long
    Dim gap As Variant
    


    ' Set the array

    varData = Range("f2:g6").Value

    ' *** Modified code to remove bad data ***
    
    ' Remove elements that are not number or date

    Call RemoveInvalid(varData, lastUsed)

    ' Sort https://stackoverflow.com/questions/4873182/sorting-a-multidimensionnal-array-in-vba

    Call QuickSortArray(varData, 1, lastUsed, 1)

    ' *** end of modified code ***

    ' Initialise
      
    minStart = varData(1, 1)
    maxEnd = varData(1, 1)
    gap = 0
    
    ' Loop over rows
    ' *** This line also modified ***

    For i = 1 To lastUsed
    
    ' If there is a gap, increment total gap length
    
        If (varData(i, 1)) > maxEnd Then
            gap = gap + varData(i, 1) - maxEnd
        End If
        
    ' Update latest end time
        
        If varData(i, 2) > maxEnd Then
            maxEnd = varData(i, 2)
            
        End If
        
    Next i
    
' Output

    Range("I1:j6").Clear
    Range("I1:j1").Font.Bold = True
    Range("I1:J1").HorizontalAlignment = xlCenter
    Range("j2:j3").NumberFormat = "dd/mm/yyyy hh:mm:ss"
    Range("j4:j6").NumberFormat = "[h]:mm:ss"
    
    
    Range("I1").Value = "Measure"
    Range("J1").Value = "Value"
    
    Range("i2").Value = "Start"
    Range("j2").Value = minStart
    
    Range("i3").Value = "End"
    Range("j3").Value = maxEnd
    
    Range("i4").Value = "Duration"
    Range("j4").Value = maxEnd - minStart
    
    Range("i5").Value = "Gaps"
    Range("j5").Value = gap
    
    Range("i6").Value = "Net"
    Range("j6").Value = maxEnd - minStart - gap
    
End Sub

This uses the same continuous test data as in the question, but formatted as date-times:

enter image description here

Here is a UDF version which just returns the net time on task.

Function TimeOnTask(R As Range) As Variant

    Dim varData As Variant
    Dim minStart As Variant
    Dim maxEnd As Variant
    Dim i As Long
    Dim gap As Variant
    


    ' Set the array

    varData = R.Value

    '  *** Modified code to remove bad data ***
    
    ' Remove elements that are not number or date

    Call RemoveInvalid(varData, lastUsed)

    ' Sort https://stackoverflow.com/questions/4873182/sorting-a-multidimensionnal-array-in-vba

    Call QuickSortArray(varData, 1, lastUsed, 1)

    ' *** end of modified code ***


    ' Initialise
      
    minStart = varData(1, 1)
    maxEnd = varData(1, 1)
    gap = 0
    
    ' Loop over rows

    ' *** This line also modified ***

    For i = 1 To lastUsed
    
    ' If there is a gap, increment total gap length
    
        If (varData(i, 1)) > maxEnd Then
            gap = gap + varData(i, 1) - maxEnd
        End If
        
    ' Update latest end time
        
        If varData(i, 2) > maxEnd Then
            maxEnd = varData(i, 2)
            
        End If
        
    Next i
    
    
    TimeOnTask = maxEnd - minStart - gap
    
End Function

** EDIT **

I have tried to make this more robust by adding a short routine to remove rows where the start time or stop time is not a number or date (please also see modified code above):

Sub RemoveInvalid(ByRef arr As Variant, ByRef lastUsed As Long)
    Dim i As Long
    Dim j As Long
    
    j = 0
    
    For i = 1 To UBound(arr, 1)

    ' Increment and copy row if valid

        If (IsNumeric(arr(i, 1)) Or IsDate(arr(i, 1))) And Not (IsEmpty(arr(i, 1))) And _
           (IsNumeric(arr(i, 2)) Or IsDate(arr(i, 2))) And Not (IsEmpty(arr(i, 2))) Then
            j = j + 1
            arr(j, 1) = arr(i, 1)
            arr(j, 2) = arr(i, 2)
        End If
    Next i
    
    lastUsed = j
              
        
End Sub

It's not necessary to redim the array because you can just specify the last used row in the call to the quicksort and in the gap and island loop.

Tom Sharpe
  • 30,727
  • 5
  • 24
  • 37