0

I have data in columns 1 and 2 of Sheet 1 (115 rows). These are my reference values. I have data in columns 3 and 4 in ActiveSheet (10000+). These are my raw data values.

The code I have below will find the closest raw data value to the reference value by taking the difference, finding the values between 0 and 15 minutes, finding the minumum value and copying other data from that row over to another location.

I use 2 auto filters in the process to do this for every reference value, so it happens 115 times. My questions are as follows:

  1. Are adding/remove the filters slowing down my search? Should I just look through all 10,000 data values rather than filtering for the same day first?
  2. When I find the value of the minimum value of a filtered list, is there a better way to quickly copy over data from other columns of the same row?

I have added a timer at the end of this code to help quantify. I hope you can help!!

Sub UpdatedTimeMatcherwithFilters()
    'Make the code as fast as possible
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual

    Dim MinValue, MinRow As Integer
    Dim searchRange, Rng As Range
    Dim elapsedTime As Integer
    startTime = Time
    Set searchRange = Range("G1:G1697")

    'Count the reference values
    TotalRefVal = Worksheets("Sheet1").Cells(Rows.Count, 5).End(xlUp).Row

    'Count the Raw Data Values
    Set Rng = Range("C2:C1000")
    'TotalRawDataVal = ActiveSheet.Cells(Rows.Count, 3).End(xlUp).Row

    'Format the Columns as needed
    Columns("G:G").Select
    Selection.NumberFormat = "[h]:mm:ss;@"
    Columns("I:I").Select
    Selection.NumberFormat = "[$-409]m/d/yy h:mm AM/PM;@"
    Columns("H:H").Select
    Selection.NumberFormat = "m/d/yy;@"

    For j = 2 To TotalRefVal
        'Filter for date of reference value
        ActiveSheet.Range("$C:$C").AutoFilter Field:=1, Operator:=xlFilterValues, _
        Criteria2:=Array(2, "10/10/2014")

        'Find the difference in times between Reference Data and Raw Data,
        'put the difference value in the same row
        For Each d In Rng.SpecialCells(xlCellTypeVisible)
            ActiveSheet.Cells(d.Row, 7) = Worksheets("Sheet1").Cells(j, 5) - d
        Next d

        'Turn off filter in column C
        ActiveSheet.AutoFilterMode = False

        'Filter the Time Differences between 0 and 15 minutes
        ActiveSheet.Range("$G:$G").AutoFilter Field:=1, Criteria1:=">0:00:00", _
        Operator:=xlAnd, Criteria2:="<0:15:00"

        'Find the Minimum value between 0 and 15 minutes
        MinValue = Application.Min(searchRange.SpecialCells(xlCellTypeVisible))

        'Find the Row of the value of the minimum difference
        MinRow = Application.Match(MinValue, searchRange, 0)

        'Copy the Data from columns C and D of minimum value's row
        Cells(j, 9).Value = Cells(MinRow, 3)
        Cells(j, 10).Value = Cells(MinRow, 4)
        ActiveSheet.AutoFilterMode = False
    Next j

    stopTime = Time
    elapsedTime = (stopTime - startTime) * 24 * 60 * 60
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
    MsgBox "Elapsed time, screen updating on: " & elapsedTime & _
    " sec."

    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
End Sub
L42
  • 19,427
  • 11
  • 44
  • 68
Trolley_Theif
  • 69
  • 1
  • 1
  • 8
  • This code currently takes 5 seconds per reference value which adds up to about 600 seconds, and will get longer when I add 20,000 more timestamps to my raw data. thats why I chose to filter it per day. – Trolley_Theif Oct 22 '14 at 00:30
  • 2
    This sounds like it is more suitable for codereview.stackexchange.com – Parker Oct 22 '14 at 00:30
  • Might be more efficient to load all your reference values as keys in a dictionary, then scan though the data one time (maybe after first pulling it into an array), updating the dictionary values whenever you hit a row with that reference value and a shorter time period. You can use a struct or simple class instance as the dictionary value, so you can store both the shortest time period and the corresponding row number where that was found. Not sure if you need to handle ties. – Tim Williams Oct 22 '14 at 00:40
  • Filter unfilter does not slow down the search but the code execution (even if you add *ScreenUpdating = False*). I don't think *Calculation* improves anything so might as well remove that. And yes, @TimWilliams is right to evaluate it altogether. Dictionary offers a lot of advantages, but I think you can also use simple arrays which ever you're most comfortable. Also [check this out](http://stackoverflow.com/questions/10714251/how-to-avoid-using-select-in-excel-vba-macros) which will improve your code. – L42 Oct 22 '14 at 01:06
  • @L42 thanks for that link, def great knowledge for improving my run time. I can use arrays but whats my approach? – Trolley_Theif Oct 22 '14 at 20:57
  • @TimWilliams I will learn about dictionary, I don't know how to use that right now. thanks! – Trolley_Theif Oct 22 '14 at 20:57

1 Answers1

1

Why don't you order your raw data first (they are dates/time right?)? Because then you could partition your raw data into sectors, and a given reference value would only look in that sector. It doesn't take that long to order a set of data by a column (and even do a secondary sorting on another column).

The advantage being you only have to do it once for however many references you'll use on those ordered data...

EDIT to make answer more explicit (after comment )

No. I don't think you need to re-organized your data all that much at all. See this excel file: enter image description here

I'm working on barcodes, but you could order by dates as well or any other value. The "Random" column is there just so that I can replace my data in random orders after I've tested my code.

Let's say I'm sorting it all by barcodes (there's 200 rows in this sheet, but you could base the principle on however many rows you have), with something like this. First, you need to call a function that sorts all your raw data. You can do primary sort (I sort by column A first), then in case of equality I have a 2nd sorting value. You can have more than that, just search SORT method:

Private Sub sorting_all()
Dim test As Range
Set test = Range("J" & 200)

Sheet1.Range("A1", Sheet1.Cells(200, 10)).Sort key1:=Sheet1.Range("A1"), order1:=xlAscending, key2:=Sheet1.Range(Columns(8).Address()), order2:=xlDescending, Header:=xlYes, Orientation:=xlSortColumns

End Sub

Then you would have a function that finds a given value:

Function findValue(myValue As Long) As Range
  Dim numIntervales As Integer, startAt As Long, i As Integer, myIntervales As Variant, cutoff As Long 'as Long not Integer so you don't overflow
  numIntervales = 4 'or whatever, set this according to your data. Could determine this programatically
  myIntervales = getIntervales(numIntervales)
  For i = 1 To numIntervales - 1    'Because if you want 4 intervales, that means 3 cutoff points
    cutoff = Sheet1.Cells(myIntervales(i), 1).Value
    If myValue <= cutoff Then
      startAt = myIntervales(i - 1)   'If myValue < cutoff #1, then you want to start at myIntervales point between 0-1
      Exit For
    ElseIf i = numIntervales Then
      startAt = myIntervales(numIntervales - 1)
    End If
  Next i
  Set findValue = Sheet1.Cells.Find(What:=myValue, After:=Sheet1.Cells((startAt + 1), 1), LookIn:=xlFormulas, LookAt:= _
  xlPart, SearchOrder:=xlByRows, SearchDirection:=xlPrevious, MatchCase:=False)
End Function

That functions rely on finding the range of your data, so this function:

Function findRange(mySheet As Worksheet, byRow_Or_byCol As String) As Range
'Just to find the extend of your data

If byRow_Or_byCol = "byRow" Then
  Set findRange = mySheet.Cells.Find(What:="*", After:=mySheet.Cells(1, 1), LookIn:=xlFormulas, LookAt:= _
  xlPart, SearchOrder:=xlByRows, SearchDirection:=xlPrevious, MatchCase:=False)
Else
  Set findRange = mySheet.Cells.Find(What:="*", After:=mySheet.Cells(1, 1), LookIn:=xlFormulas, LookAt:= _
  xlPart, SearchOrder:=xlByColumns, SearchDirection:=xlPrevious, MatchCase:=False)
  End If

End Function

And also on dividing up your raw data into appropriate intervales, so something like this:

Function getIntervales(numIntervales) As Variant
Dim myIntervales() As Integer, i As Integer, myRange As Range, myStep As Long
Set myRange = findRange(Sheet1, "byRow")
myStep = Round(myRange.Row / (numIntervales)) 'In my case, 200 rows & 4 intervales = 50 rows per intervale

For i = 0 To (numIntervales - 1)  'Because array index start @ 1
  ReDim Preserve myIntervales(i)  '
  myIntervales(i) = myStep * (i) + 1 'Because row(0) doesn't exist
Next i

getIntervales = myIntervales

End Function

Finally you need some sort of main to sort coordinate all this:

Private Sub main()
  Call sorting_all
  MsgBox ("This is you result " & findValue(Sheet2.Cells(1, 1).Value).Row)
End Sub

That's partly based on some random test code for a project I'm on, so it's not very finished but you'll get the idea (for example the range for the sort method is hard-coded, obviously you want to find it programatically).

You could get fancier with this, for example when you import your data you just add the data at the end of the (already sorted) data, and then call the sorting function only on newly added data (so you don't sort 10 000 rows if you only added 900 unsorted rows to it).

You could also determine programatically the intervales you set, for example by saying you want intervales on 1000 or 2000 rows (just determines what works best for speed/efficiency).

Finally, if you're going to be adding lots of data, you could as you mentionned devise a system so that once your sheet gets to 10 000 rows (for example), it starts a new sheet and works off that one... Again if the data are sorted, you can code it so you know that Sheet1 has data from DATE XXXX to DATE YYYY, etc.

logicOnAbstractions
  • 2,178
  • 4
  • 25
  • 37
  • thats exactly what I was thinking by filtering. Are you saying I should put each sector into a different sheet? Or I could split the rawdata by columns for each day – Trolley_Theif Oct 22 '14 at 20:59
  • That's a more precise description on what I mean (and also some goods chunks of code to get you started). You'll have to edit & adapt to your data but that's a good head start. One thing you'll have to look at carefully is how to sort DATE datatype. I've done it before but I think you have to make your column datatype is right. You might want to set it somewhere in your code to be sure.... – logicOnAbstractions Oct 25 '14 at 01:49