6

Using Excel and VBA, I wanted some advice on how to best filter data in an array (in the same sort of way one might use a pivot table) strictly using VBA. I am creating a UserForm that is going to make some data decisions based on currently existing data. I can visualize how to do it well enough but am not that versed in VBA programming.

Here is an example

A       B       C
bob     12      Small
sam     16      Large
sally   1346    Large
sam     13      Small
sally   65      Medium
bob     1       Medium

To grab the data in an Array, I could use

Dim my_array As Variant

my_array = Range("A1").CurrentRegion

Now, I am familiar with looping through 2D arrays, but I wondered: what the most effective way to filter 2D array data (without looping through the array time and again)?

For example, how do I get would be to say get this kind of data:

data_for_sally As Variant 'rows with sally as name in ColA
data_for_sally_less_than_ten As Variant ' all rows with sally's name in ColA and colB < 10
data_for_all_mediums as Variant ' all rows where ColC is Medium

Suggestions? I could work this out with a bunch of custom functions and loops but I thought there must be a better way. Thanks.

Jean-François Corbett
  • 37,420
  • 30
  • 139
  • 188
thornomad
  • 6,707
  • 10
  • 53
  • 78
  • Note that the 4th example is not a filter but an operation on the array, which would probably lead to a different answer. – assylias May 04 '12 at 14:22
  • Not sure it's possible without looping / custom functions in VBA. You say you have experience in other languages, have you considered a VSTO/.NET impementation then use LINQ? – lori_m May 04 '12 at 14:59
  • For this type of thing in VBA I would use a disconnected ADO recordset. It gives you sorting and filtering. – Tim Williams May 04 '12 at 15:17
  • @assylias good point, removed that example; .@lori_m i was thinking along the lines of Python, actually; .@Tim Williams i will have to look into that, haven't heard of it before. – thornomad May 04 '12 at 21:37

2 Answers2

5

I assume you want to use VBA only.

I think it depends on several parameters, mainly on:

  • how often you run the same condition => do you store the result of a filter or do you recalculate every time?
  • how often you need to filter stuff => if often, it is worth having a proper code structure in place, if not then a one off loop is clearly the way to go.

From an OO perspective, assuming performance (speed & memory) is not an issue, I would go for the following design (I won't go into the details of the implementation, only give the general idea). Create a class (let's call it imaginatively ArrayFilter) that you could use like this.

Setup the filter

Dim filter As New ArrayFilter
With filter
    .name = "sam"
    .category = "Medium"
    .maxValue = 10
End With

Or

filter.add(1, "sam") 'column 1
filter.add(3, "Medium") 'column 3
filter.addMax(2, 10) 'column 2

Create the filtered data set

filteredArray = getFilteredArray(originalArray, filter)

The getFilteredArray is fairly straightforward to write: you loop over the array checking if the values match the filter and put the valid lines in a new array:

If filter.isValidLine(originalArray, lineNumber) Then 'append to new array

Pros

  • Clean design
  • Reusable, especially with the second version where you use the column number. This can be used to filter any arrays really.
  • Filtering code is in one function that you can test
  • Corollary: avoid duplication of code

Cons

  • Filtering is recalculated every time, even if you use the same filter twice. You can store the results in a Dictionary for example - see below.
  • Memory: every call to the getFilteredArray creates a new array, but not sure how this can be avoided anyway
  • This adds quite a few lines of code, so I would do it only if it helps make the code easier to read / maintain.

ps: If you need to cache the results to improve performance, one way would be to store the results in a dictionary and add some logic to the getFilteredArray function. Note that unless your arrays are really big and/or you run the same filter a lot, this is probably not worth it.

filters.add filter, filteredArray 'filters is a dictionary

That way, when you call getFilteredArray next time, you can do something like this:

For each f in filters
    'Check if all conditions in f and newFilter are the same
    'If they are:
    getFilteredArray = filters(f)
    Exit Function
Next

'Not found in cache: compute the result
assylias
  • 321,522
  • 82
  • 660
  • 783
  • That's an interesting approach - I was hoping that VBA had something built in I could use on the fly. But, re-using an OO approach may have to be the next best thing. Thanks for the detailed reply. Will have to try it out. – thornomad May 04 '12 at 21:39
  • Rereading it, getFilteredArray should probably be a function inside the ArrayFilter class. – assylias May 05 '12 at 07:01
  • What is full form of OO? – Fennekin Nov 05 '21 at 10:19
3

Try this

' credited to ndu
Function Filter2DArray(ByVal sArray, ByVal ColIndex As Long, ByVal FindStr As String, ByVal HasTitle As Boolean)
  Dim tmpArr, i As Long, j As Long, Arr, Dic, TmpStr, Tmp, Chk As Boolean, TmpVal As Double
  On Error Resume Next
  Set Dic = CreateObject("Scripting.Dictionary")
  tmpArr = sArray
  ColIndex = ColIndex + LBound(tmpArr, 2) - 1
  Chk = (InStr("><=", Left(FindStr, 1)) > 0)
  For i = LBound(tmpArr, 1) - HasTitle To UBound(tmpArr, 1)
    If Chk Then
      TmpVal = CDbl(tmpArr(i, ColIndex))
      If Evaluate(TmpVal & FindStr) Then Dic.Add i, ""
    Else
      If UCase(tmpArr(i, ColIndex)) Like UCase(FindStr) Then Dic.Add i, "" 'This finds only exact matches, if you need *FindStr* use:  If UCase(tmpArr(i, ColIndex)) Like UCase("*" & FindStr & "*") Then Dic.Add i, ""

    End If
  Next
  If Dic.Count > 0 Then
    Tmp = Dic.Keys
    ReDim Arr(LBound(tmpArr, 1) To UBound(Tmp) + LBound(tmpArr, 1) - HasTitle, LBound(tmpArr, 2) To UBound(tmpArr, 2))
    For i = LBound(tmpArr, 1) - HasTitle To UBound(Tmp) + LBound(tmpArr, 1) - HasTitle
      For j = LBound(tmpArr, 2) To UBound(tmpArr, 2)
        Arr(i, j) = tmpArr(Tmp(i - LBound(tmpArr, 1) + HasTitle), j)
      Next
    Next
    If HasTitle Then
      For j = LBound(tmpArr, 2) To UBound(tmpArr, 2)
        Arr(LBound(tmpArr, 1), j) = tmpArr(LBound(tmpArr, 1), j)
      Next
    End If
  End If
  Filter2DArray = Arr
End Function
Community
  • 1
  • 1
Đức Thanh Nguyễn
  • 9,127
  • 3
  • 21
  • 27