1

I have seen multiple codes regarding this topic but I can't seem to understand it.

For instance, if I have a column that records people names, I want to record all unique names into the array.

So if I have a column of names

David
Johnathan
Peter
Peter
Peter
Louis
David

I want to utilize VBA to extract unique names out of the column and place it into an array so when I call the array it would return these results

Array[0] = David
Array[1] = Johnathan
Array[2] = Peter
Array[3] = Louis
Sebastian Ong
  • 65
  • 1
  • 2
  • 7
  • An example of source data, results required and what you have tried would be a good start... – Solar Mike Jul 14 '18 at 07:54
  • You can use collection object. https://stackoverflow.com/a/8305733/5829910 – Zsmaster Jul 14 '18 at 07:56
  • 1
    @Zsmaster A `collection` is a poor choice compared to a `Dictionary`. A `Dictionary` has an `Exists` function, which makes checking for existing names much easier, and more importantly, much faster. – ThunderFrame Jul 14 '18 at 08:30
  • 1
    Watch: [Excel VBA Introduction Part 25 - Arrays](https://www.youtube.com/watch?v=h9FTX7TgkpM&list=PLNIs-AWhQzckr8Dgmgb3akx_gFMnpxTN5&index=28) and [Excel VBA Introduction Part 39 - Dictionaries](https://www.youtube.com/watch?v=dND4coLI_B8&list=PLNIs-AWhQzckr8Dgmgb3akx_gFMnpxTN5&index=43) – TinMan Jul 14 '18 at 12:18

5 Answers5

4

Despite a Collection being mentioned and being a possible solution, it is far more efficient to use a Dictionary as it has an Exists method. Then it's just a matter of adding the names to the dictionary if they don't already exist, and then extracting the keys to an array when you're done.

Note that I've made the name comparisons case-sensitive, but you can change that if necessary, to case-insensitive.

Option Explicit

Sub test()

   'Extract all of the names into an array
    Dim values As Variant
    values = Sheet1.Range("Names").Value2 'Value2 is faster than Value

    'Add a reference to Microsoft Scripting Runtime
    Dim dic As Scripting.Dictionary
    Set dic = New Scripting.Dictionary

    'Set the comparison mode to case-sensitive
    dic.CompareMode = BinaryCompare

    Dim valCounter As Long
    For valCounter = LBound(values) To UBound(values)
        'Check if the name is already in the dictionary
        If Not dic.Exists(values(valCounter, 1)) Then
            'Add the new name as a key, along with a dummy value of 0
            dic.Add values(valCounter, 1), 0
        End If
    Next valCounter

    'Extract the dictionary's keys as a 1D array
    Dim result As Variant
    result = dic.Keys

End Sub
ThunderFrame
  • 9,352
  • 2
  • 29
  • 60
  • Why is value2 faster please? – QHarr Jul 14 '18 at 10:25
  • 2
    @QHarr [Probably because it doesn't check the cell format](https://fastexcel.wordpress.com/2011/11/30/text-vs-value-vs-value2-slow-text-and-how-to-avoid-it/) – Ron Rosenfeld Jul 14 '18 at 10:47
  • @RonRosenfeld haha - I was just about to post the same link - it's a great explanation. There's also the [`Value2` help page](https://msdn.microsoft.com/en-us/vba/excel-vba/articles/range-value2-property-excel?f=255&MSPPError=-2147217396). *The only difference between this property and the Value property is that the Value2 property doesn't use the Currency and Date data types.* – ThunderFrame Jul 14 '18 at 11:13
  • @TinMan, when using `Exists` how did you then add the item? Using `Add` or `Item`? In my test of 100k names, they were exactly equal, although I was only using `Timer()` resolution. – ThunderFrame Jul 14 '18 at 12:41
  • Thanks to you all. – QHarr Jul 14 '18 at 14:45
4

use Dictionary object and build a Function that returns your array

Function GetUniqeNames(myRng As Range) As Variant
    Dim cell As Range

    With CreateObject("Scripting.Dictionary") ' instantiate and reference a Dictionary object
        For Each cell In myRng ' loop through passed range
            .Item(cell.Value2) = 1 ' store current cell name into referenced dictionary keys (duplicates will be overwritten)
        Next
    GetUniqeNames = .keys ' write referenced dictionary keys into an array
    End With
End Function

that you can exploit in your main code as follows

Sub main()
    Dim myArray As Variant

    With Worksheets("mysheet") ' change "mysheet" to your actual sheet name
        myArray = GetUniqeNames(.Range("A1", .Cells(.Rows.Count, 1).End(xlUp))) ' this will take the referenced sheet column A range from row 1 down to last not empty one
    End With

End Sub
DisplayName
  • 13,283
  • 2
  • 11
  • 19
  • Reading one cell at a time is very slow - better to read all of the values into an array. Also, better to check the key doesn't exist than to update the value every time. And finally, you haven't been clear about case sensitivity. – ThunderFrame Jul 14 '18 at 08:23
  • OP's didn't say anything about data amount, but using an array will certainly speed things up. While checking for existing keys would be slow, too: never made a comparison but I'm always satisfied with overwriting solution. Finally, as to case sensitivity, OP's data were clean and clear – DisplayName Jul 14 '18 at 08:39
  • I assumed the OP wasn't going to include the entire list, and thought it better to provide a solution that scales well. I did a quick check on 100k names, albeit repeats of the 7 names in the OP. There's no discernible difference in your `Item` approach to my `Exists` approach (I guess .`Item` must check `Exists` behind the scenes - good to know). But the array definitely speeds things up. Results: 100k names using `For each cell in rng` = 0.211 seconds, 100k names using an array from the range = 0.039 seconds. About 5.4 times faster. – ThunderFrame Jul 14 '18 at 09:14
  • thanks for the feedback. since `Item` and `Exists` approaches perform the same, the former leads to less code, which I prefer. as for the array approach, it's definitively faster as expected, though 0.2 secs for 100k names should be affordable... – DisplayName Jul 14 '18 at 09:37
2

Is this a VBA question or a question about programming logic? Use a loop on the column with the data. Check each name against the list of existing data items. If it exists in the list, move on the the next name. If it does not exist in the list, add it.

The "list" is a concept, not a concrete tool. It can be a VBA dictionary, if you are comfortable using that. Or it can be a VBA array, which may not perform as fast as a dictionary, but may be more familiar.

Then again, if you add the data to the Excel Data Model, you can use the Distinct aggregation of a pivot table to list out the unique values.

Without more background it's hard to tell if VBA or Data Model is your best approach. Many VBA solutions get created because people are not aware of Excel's capabilities.

teylyn
  • 34,374
  • 4
  • 53
  • 73
  • Checking each cell, one at a time is slow, particularly for large lists. Also, checking if an item exists in an array is a very expensive operation compared to checking if an item exists in a dictionary. See my answer for an example. – ThunderFrame Jul 14 '18 at 08:26
  • Replace "array" with "dictionary" and the concept is the same. I was trying to provide programming logic that is independent of a particular tool. I'll adjust my answer to reflect that. – teylyn Jul 14 '18 at 08:38
2

You could use Excel functionality like that.

Sub UniqueNames()

Dim vDat As Variant
Dim rg As Range
Dim i As Long

    Set rg = Range("A1:A7")

    rg.RemoveDuplicates Columns:=Array(1), Header:=xlNo
    With ActiveSheet
        vDat = WorksheetFunction.Transpose(.Range("A1:" & .Range("A1").End(xlDown).Address))
    End With

    For i = LBound(vDat) To UBound(vDat)
        Debug.Print vDat(i)
    Next i

End Sub

Code is based on your example data, i.e. I put your data into column 1. But the code will also alter the table. If you do not want that you have to use other solutions or put the data beforehand in a temporary sheet.

Storax
  • 11,158
  • 3
  • 16
  • 33
0

If you dont want to use "Scripting.Dictionary" and your excel does not have Worksheet.unique(...) like mine

Public Function IsInArray(stringToBeFound As String, arr As Variant) As Boolean
    If UBound(arr) >= 0 Then
        IsInArray = Not IsError(Application.Match(stringToBeFound, arr, 0))
    Else
        IsInArray = False
    End If
End Function

Public Function GetUniqueValuesFromColumn(ws As Worksheet, sourceColNum As Long, Optional firstRow As Long = 2) As Variant
    Dim val As String
    Dim i As Long
    Dim arr() As Variant
    arr = Array()
    For i = firstRow To ws.Cells(Rows.Count, sourceColNum).End(xlUp).Row
        val = ws.Cells(i, sourceColNum)
        If Not IsInArray(val, arr) Then
            ReDim Preserve arr(UBound(arr) + 1)
            arr(UBound(arr)) = val
        End If
    Next i
    GetUniqueValuesFromColumn = arr
End Function

Then call it like GetUniqueValuesFromColumn(ThisWorkbook.Worksheets("SomeList"), 1)

Tomas Trdla
  • 1,142
  • 1
  • 11
  • 24