4

At risk of being of topic, I decided to share some code, Q&A-style. If the general opinion is such that this would be off-topic I'll be happy to delete if need be.


Background

Can we retrieve all unique values from any 1D-array, or Range object turned into 1D-array, without having to iterate over its elements? As far as I'm concerned the general consensus is that one has to iterate over the different elements, where the best way to do it would either be a dictionary or collection to store unique values in. Here is what I've found works very well for this purpose.


Question

So how would one go about retrieving unique elements from a 1D-array, for example:

Dim arr As Variant: arr = Array("A", "A", "C", "D", "A", "E", "G")

Where the resulting array would be:

{"A", "C", "D", "E", "G"}
JvdV
  • 70,606
  • 8
  • 39
  • 70
  • 2
    My vote is that this is a valid Q&A. – BigBen Jan 10 '20 at 14:10
  • https://stackoverflow.com/help/self-answer and there is an "Answer your own question – share your knowledge, Q&A-style" checkbox when you ask a question to show that it's fine. As it is technically impossible to find unique values on a typical sequential computer without iterating, I think the title is a bit misleading of what the actual question is. – Slai Jan 12 '20 at 12:06

4 Answers4

6

Really all code needed are just a few lines:

Sub test()

Dim arr As Variant: arr = Array("A", "A", "C", "D", "A", "E", "G")
With Application
    uniques = .Index(arr, 1, Filter(.IfError(.Match(.Transpose(.Evaluate("ROW(1:" & UBound(.Match(arr, arr, 0)) & ")")), .Match(arr, arr, 0), 0), "|"), "|", False))
End With

End Sub

The above will return a 1D-array, returning all unique elements in our original array:

enter image description here


Explaination:

The line that retrieves all these values looks intense, so let's break it into pieces:


enter image description here

Application.Match has the ability to work with arrays within it's parameters. So basically we are looking at: .Match({"A","A","C","D","A","E","G"},{"A","A","C","D","A","E","G"},0). The returned array would then be: {1,1,3,4,1,6,7}, and what this really is are the first positions where each value is found at. This result will be the basis on where we build on further.


enter image description here

We can see a third .Match in our code which we need to basically tell the following: .Match({1,2,3,4,5,6,7},{1,1,3,4,1,6,7},0). The first parameter is what is retrieved by the above higlighted code.

Where .Evaluate("ROW(1:" & UBound(.Match(arr, arr, 0)) & ")") will return an array of values from 1-7, the Application.Transpose will return it such that it's a 1D-array.


enter image description here

The last step will return an array holding errors, however the code won't break since we are using Application instead of WorksheetFunction. The resulting array will look like {1,Error 2042,3,4,Error 2042,6,7}. Now the whole point is to get rid of the Error values.

The way to do so is through Application.IfError, which will evaluate the array and change all error values into a give string value. In our case I used the pipe symbol. It's up to the user to decide on a symbol unique enough it won't appear in any of the elements in the original array. So after evaluation. Our current array will look like: {1,|,3,4,|,6,7}.


enter image description here

Now we retrieved an array with pipe symbols we would want them out! A quick way to do so is with Filter function. Filter returns an array with or without the elements that fit our criteria (depending on the TRUE or FALSE in it's third paramter).

So basically we want to return an array like so: Filter(<array>, "|", False). The resulting 1D-array now looks like: {1,3,4,6,7}.


enter image description here

We kind of have it at this point. We just need to slice out the correct values from our original array. To do so we can use Application.Index. We just want to tell .Index which rows we are interested in. And to do so we can load our previously found 1D-array. So the code will look like: .Index(arr1, <array>, 1) which will result in a 1D-array: {"A","C","D","E","G"}


Conclusion:

There you have it. One line (with more than just a single operation) to retrieve a 1D-array of unique values from another 1D-array without iteration. This code is ready to be used on any 1D-array declared with arr.

Usefull? I'm not 100% sure, but I finally reached what I was trying in my project. The resulting array can be used immediately in whichever task you need to use unique values in.

Comparison: Dictionary vs Application.Methods:

Doing a comparison on random items in the Range(A1:A50000), the performance really takes a hit. Hereby a time-comparison between the iterative Dictionary against the non-iterative Application.Methods approach in 1000 items steps. Below the result of a 1000 items and each 10000 items mark (in seconds):

| Items     | Dictionary    | Methods       |
|-------    |------------   |-------------  |
| 1000      | 0,02          | 0,03          |
| 10000     | 0             | 0,88          |
| 20000     | 0,02          | 3,31          |
| 30000     | 0,02          | 7,3           |
| 40000     | 0,02          | 12,84         |
| 50000     | 0,03          | 20,2          |

The Dictionary approach used:

Sub Test()

Dim arr As Variant: arr = Application.Transpose(Range("A1:A50000"))
Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")

Dim x As Long

For x = LBound(arr) To UBound(arr)
    dict(arr(x)) = 1
Next x

Dim uniques As Variant: uniques = dict.Keys

End Sub

Conclusion: Up to a 1000 items this method would be about equal in processing time compared to a more common Dictionary practice. On anything larger, iteration (through memory) will always beat the methods approach!

I'm sure processing time would be more limited with the new Dynamic Array functions as shown by @ScottCraner.

JvdV
  • 70,606
  • 8
  • 39
  • 70
  • 2
    Good work, and a very clear explanation. You are putting together a nice library of things - perhaps you could put them together somewhere. – SJR Jan 10 '20 at 14:23
  • 3
    IMHO the solution with a dictionary is much clearer and simpler. Even the code in the link you provided could be shortened and would win clarity. So I would not go for this solution here in this post. – Storax Jan 10 '20 at 14:27
  • @Storax, I would have to agree with you 100%. It however wasn't my intention to tell which procedure is better or not. My idea was that the general consensus was that it wasn't possible to retrieve unique values without having to iterate. That being said, once you got this line, all you need to change is `arr` variable. – JvdV Jan 10 '20 at 14:35
  • 1
    @JvdV: Yes, that's right a dictionary solution would need a loop, your's not. – Storax Jan 10 '20 at 14:38
6

With the new Dynamic Array functions it can be simplified to:

Sub test()

Dim arr As Variant: arr = Array("A", "A", "C", "D", "A", "E", "G")
With Application
    Dim uniques as variant
    uniques = .Transpose(.Unique(.Transpose(arr)))
End With

End Sub

enter image description here

The new Uniques Formula needs a vertical array, and it can be 2d. It acts like Range.RemoveDuplicate without the ability to choose columns.

Scott Craner
  • 148,073
  • 10
  • 49
  • 81
  • Againt all odds, this is what I tried at first. Only to realize I'm not lucky enough to yet have access to it. But yes, this is going to kick my examples digital butt as soon as I do =) + For this elegant alternative. – JvdV Jan 10 '20 at 15:04
  • Very glad to see more and more examples of the new functions being put to use. – BigBen Jan 10 '20 at 16:06
  • @ScottCraner FYI Posted an alternative to your Office 365 solution **without** need to transpose *twice* – T.M. Jul 24 '20 at 17:51
2

Approach via FilterXML()

Just to enrich the variety of fine solutions above, I demonstrate an approach via the new worksheet function FilterXML().

Sub testUniqueItems()
'   Purp: list unique items
'   Site: https://stackoverflow.com/questions/59683363/unique-values-from-1d-array-without-iteration
    Dim arr As Variant: arr = Array("A", "A", "C", "D", "A", "E", "G")
'[1]get uniques
    Dim uniques
    uniques = UniqueXML(arr)
'[2]display in Immediate Window: A,A,C,D,A,E,G => A,C,D,E,G
    Debug.Print Join(arr, ",") & " => " & _
                Join(uniques, ",")
End Sub
Function UniqueXML(arr, Optional Delim As String = ",")
  ' Purp: return unique list of array items
  ' Note: optional argument Delim defaulting to colon (",")
  ' Help: https://learn.microsoft.com/de-de/office/vba/api/excel.worksheetfunction.filterxml
  ' [1] get array data to xml node structure (including root element)
    Dim wellformed As String
    wellformed = "<root><i>" & Join(arr, "</i><i>") & "</i></root>"
  ' ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ' [2] define XPath string searching unique item values
  ' Note: c.f. udf: https://stackoverflow.com/questions/58677041/vba-excel-how-to-display-non-equal-values-in-an-excel-array/58685756#58685756
  ' ------------------------------------------------
  ' //i                    ... all <i> node values after the DocumentElement
  ' [not( .=preceding::i)] ... only if not preceded by siblings of the same node value
  ' ------------------------------------------------
    Dim myXPath As String
    myXPath = "//i[not( .=preceding::i)]"
  ' [3a] get (delimiter separated) unique list
    UniqueXML = Evaluate("=TEXTJOIN(""" & Delim & """,,FILTERXML(""" & wellformed & """, """ & myXPath & """))")
  ' [3b] return array
    UniqueXML = Split(UniqueXML, Delim)
End Function

Related links

MS Help

Display non equal values in an Excel array

Caveat

Note that the Worksheet function FilterXML() can be used from vers. 2016+, but TextJoin only in vers. 2019+ (thx @FaneDuru for comments:-)

Furthermore you would have to be aware of the limits of evaluate. 255 chars only (thx @JvDv).

To overcome both obstacles I reworked above function to work in versions 2016+, too.

Modified function /Edit as of 2020-08-20

Function UniqueXML(arr, Optional Delim As String = ",")
  ' Purp: return unique list of array items
  ' Note: optional argument Delim defaulting to colon (",")
  ' Help: https://learn.microsoft.com/de-de/office/vba/api/excel.worksheetfunction.filterxml
  ' [1] get array data to xml node structure (including root element)
    Dim wellformed As String
    wellformed = "<root><i>" & Join(arr, "</i><i>") & "</i></root>"
  ' ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ' [2] define XPath string searching unique item values
  ' Note: c.f. udf: https://stackoverflow.com/questions/58677041/vba-excel-how-to-display-non-equal-values-in-an-excel-array/58685756#58685756
  ' ------------------------------------------------
  ' //i                    ... all <i> node values after the DocumentElement
  ' [not( .=preceding::i)] ... only if not preceded by siblings of the same node value
  ' ------------------------------------------------
    Dim myXPath As String
    myXPath = "//i[not( .=preceding::i)]"
   
  ' [3] get "flat" 1-dim array (~> one-based!)
    Dim tmp As Variant
    tmp = Application.Transpose(WorksheetFunction.FilterXML(wellformed, myXPath))
'  ' [3a] optional redim as zero-based array
'    ReDim Preserve tmp(LBound(tmp) - 1 To UBound(tmp) - 1)
        
  ' [4] return function result
    UniqueXML = tmp
End Function
T.M.
  • 9,436
  • 3
  • 33
  • 57
  • 1
    Nice =), but be well aware of the limits of evaluate. 255 chars only – JvdV Jan 26 '20 at 20:55
  • I tried the above code and `UniqueXML = Evaluate("=TEXTJOIN...` returns `Error 2029` (Excel 2016 Professional Pro - 64 bit)... – FaneDuru Aug 20 '20 at 07:37
  • Thx @FaneDuru, overlooked that `TEXTJOIN` is definitively version 2019+. – T.M. Aug 20 '20 at 16:46
  • Ups... At least I know where the problem is (now...). – FaneDuru Aug 20 '20 at 16:52
  • 1
    @FaneDuru, seems that the single code line `UniqueXML = Application.Transpose(WorksheetFunction.FilterXML(wellformed, myXPath))` replacing sections **`[3a]`** and **`[3b]`** solves the versioning issue; could you verify plz :-; – T.M. Aug 20 '20 at 17:09
  • @FaneDuru - above line should overcome the 255 chars limitation of `Evaluate` as well :-) – T.M. Aug 20 '20 at 17:16
  • 1
    Yes. Tested and it works. You can adapt the code, mentioning that it works in 2016, too. – FaneDuru Aug 20 '20 at 17:16
  • 1
    Even better... :) – FaneDuru Aug 20 '20 at 17:17
2

Applying the Unique() function without double transposition (Office 365)

As addition to @ScottCraner 's Office 365 solution an alternative without need to transpose twice:

Sub testUniques()
Dim arr:     arr = Array("A", "A", "C", "D", "A", "E", "G")  ' example data
Dim uniques: uniques = Application.Unique(arr, True)         ' return function result
'optional display in VB Editor's immediate window
    Debug.Print Join(arr, ",") & " ~> " & Join(uniques, ",") ' A,A,C,D,A,E,G ~> A,C,D,E,G
End Sub

Explanation to additional argument by_col

Due to the Unique function reference its syntax is UNIQUE(array,[by_col],[exactly_once]), where

"the by_col argument is a logical value indicating how to compare. TRUE will compare columns against each other and return the unique columns."

Setting the by_col argument to True allows to compare the array items against each other as they are considered as "columns" in a "flat" 1-dimensional array.

T.M.
  • 9,436
  • 3
  • 33
  • 57