0

Are Collections in vba not as efficient as arrays when it comes to long lists of strings?

My vba-Tool is not as fast as i want it to be. I use a lot of collections because i don't have to REDIM and also i don't have to use additionally counting-variables.

For example (I want to unite the array a and the collection col in one list, but the tricky part is, that to every array element, there are a certain number of col-elements):

For i = 1 To col.count
        colSave.Add "==========================="
        colSave.Add a(i - 1)
        colSave.Add "==========================="
        For k = 1 To colFilter.Item(i).count
            colSave.Add col.Item(i).Item(k)
        Next k
Next i

Is more efficient to use an array in this case with a third counting variable?

Community
  • 1
  • 1
TimRicta
  • 135
  • 1
  • 2
  • 8
  • 1
    It all depends upon what you are trying to accomplish. For example, it is much easier to move blocks of cells to and from arrays. – Gary's Student Feb 28 '18 at 13:43
  • 3
    [Related question, also includes Dictionaries](https://stackoverflow.com/questions/32479842/comparison-of-dictionary-collections-and-arrays) – AntiDrondert Feb 28 '18 at 13:48
  • 1
    One factor is how the collection or array is created. Reading a range of cells into an array is very efficient with the built in method. Creating a collection then populating it is probably less efficient, again depending on the method used. The most inefficient aspect of strings is their immutability, if you're concatenating you'll probably want to make a string builder. I've posted one on my site here: (warning to be clear: this is my own site, no advertising intended!) https://excelpro.tech/free-downloads.php – Absinthe Feb 28 '18 at 14:01
  • Why is it much easier to move blocks of cells to and from arrays than arrays? – TimRicta Feb 28 '18 at 14:01
  • 1
    When in doubt -- benchmark. – John Coleman Feb 28 '18 at 14:15
  • @Absinthe linking to your own site in the comments when relevant is fine. That seems like an interesting site. – John Coleman Feb 28 '18 at 14:19
  • 1
    A key question is if you need to dynamically grow your data structure. If you find yourself using `ReDim Preserve` in a loop -- switch to a collection ASAP. Otherwise, stick to an array unless you have good reasons not to. In some cases you might want to consider using an `ArrayList` from VB.Net (but usable in VBA via `CreateObject()`) – John Coleman Feb 28 '18 at 14:25
  • 1
    @John Coleman, the most efficient way is always the array. A Collection uses and grows multiple arrays internally. Thus the choice should depend on the number of entries and not if it needs to grow. – Florent B. Feb 28 '18 at 14:47
  • 1
    Looking at the code in your edit, if you wanted to merge an array of string arrays rather than a collection of string collections, you can *quickly* (bu using `UBound()` appropriately) determine ahead of time how big the final array needs to be and `Dim` it just once rather than using `ReDim Preserve`. One advantage of sticking with arrays is that you can use `Join` on arrays if you want to e.g. write it all to a text file in 1 line of code. – John Coleman Feb 28 '18 at 14:47
  • @JohnColeman Thanks, it's a work in progress but I'm hoping to grow it as i find useful info that's not well known. Good point about growing the arrays. – Absinthe Feb 28 '18 at 14:51
  • @FlorentB. `Collection.Add` is `O(1)` (albeit not in a super-efficient way) but a naive use of `ReDim Preserve` is `O(n)`, so I am not sure what you mean when you say that a need to grow the data isn't relevant in the decision. The accepted answer to this question suggests that the implementation of a VBA collection is a weird cross between a doubly-linked list and a hash table: https://stackoverflow.com/q/2548526/4996248 – John Coleman Feb 28 '18 at 15:06
  • 1
    @John Coleman, `Collection.Add` is O(1) but ` Collection.Item` is always O(n) (unless a key is is provided). With an array you can simply double the size when required to keep a complexity of O(1) for the getter and setter. Note that I wrongly assumed that the inner structure was based on an hashtable, which is as you pointed out a double linked list when the key is not provided. Bottom line is that a collection should not be used to store a large amount of data and should not access entries by index. – Florent B. Feb 28 '18 at 15:59
  • 1
    @FlorentB. I agree that arrays are almost always better. Collections are basically stunted dictionaries. The only real use case that I see is when you are building up something item by item and only intend to ever use them in a `for each`. – John Coleman Feb 28 '18 at 16:05

1 Answers1

2

Probably the most efficient way is to list the strings in cells on a worksheet then read the range of those cells into an array. This is a very quick method (ranges of 100k cells read in milliseconds on a reasonably fast PC):

Sub test()

Dim a() As Variant

a = Range("A1:A1000").Value

End Sub

a will now contain those strings.

Note that this method produces a multidimensional base 1 array, not base zero, so for example the first string in the above example would be at index 1,1.

Absinthe
  • 3,258
  • 6
  • 31
  • 70