1

I am running the below algorithm on an array of 100k items (I would like to expand it to 1m if possible) and am facing performance issues. I would like to reduce the processing time down to a few seconds. I am pretty sure the first loop is causing the code to slow down. Is it possible to automatically shuffle the array 250 times, to populate a matrix and then simply paste it? - instead of going through 250 separate connections between vba and excel.

For a = 0 To 249        
        For i = UBound(myArray , 1) To 1& Step -1&
                NewRow= Int(Rnd() * i) + 1&
                temp = myArray (i, 1)
                myArray (i, 1) = myArray (NewRow, 1)
                myArray(NewRow, 1) = temp
        Next i            
    Range(Cells(1, a + 1), Cells(LastRow, a + 1)).Value = myArray                     
Next a

any help is appreciated!! Thanks! John

  • 2
    Why are you shuffling 250 times? Implement a [correct shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) and execute it once. Adding iterations doesn't make it "more random". – Blorgbeard Jan 10 '18 at 19:05
  • As you can see I want to populate a 100,000x250 matrix which is why I need to shuffle the array 250 times. The first loop is used to increment the columns of the matrix. – John Analytics Jan 10 '18 at 19:11
  • Oh I see what you're doing now. Yes, you can set up a 2d array and assign it to a range. Example: https://stackoverflow.com/questions/34733061/excel-vba-assign-values-of-2d-array-to-range-of-cells – Blorgbeard Jan 10 '18 at 19:18
  • 1
    This question is best answered by someone (probably you) running some benchmarks and seeing where the actual hotspots are. Without knowing _where_ the time is spent your guesses are probably wrong. –  Jan 10 '18 at 19:20
  • 2
    Taking the cell-writing instruction out of the outer loop would probably speed this up by an order of magnitude already. Populate one single 2D array, drop it onto the worksheet *once*. – Mathieu Guindon Jan 10 '18 at 19:24
  • @Blorgbeard Correct me if I am wrong but you are suggesting that I first create a 100,000x250 array and then shuffle it? – John Analytics Jan 10 '18 at 19:27
  • @jdv each iteration of i takes 0.3seconds i.e. more than a minute for 250 columns and I need to run this simulation several times which is why it is not ideal – John Analytics Jan 10 '18 at 19:30
  • You need to know what in the loop is taking so long, otherwise unrolling selectively may do nothing. Benchmark and keep doing so so you can see the how things change as you tweak the code. Related: https://stackoverflow.com/q/7103552/1531971 –  Jan 10 '18 at 19:33
  • No, I'm suggesting you create a 100000x250 array, populate it via shuffling the source array repeatedly, and then assign it to a range. – Blorgbeard Jan 10 '18 at 19:46
  • @Blorgbeard oh I see yea this would actually be much faster! I am not sure how to implement this though .... Assuming I define the array as myArray = Range("A1:A100000").Value How should I proceed to augment the array from (100000,1) to (100000,250)? Thanks a lot!! – John Analytics Jan 10 '18 at 19:51
  • I'd make a new array `Dim output(100000,250) as Integer` and fill that up, then `Range(whatever) = output` at the end. – Blorgbeard Jan 10 '18 at 20:23
  • @Blorgbeard Populating the new 100000x250 array still requires me to go through two For statements though correct? For x = 1 to 100000 For y = 1 to 250 output(x,y) = 1 (for example) – John Analytics Jan 10 '18 at 21:38
  • Yes, but that's fine. Nested loops are "slow" only because the number of iterations is high. There's no way around the fact that you have to write 100000x250 integers. A single loop from 0 to (100000*250) would be about the same speed as the nested loops. The point is that you'd only only writing to the Excel sheet once - that's the slow bit. – Blorgbeard Jan 10 '18 at 22:18
  • @Blorgbeard makes sense and it already looks like my code runs much faster! Thanks a lot! – John Analytics Jan 10 '18 at 22:26
  • @Mat'sMug I just realized you had suggested the same thing - thanks a lot! – John Analytics Jan 10 '18 at 22:27
  • No problem. May I suggest to get rid of the type hint characters? `1` and `-1` would work exactly the same as `1&` and `-1&`, without the need to look twice to see the literal integer value. Also I don't see a need for reversing that loop at all, why would the bottom-to-top matter, if the thing is *shuffling* (unless I missed something)? Reduce the useless cognitive load on that code, it'll be easier to maintain. – Mathieu Guindon Jan 10 '18 at 22:32
  • @Mat'sMug I set this up for an earlier version of the script and it made sense at the time but you're completely right thanks for the tips! – John Analytics Jan 10 '18 at 22:38

0 Answers0