1

This is more out of academic interest than any need for a practical solution - I recently wrote a bit of code that does a bunch of filters/sorts/removal of duplicates in a huge file, but as I was stepping through the code it seems like this loop is the main bottleneck and I want to understand why this is so bad, and why the in built functions like filter and sort seem to work so much more quickly doing more difficult operation.

For i = 2 To nRows
    If (Cells(i, 1) <> Cells(i - 1, 1) Or Cells(i, 2) <> Cells(i - 1, 2)) _
       And (Cells(i, 1) <> Cells(i + 1, 1) Or Cells(i, 2) <> Cells(i + 1, 2)) Then
       Rows(i).EntireRow.Delete
       i = i - 1
       nRows = nRows - 1
    End If
Next i
Community
  • 1
  • 1
Nitin
  • 151
  • 1
  • 10
  • 5
    Sheet calls in VBA are very slow (compared to arrays or such). Your IF statement alone makes 8 sheet calls every time you loop. A much faster solution would be to read all the data into an array - Do all processing filtering etc on the array then output the array back into the sheet. – 99moorem Jun 15 '17 at 15:23
  • @99moorem thanks, I tried this and got a significant speed up. If you were to write the comment as an answer I'll accept (otherwise I'll do the same as a community wiki answer). – Nitin Jun 15 '17 at 17:06
  • https://stackoverflow.com/questions/30959315/excel-vba-performance-1-million-rows-delete-rows-containing-a-value-in-less – Slai Jun 15 '17 at 21:58
  • 1
    One thing, aside from the speed issue - you're deleting rows but working down through the rows which is why you're having to adjust `i` and `nRows` each time you delete a row. The accepted way of deleting rows is to work backwards - `For i = nRows To 2 Step -1` - if a row is deleted it doesn't affect the row numbering of any rows above it. – Darren Bartrup-Cook Jun 16 '17 at 14:30
  • very good point +1 @DarrenBartrup-Cook – 99moorem Jun 16 '17 at 14:35

2 Answers2

2

Sheet calls in VBA are very slow (compared to arrays or such). Your IF statement alone makes 8 sheet calls every time you loop. This line: Rows(i).EntireRow.Delete - Makes 4 sheet calls every time you loop.

A much faster solution would be to read all the data into an array - Do all processing filtering etc on the array then output the array back into the sheet.

There is a detailed answer on this with speed comparisons on SO - but I can not find it. If anyone can please add as comment

99moorem
  • 1,955
  • 1
  • 15
  • 27
0

Well, when you loop through each line, and check logic on each line, there is a lot of overhead. You could perform an operation hundreds, thousands, or even hundreds of thousands of times. When you apply a filter, delete what you don't want, unfilter, and sort, you are really only performing four steps in this kind of process.

ASH
  • 20,759
  • 19
  • 87
  • 200