37

I am trying to find a way to filter large data and remove rows in a worksheet, in less than one minute

The goal:

  • Find all records containing specific text in column 1, and delete the entire row
  • Keep all cell formatting (colors, font, borders, column widths) and formulas as they are

.

Test Data:

Test data:

.

How the code works:

  1. It starts by turning all Excel features Off
  2. If the workbook is not empty and the text value to be removed exists in column 1

    • Copies the used range of column 1 to an array
    • Iterates over every value in array backwards
    • When it finds a match:

      • Appends the cell address to a tmp string in the format "A11,A275,A3900,..."
      • If the tmp variable length is close to 255 characters
      • Deletes rows using .Range("A11,A275,A3900,...").EntireRow.Delete Shift:=xlUp
      • Resets tmp to empty and moves on to the next set of rows
  3. At the end, it turns all Excel features back On

.

The main issue is the Delete operation, and total duration time should be under one minute. Any code-based solution is acceptable as long as it performs under 1 minute.

This narrows the scope to very few acceptable answers. The answers already provided are also very short and easy to implement. One performs the operation in about 30 seconds, so there is at least one answer that provides an acceptable solution, and other may find it useful as well

.

My main initial function:

Sub DeleteRowsWithValuesStrings()
    Const MAX_SZ As Byte = 240

    Dim i As Long, j As Long, t As Double, ws As Worksheet
    Dim memArr As Variant, max As Long, tmp As String

    Set ws = Worksheets(1)
    max = GetMaxCell(ws.UsedRange).Row
    FastWB True:    t = Timer

    With ws
        If max > 1 Then
            If IndexOfValInRowOrCol("Test String", , ws.UsedRange) > 0 Then
                memArr = .Range(.Cells(1, 1), .Cells(max, 1)).Value2
                For i = max To 1 Step -1

                    If memArr(i, 1) = "Test String" Then
                        tmp = tmp & "A" & i & ","
                        If Len(tmp) > MAX_SZ Then
                           .Range(Left(tmp, Len(tmp) - 1)).EntireRow.Delete Shift:=xlUp
                           tmp = vbNullString

                        End If
                    End If

                Next
                If Len(tmp) > 0 Then
                    .Range(Left(tmp, Len(tmp) - 1)).EntireRow.Delete Shift:=xlUp
                End If
                .Calculate
            End If
        End If
    End With
    FastWB False:   InputBox "Duration: ", "Duration", Timer - t
End Sub

Helper functions (turn Excel features off and on):

Public Sub FastWB(Optional ByVal opt As Boolean = True)
    With Application
        .Calculation = IIf(opt, xlCalculationManual, xlCalculationAutomatic)
        .DisplayAlerts = Not opt
        .DisplayStatusBar = Not opt
        .EnableAnimations = Not opt
        .EnableEvents = Not opt
        .ScreenUpdating = Not opt
    End With
    FastWS , opt
End Sub

Public Sub FastWS(Optional ByVal ws As Worksheet = Nothing, _
                  Optional ByVal opt As Boolean = True)
    If ws Is Nothing Then
        For Each ws In Application.ActiveWorkbook.Sheets
            EnableWS ws, opt
        Next
    Else
        EnableWS ws, opt
    End If
End Sub

Private Sub EnableWS(ByVal ws As Worksheet, ByVal opt As Boolean)
    With ws
        .DisplayPageBreaks = False
        .EnableCalculation = Not opt
        .EnableFormatConditionsCalculation = Not opt
        .EnablePivotTable = Not opt
    End With
End Sub

Finds last cell with data (thanks @ZygD - now I tested it in several scenarios):

Public Function GetMaxCell(Optional ByRef rng As Range = Nothing) As Range

    'Returns the last cell containing a value, or A1 if Worksheet is empty

    Const NONEMPTY As String = "*"
    Dim lRow As Range, lCol As Range

    If rng Is Nothing Then Set rng = Application.ActiveWorkbook.ActiveSheet.UsedRange
    If WorksheetFunction.CountA(rng) = 0 Then
        Set GetMaxCell = rng.Parent.Cells(1, 1)
    Else
        With rng
            Set lRow = .Cells.Find(What:=NONEMPTY, LookIn:=xlFormulas, _
                                        After:=.Cells(1, 1), _
                                        SearchDirection:=xlPrevious, _
                                        SearchOrder:=xlByRows)
            If Not lRow Is Nothing Then
                Set lCol = .Cells.Find(What:=NONEMPTY, LookIn:=xlFormulas, _
                                            After:=.Cells(1, 1), _
                                            SearchDirection:=xlPrevious, _
                                            SearchOrder:=xlByColumns)

                Set GetMaxCell = .Parent.Cells(lRow.Row, lCol.Column)
            End If
        End With
    End If
End Function

Returns the index of a match in the array, or 0 if a match is not found:

Public Function IndexOfValInRowOrCol( _
                                    ByVal searchVal As String, _
                                    Optional ByRef ws As Worksheet = Nothing, _
                                    Optional ByRef rng As Range = Nothing, _
                                    Optional ByRef vertical As Boolean = True, _
                                    Optional ByRef rowOrColNum As Long = 1 _
                                    ) As Long

    'Returns position in Row or Column, or 0 if no matches found

    Dim usedRng As Range, result As Variant, searchRow As Long, searchCol As Long

    result = CVErr(9999) '- generate custom error

    Set usedRng = GetUsedRng(ws, rng)
    If Not usedRng Is Nothing Then
        If rowOrColNum < 1 Then rowOrColNum = 1
        With Application
            If vertical Then
                result = .Match(searchVal, rng.Columns(rowOrColNum), 0)
            Else
                result = .Match(searchVal, rng.Rows(rowOrColNum), 0)
            End If
        End With
    End If
    If IsError(result) Then IndexOfValInRowOrCol = 0 Else IndexOfValInRowOrCol = result
End Function

.

Update:

Tested 6 solutions (3 tests each): Excel Hero's solution is the fastest so far (removes formulas)

.

Here are the results, fastest to the slowest:

.

Test 1. Total of 100,000 records, 10,000 to be deleted:

1. ExcelHero()                    - 1.5 seconds

2. DeleteRowsWithValuesNewSheet() - 2.4 seconds

3. DeleteRowsWithValuesStrings()  - 2.45 minutes
4. DeleteRowsWithValuesArray()    - 2.45 minutes
5. QuickAndEasy()                 - 3.25 minutes
6. DeleteRowsWithValuesUnion()    - Stopped after 5 minutes

.

Test 2. Total of 1 million records, 100,000 to be deleted:

1. ExcelHero()                    - 16 seconds (average)

2. DeleteRowsWithValuesNewSheet() - 33 seconds (average)

3. DeleteRowsWithValuesStrings()  - 4 hrs 38 min (16701.375 sec)
4. DeleteRowsWithValuesArray()    - 4 hrs 37 min (16626.3051757813 sec)
5. QuickAndEasy()                 - 5 hrs 40 min (20434.2104492188 sec)
6. DeleteRowsWithValuesUnion()    - N/A

.

Notes:

  1. ExcelHero method: easy to implement, reliable, extremely fast, but removes formulas
  2. NewSheet method: easy to implement, reliable, and meets the target
  3. Strings method: more effort to implement, reliable, but doesn't meet requirement
  4. Array method: similar to Strings, but ReDims an array (faster version of Union)
  5. QuickAndEasy: easy to implement (short, reliable and elegant), but doesn't meet requirement
  6. Range Union: implementation complexity similar to 2 and 3, but too slow

I also made the test data more realistic by introducing unusual values:

  • empty cells, ranges, rows, and columns
  • special characters, like =[`~!@#$%^&*()_-+{}[]\|;:'",.<>/?, separate and multiple combinations
  • blank spaces, tabs, empty formulas, border, font, and other cell formatting
  • large and small numbers with decimals (=12.9999999999999 + 0.00000000000000001)
  • hyperlinks, conditional formatting rules
  • empty formatting inside and outside data ranges
  • anything else that might cause data issues
Community
  • 1
  • 1
paul bica
  • 10,557
  • 4
  • 23
  • 42
  • 2
    I started looking at your thread from the end - `GetMaxCell` function. One thing to mention - you should remove the dot: `Set GetMaxCell = .Cells(lRow.row, lCol.Column)` should become `Set GetMaxCell = Cells(lRow.row, lCol.Column)`, because you would have inaccurate result if e.g. the whole column `A:A` was empty. However, now I see, that the whole `GetMaxCell` function is useless here? It's **not related** to `deleteRowsWithValuesStrings`! – ZygD Jun 21 '15 at 00:42
  • @ZygD: Thanks. I removed the call to GetMaxCell during testing and forgot to put it back, but I'll update it. Also, I analysed it specifically for reliability - the dot reference is related to the entire area, not just the first column. It starts from the first cell and uses xlPrevious, which makes it wrap around to the last cell of the area, and then keeps moving backwards to the first cell with data – paul bica Jun 21 '15 at 01:04
  • 1
    I'm not sure you understood what I meant. In your current situation, from your data I see that there is no difference between the option with dot and without. However, if you ever decide to use `GetMaxCell` in another project, you should have the version without dot. In my test sheet I had data ranging from B2 to E4 (A:A was empty, 1:1 was empty). The result of the function with the dot was the cell F5, which is obviously incorrect. After removing the dot the result was correct - E4. – ZygD Jun 21 '15 at 01:15
  • 1
    Also, you miss `Set` before `ws =` and `rng =`. – ZygD Jun 21 '15 at 01:16
  • a simple change that makes a reasonable difference is to make a range of the union of all the rows you want to delete then delete them all at once (so instead of your line with .delete now, do something like `if deleteRng is nothing then Set deleteRng = rng Else Set deleteRng = Union(deleteRng, rng.EntireRow)` then after your loop do the delete (deleteRng.delete). – Cor_Blimey Jun 21 '15 at 12:47
  • I think this question really belongs on [Code Review](http://codereview.stackexchange.com/questions/tagged/vba) because the provided code works correctly and it is only the performance which needs to be improved – barrowc Jun 21 '15 at 21:27
  • I'm voting to close this question as off-topic because it belongs on the Code Review StackExchange site – barrowc Jun 21 '15 at 21:28
  • @barrowc I tend to disagree with the off-topic suggestion: SO has a "performance" tag, and also: this is not related to reviewing the code but with finding better solutions to a common problem: removing specific data from large files, and overcoming performance issues regardless of how it is accomplished (code). The activity and feedback on the question show some interest in the topic, so I'd sugest consensus from community before making a decision – paul bica Jun 21 '15 at 22:16
  • The closest Meta question I could find on the issue was [this one](http://meta.stackoverflow.com/q/270427/2127508). It's fair to say that there isn't unanimous agreement on the issue of what belongs on SO vs. what belongs on CR – barrowc Jun 21 '15 at 23:55
  • 1
    @barrowc: This is a grey area indeed. Then to clarify my intent: I am not looking to improve my code, but to find a generic best approach (any code) that addresses the need to filter large data, regardless of the implementation - it can be a new creative idea, or just basic logic, as long as it solves the problem "_the answer arrives too late to be useful_" as mentioned in your link. So it is more related to finding the solution ("_how do I do this_") than "I want to make my algorithm better" - any other algorithm that provides the best solution is acceptable – paul bica Jun 22 '15 at 00:44
  • @paulbica Can you send me the workbook pictured above? I'd love to take a look at it. My email address is : daniel.ferry@gmail.com – Excel Hero Oct 16 '15 at 15:39
  • @ExcelHero I just sent it. Let me know if you need details – paul bica Oct 16 '15 at 23:23
  • Would using PowerQuery be useful in this situation? I'm thinking about looking at the performance on large data when it's in memory and not in the worksheet. – Rick Henderson Jun 23 '17 at 14:47
  • @RickHenderson - it might be; I didn't use it but from what I've read it can handle large data quite well – paul bica Jun 23 '17 at 16:40
  • A bit late but found this very useful. Just wish we had more like this!.. `AdvanceFilter` is a great approach @ExcelHero – Zac Nov 08 '19 at 16:51

5 Answers5

16

I'm providing the first answer as a reference

Others may find it useful, if there are no other options available

  • Fastest way to achieve the result is not to use the Delete operation
  • Out of 1 million records it removes 100,000 rows in an average of 33 seconds

.

Sub DeleteRowsWithValuesNewSheet()  '100K records   10K to delete
                                    'Test 1:        2.40234375 sec
                                    'Test 2:        2.41796875 sec
                                    'Test 3:        2.40234375 sec
                                    '1M records     100K to delete
                                    'Test 1:        32.9140625 sec
                                    'Test 2:        33.1484375 sec
                                    'Test 3:        32.90625   sec
    Dim oldWs As Worksheet, newWs As Worksheet, rowHeights() As Long
    Dim wsName As String, t As Double, oldUsedRng As Range

    FastWB True:    t = Timer

    Set oldWs = Worksheets(1)
    wsName = oldWs.Name

    Set oldUsedRng = oldWs.Range("A1", GetMaxCell(oldWs.UsedRange))

    If oldUsedRng.Rows.Count > 1 Then                           'If sheet is not empty
        Set newWs = Sheets.Add(After:=oldWs)                    'Add new sheet
        With oldUsedRng
            .AutoFilter Field:=1, Criteria1:="<>Test String"
            .Copy                                               'Copy visible data
        End With
        With newWs.Cells
            .PasteSpecial xlPasteColumnWidths
            .PasteSpecial xlPasteAll                            'Paste data on new sheet
            .Cells(1, 1).Select                                 'Deselect paste area
            .Cells(1, 1).Copy                                   'Clear Clipboard
        End With
        oldWs.Delete                                            'Delete old sheet
        newWs.Name = wsName
    End If
    FastWB False:   InputBox "Duration: ", "Duration", Timer - t
End Sub

.

At high level:

  • It creates a new worksheet, and keeps a reference to the initial sheet
  • AutoFilters column 1 on the searched text: .AutoFilter Field:=1, Criteria1:="<>Test String"
  • Copies all (visible) data from initial sheet
  • Pastes column widths, formats, and data to the new sheet
  • Deletes initial sheet
  • Renames the new sheet to the old sheet name

It uses the same helper functions posted in the question

The 99% of the duration is used by the AutoFilter

.

There are a couple limitations I found so far, the first can be addressed:

  1. If there are any hidden rows on the initial sheet, it unhides them

    • A separate function is needed to hide them back
    • Depending on implementation, it might significantly increase duration
  2. VBA related:

    • It changes the Code Name of the sheet; other VBA referring to Sheet1 will be broken (if any)
    • It deletes all VBA code associated with the initial sheet (if any)

.

A few notes about using large files like this:

  • The binary format (.xlsb) reduce file size dramatically (from 137 Mb to 43 Mb)
  • Unmanaged Conditional Formatting rules can cause exponential performance issues

    • The same for Comments, and Data validation
  • Reading file or data from network is much slower than working with a locall file

paul bica
  • 10,557
  • 4
  • 23
  • 42
  • 3
    AutoFilter seems like the best approach, good call. You must have a powerful computer if you can even open a sheet with 1m rows. You could change the codename back using the VBE object model. It requires that "Access to the VBA object model" be enabled in the front end, so would only work for computers you have control of. – Doug Glancy Jun 21 '15 at 00:36
  • @DougGlancy: AutoFilter works well; I expected the copy / paste to take longer, but it's faster than the filter. My machine is about 6 years old actually - I got it back in 2009, but it's an I7 with 9 Gb mem. VBE would be the only way to fix the codename, but I wouldn't impose it on others – paul bica Jun 21 '15 at 00:53
  • This is the process I've used in the past and it is certainly much faster. – n8. Nov 13 '15 at 23:21
  • It seems that after applying AutoFilter, and then .copy seems still copied everthing, not just visible data, if I try .SpecialCells(xlCellTypeVisible).Copy, it shows the run-time error message "Excel cannot create or use the data range reference because it is too complex", similar to that if I use Auto Filter and then try to copy the filtered data manually. I got about 50k records, and about half of them need to be deleted btw. – Passer-by Jun 27 '20 at 19:32
14

A significant gain in speed can be achieved if the source data do not contain formulas, or if the scenario would allow (or want) the formulas to be converted into hard values during the conditional row deletions.

With the above as a caveat, my solution uses the AdvancedFilter of the range object. It's about twice as fast as DeleteRowsWithValuesNewSheet().

Public Sub ExcelHero()
    Dim t#, crit As Range, data As Range, ws As Worksheet
    Dim r&, fc As Range, lc As Range, fr1 As Range, fr2 As Range
    FastWB True
    t = Timer

        Set fc = ActiveSheet.UsedRange.Item(1)
        Set lc = GetMaxCell
        Set data = ActiveSheet.Range(fc, lc)
        Set ws = Sheets.Add
        With data
            Set fr1 = data.Worksheet.Range(fc, fc.Offset(, lc.Column))
            Set fr2 = ws.Range(ws.Cells(fc.Row, fc.Column), ws.Cells(fc.Row, lc.Column))
            With fr2
                fr1.Copy
                .PasteSpecial xlPasteColumnWidths: .PasteSpecial xlPasteAll
                .Item(1).Select
            End With
            Set crit = .Resize(2, 1).Offset(, lc.Column + 1)
            crit = [{"Column 1";"<>Test String"}]
            .AdvancedFilter xlFilterCopy, crit, fr2
            .Worksheet.Delete
        End With

    FastWB False
    r = ws.UsedRange.Rows.Count
    Debug.Print "Rows: " & r & ", Duration: " & Timer - t & " seconds"
End Sub
Excel Hero
  • 14,253
  • 4
  • 33
  • 40
  • Excellent work! AdvancedFilter seems to be at least twice as fast as AutoFilter. I'll update the summary at the top to point it out – paul bica Oct 17 '15 at 23:02
  • on your `crit = [{"Column 1";"<>Test String"}]` line, how would I modify it to filter for multiple criteria? – W-hit Feb 25 '21 at 19:34
  • I really do appreciate this code but I gotta say that it's hard to read exactly what you are doing here with the several ranges. I managed to figure it out and changed it a bit. Either way. This should be the chosen answer. Perhaps the formula limitation is the reason why it wasn't chosen. – DanCue Jul 26 '23 at 18:10
  • I would just edit your code to add the sheet after the sheet you are replacing and then change the name on the sheet to match the one you deleted. – DanCue Jul 26 '23 at 18:11
6

On my elderly Dell Inspiron 1564 (Win 7 Office 2007) this:

Sub QuickAndEasy()
    Dim rng As Range
    Set rng = Range("AA2:AA1000001")
    Range("AB1") = Now
    Application.ScreenUpdating = False
        With rng
            .Formula = "=If(A2=""Test String"",0/0,A2)"
            .Cells.SpecialCells(xlCellTypeFormulas, xlErrors).EntireRow.Delete
            .Clear
        End With
    Application.ScreenUpdating = True
    Range("AC1") = Now
End Sub

took about 10 seconds to run. I am assuming that column AA is available.

EDIT#1:

Please note that this code does not set Calculation to Manual. Performance will improve if the Calculation mode is set to Manual after the "helper" column is allowed to calculate.

Gary's Student
  • 95,722
  • 10
  • 59
  • 99
  • This is a good idea (+1). A small glitch: if there are cells with errors in the original range, they will be deleted. – Ioannis Jun 21 '15 at 15:34
  • @Ioannis Thank you for the feedback.........as for the glitch, I assumed that column **AA** was completely empty and my code ignores errors in the other columns. – Gary's Student Jun 21 '15 at 15:55
  • Great approach. +1 for simplicity and elegance! However, I'm not sure what data, and how much of it you tested to get 10 seconds, because I copied and pasted your code and it's been running now for more than 10 minutes... I still think the deletion of rows is the problem – paul bica Jun 21 '15 at 21:49
  • @paulbica Thanks for the feedback..............you are correct, **the only good way to compare program performance is to run them on the same dataset on the same computer.** – Gary's Student Jun 21 '15 at 22:10
  • Thanks Gary - I really like the solution and intend on finding ways to use it (the code is still running) The only change I made was to remove the last 900K of rows, so it's only using 100K. My data also has some formulas with empty values (=""). Not sure if that makes a difference – paul bica Jun 21 '15 at 22:24
  • It just completed, and I know what happened: I removed 900K rows but didn't update your range, so it did it for 1 Million (took almost 1 hour). I did it again with 100K and it completed in 26.04 seconds in 3 tests (avg) – paul bica Jun 21 '15 at 22:43
  • @paulbica See my **EDIT#1** Once the "helper" column is in place, you can set the Calculation mode to Manual. This may help you gain some speed. – Gary's Student Jun 22 '15 at 00:20
  • Thanks - turning off calculation during execution will never harm anything. In my case, I only have those simple formulas so it may not improve much. The main issue still remains with "EntireRow.Delete" - it takes forever. But its simplicity makes it beneficial for all normal situations. – paul bica Jun 22 '15 at 01:01
1

I know I'm incredibly late with my answer here however future visitors may find it very useful.

Please Note: My approach requires an index column for the rows to end up in the original order, however if you do not mind the rows being in a different order then an index column isn't needed and the additional line of code can be removed.

My approach: My approach was to simply select all the rows in the selected range (column), sort them in ascending order using Range.Sort and then collecting the first and last index of "Test String" within the selected range (column). I then create a range from the first and last indices and use Range.EntrieRow.Delete to remove all the rows which contain "Test String".

Pros:
- It is blazing fast.
- It doesn't remove formatting, formulas, charts, pictures or anything like the method which copies to a new sheet.

Cons:
- A decent size of code to implement however it is all straight-forward.

Test Range Generation Sub:

Sub DevelopTest()
    Dim index As Long
    FastWB True
    ActiveSheet.UsedRange.Clear
    For index = 1 To 1000000 '1 million test
        ActiveSheet.Cells(index, 1).Value = index
        If (index Mod 10) = 0 Then
            ActiveSheet.Cells(index, 2).Value = "Test String"
        Else
            ActiveSheet.Cells(index, 2).Value = "Blah Blah Blah"
        End If
    Next index
    Application.StatusBar = ""
    FastWB False
End Sub

Filter And Delete Rows Sub:

Sub DeleteRowFast()
    Dim curWorksheet As Worksheet 'Current worksheet vairable

    Dim rangeSelection As Range   'Selected range
    Dim startBadVals As Long      'Start of the unwanted values
    Dim endBadVals As Long        'End of the unwanted values
    Dim strtTime As Double        'Timer variable
    Dim lastRow As Long           'Last Row variable
    Dim lastColumn As Long        'Last column variable
    Dim indexCell As Range        'Index range start
    Dim sortRange As Range        'The range which the sort is applied to
    Dim currRow As Range          'Current Row index for the for loop
    Dim cell As Range             'Current cell for use in the for loop

    On Error GoTo Err
        Set rangeSelection = Application.InputBox("Select the (N=) range to be checked", "Get Range", Type:=8)    'Get the desired range from the user
        Err.Clear

    M1 = MsgBox("This is recommended for large files (50,000 or more entries)", vbYesNo, "Enable Fast Workbook?") 'Prompt the user with an option to enable Fast Workbook, roughly 150% performace gains... Recommended for incredibly large files
    Select Case M1
        Case vbYes
            FastWB True  'Enable fast workbook
        Case vbNo
            FastWB False 'Disable fast workbook
    End Select

    strtTime = Timer     'Begin the timer

    Set curWorksheet = ActiveSheet
    lastRow = CLng(rangeSelection.SpecialCells(xlCellTypeLastCell).Row)
    lastColumn = curWorksheet.Cells(1, 16384).End(xlToLeft).Column

    Set indexCell = curWorksheet.Cells(1, 1)

    On Error Resume Next

    If rangeSelection.Rows.Count > 1 Then 'Check if there is anything to do

        lastVisRow = rangeSelection.Rows.Count

        Set sortRange = curWorksheet.Range(indexCell, curWorksheet.Cells(curWorksheet.Rows(lastRow).Row, 16384).End(xlToLeft)) 'Set the sort range

        sortRange.Sort Key1:=rangeSelection.Cells(1, 1), Order1:=xlAscending, Header:=xlNo 'Sort by values, lowest to highest

        startBadVals = rangeSelection.Find(What:="Test String", LookAt:=xlWhole, MatchCase:=False).Row
        endBadVals = rangeSelection.Find(What:="Test String", LookAt:=xlWhole, SearchDirection:=xlPrevious, MatchCase:=False).Row

        curWorksheet.Range(curWorksheet.Rows(startBadVals), curWorksheet.Rows(endBadVals)).EntireRow.Delete 'Delete uneeded rows, deleteing in continuous range blocks is quick than seperated or individual deletions.

        sortRange.Sort Key1:=indexCell, Order1:=xlAscending, Header:=xlNo 'Sort by index instead of values, lowest to highest
    End If

    Application.StatusBar = ""                    'Reset the status bar

    FastWB False                                  'Disable fast workbook

    MsgBox CStr(Round(Timer - strtTime, 2)) & "s" 'Display duration of task

Err:
    Exit Sub

End Sub

THIS CODE USES FastWB, FastWS AND EnableWS BY Paul Bica!

Times at 100K entries (10k to be removed, FastWB True):
1. 0.2 seconds.
2. 0.2 seconds.
3. 0.21 seconds.
Avg. 0.2 seconds.

Times at 1 million entries (100k to be removed, FastWB True):
1. 2.3 seconds.
2. 2.32 seconds.
3. 2.3 seconds.
Avg. 2.31 seconds.

Running on: Windows 10, iMac i3 11,2 (From 2010)

EDIT
This code was originally designed with the purpose of filtering out numeric values outside of a numeric range and has been adapted to filter out "Test String" so some of the code may be redundant.

0

Your use of arrays in calculating the used range and row count may effect the performance. Here's another approach which in testing proves efficient across 1m+ rows of data - between 25-30 seconds. It doesn't use filters so will delete rows even if hidden. Deleting a whole row won't effect formatting or column widths of the other remaining rows.

  1. First, check if the ActiveSheet has "Test String". Since you're only interested in Column 1 I used this:

    TCount = Application.WorksheetFunction.CountIf(sht.Columns(1), "Test String")
    If TCount > 0 Then
    
  2. Instead of using your GetMaxCell() function I simply used Cells.SpecialCells(xlCellTypeLastCell).Row to get the last row:

    EndRow = sht.Cells.SpecialCells(xlCellTypeLastCell).Row
    
  3. Then loop through the rows of data:

    While r <= EndRow
    
  4. To test if the cell in Column 1 is equal to "Test String":

    If sht.Cells(r, 1).Text) = "Test String" Then
    
  5. To delete the row:

    Rows(r).Delete Shift:=xlUp
    

Putting it all together full code below. I've set ActiveSheet to a variable Sht and added turned of ScreenUpdating to improve efficiency. Since it's a lot of data I make sure to clear variables at the end.

Sub RowDeleter()
    Dim sht As Worksheet
    Dim r As Long
    Dim EndRow As Long
    Dim TCount As Long
    Dim s As Date
    Dim e As Date

    Application.ScreenUpdating = True
    r = 2       'Initialise row number
    s = Now     'Start Time
    Set sht = ActiveSheet
    EndRow = sht.Cells.SpecialCells(xlCellTypeLastCell).Row

    'Check if "Test String" is found in Column 1
    TCount = Application.WorksheetFunction.CountIf(sht.Columns(1), "Test String")
    If TCount > 0 Then

        'loop through to the End row
        While r <= EndRow
            If InStr(sht.Cells(r, 1).Text, "Test String") > 0 Then
                sht.Rows(r).Delete Shift:=xlUp
                r = r - 1
            End If
            r = r + 1
        Wend
    End If
    e = Now  'End Time
    D = (Hour(e) * 360 + Minute(e) * 60 + Second(e)) - (Hour(s) * 360 + Minute(s) * 60 + Second(s))
    Application.ScreenUpdating = True
    DurationTime = TimeSerial(0, 0, D)
    MsgBox Format(DurationTime, "hh:mm:ss")
End Sub
  • Thank you for the feedback Andrew. 1. Always a good idea to check for the value at start 2, There are some issues trying to accurately [determine the last cell](http://stackoverflow.com/questions/11169445/error-in-finding-last-used-cell-in-vba) with conventional methods 3. and 4. Interacting with the range in a loop is one of the main performance issues in VBA, especially for very large ranges 5. Try manually deleting 200,000 rows from a 500K sheet - it's extremely slow – paul bica Jun 21 '15 at 22:00
  • This has crashed my computer and runs incredibly slow. – Conor Apr 12 '17 at 14:21