0

I have a worksheet with ~4,000 rows and 300 columns. For my report, I have to remove a bunch columns and only keep about 50 of them, based on the header (in row 1).

I have the following code (obviously only listing 4 of the 50 columns) but this takes about 40 minutes to run. Is there a way to increase the performance of this?

Sub delete_columns()
    Mylist = Array("ID","Status","First_Name","Last_Name")
    LC = Cells(1, Columns.Count).End(xlToLeft).Column

    For mycol = LC To 1 Step -1
        x = ""
        On Error Resume Next
        x = WorksheetFunction.Match(Cells(1, mycol), Mylist, 0)
        If Not IsNumeric(x) Then Columns(mycol).EntireColumn.Delete
    Next mycol
End sub
Pᴇʜ
  • 56,719
  • 10
  • 49
  • 73
SQLUser
  • 113
  • 8

3 Answers3

3

Collect the columns you want to delete in a variable ColumnsToDelete first and delete all of them at once after the loop. Advantage of that is you have only one delete action (each action takes time) so this is less time consuming. Also you don't need to deactivate screen updating or calculation with this because this is already optimized to run only one update/calculation.

Option Explicit

Public Sub delete_columns()
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Sheet1")  ' adjust your sheet name here!

    Dim ColumnNames As Variant
    ColumnNames = Array("ID", "Status", "First_Name", "Last_Name")

    Dim LastColumn As Long
    LastColumn = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column

    Dim ColumnsToDelete As Range

    Dim iCol As Long
    For iCol = 1 To LastColumn  ' no need for backwards looping if we delete after loop.
        Dim MatchedAt As Double
        MatchedAt = 0

        On Error Resume Next  ' deactivate error reporting
        MatchedAt = WorksheetFunction.Match(ws.Cells(1, iCol), ColumnNames, 0)
        On Error Goto 0  'NEVER forget to re-activate error reporting!

        If MatchedAt > 0 Then
            If ColumnsToDelete Is Nothing Then  ' add first found column
                Set ColumnsToDelete = ws.Columns(iCol).EntireColumn
            Else   ' add all other found columns with union
                Set ColumnsToDelete = Union(ColumnsToDelete, ws.Columns(iCol).EntireColumn)
            End If
        End If
    Next mycol

    ' if columns were found delete them otherwise report
    If Not ColumnsToDelete Is Nothing Then
        ColumnsToDelete.Delete
    Else
        MsgBox "Nothing found to delete."
    End If
End Sub
Pᴇʜ
  • 56,719
  • 10
  • 49
  • 73
  • 1
    Methodically fine & fast approach +) Allow me a hint: IMO OP wants to **keep** the column names listed in `Array("ID", "Status", "First_Name", "Last_Name")`, *not* to *delete* them. Suffices to change the last condition to `If MatchedAt = 0 Then` instead of `... > 0 Then`. - fyi Might be interested in my approach getting column numbers to be maintained in *one go*. @PEH – T.M. Aug 26 '21 at 18:57
0

The first step would be to preface your Subroutine with

Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False

and end it with

Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True

This will mean that Excel doesn't try to recalculate the sheet every time you delete a column, it does it in one fell swoop at the end.

Unfortunately, we are working with Columns here, not Rows — otherwise, I'd suggest using a Filter to drop the Loop. Match can sometimes be a bit slow, so you may want to consider swapping the Array for a Dictionary, or having a Fuction to quickly loop through the Array and search for the value.

Not strictly a speed thing, but using Application.Match instead of WorksheetFunction.Match allows you to streamline your code inside the loop slightly:

If IsError(Application.Match(Cells(1, mycol).Value, Mylist, 0)) Then Columns(mycol).Delete
Chronocidal
  • 6,827
  • 1
  • 12
  • 26
0

Keep only columns occurring in titles array

"For my report, I have to remove a bunch columns and only keep about 50 of them, based on the header (in row 1)."

The slightly shortened code in OP only lists 4 of the 50 headers in array MyList ; thus following MCV E rules

In the following example code I demonstrate a way to approve performance, explained in several steps; in my tests it performed in 0.09 seconds over 3.000 rows (against nearly the same time of 0.10 seconds for @PEH 's methodically fine approach , but which imho should be changed to If MatchedAt = 0 Then instead of > 0 to include the listed columns, not to delete them!)

  • [1] Don't focus on deletion (~250 columns), but get an array of column numbers to be maintained (~4..50 columns); see details at help function getNeededColNums() showing an undocumented use of Application.Match()
  • [2] Hide the found columns to preserve them from eventual deletion
  • [3] Delete all columns left visible in one go using the SpecialCells method
  • [4] Redisplay the hidden columns left untouched

A main reason for the mentioned poor performance in the original post (OP) is that repeated deletion of columns shifts the entire worksheet up to 250 times (i.e. ~75% of titled columns).

A further note to the original post: always use Option Explicit to force variable declarations and fully qualify all range references, e.g. like x = Application.Match(Sheet1.Cells(1, mycol), myList, 0).

Sub ExampleCall()
    Dim t#: t = Timer
'[1]Get array of column numbers to be maintained
    Dim ws As Worksheet: Set ws = Sheet1    ' << reference wanted sheet e.g. by Code(Name)
    Dim cols: cols = getNeededColNums(ws)  '1-based 1-dim array
    Debug.Print Join(cols, ",")
'[2]Hide found columns to preserve them from eventual deletion
    Dim i As Long
    For i = 1 To UBound(cols)
        ws.Columns(cols(i)).Hidden = True
    Next
'[3]Delete columns left visible
    Application.DisplayAlerts = False
    ws.Range("A1", ws.Cells(1, LastCol(ws))).SpecialCells(xlCellTypeVisible).EntireColumn.Delete
    Application.DisplayAlerts = True
'[4]Redisplay untouched hidden columns
    ws.Range("A1", ws.Cells(1, UBound(cols))).EntireColumn.Hidden = False
    
    Debug.Print "**" & Format(Timer - t, "0.00 secs")  ' 0.09 seconds!
End Sub

'Help function getNeededColNums()

Note that Application.Match() doesn't compare only a single argument against a complete list of column titles, but is capable to pass even an array as first argument:

Application.Match(titles, allTitles, 0)

Assuming existing titles, this results in a 1-based array with the same dimension boundaries as the first argument and which returns the found column numbers. So you get valid list without need of further checks (IsNumeric or Not IsError in the late-bound Application form) or even error handling in the WorksheetFunction.

Function getNeededColNums(ws As Worksheet)
'Note: returns 1-based 1-dim array (assuming existant titles)
    Dim titles As Variant
    titles = Array("ID", "Status", "First_Name", "Last_Name")
    'get all existing titles
    Dim allTitles As Variant
    allTitles = ws.Range("1:1").Resize(1, LastCol(ws)).Value2
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'get column numbers to be maintained
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    getNeededColNums = Application.Match(titles, allTitles, 0)
End Function

Help function LastCol()

Function LastCol(ws As Worksheet, Optional rowNum As Long = 1) As Long
'Purp.: return the last column number of a title row in a given worksheet
    LastCol = ws.Cells(rowNum, ws.Columns.Count).End(xlToLeft).Column
End Function
T.M.
  • 9,436
  • 3
  • 33
  • 57