1

I have about 750 excel files. I need to clean them so that they all contain same formatting - i.e. so they all contain same number of columns.

Some files (80%) contain extra columns containing labels with an asterisk e.g. "*1 subject".

Is there a way using visual basic to go through all of the files in my folder to delete all columns containing an asterisk so that all files don't have any such columns? Will the fact that an asterisk is a wild card in computer speak make a difference?

ROMANIA_engineer
  • 54,432
  • 29
  • 203
  • 199
Tom Chambers
  • 364
  • 2
  • 4
  • 17
  • As i mentioned on your previous question ... the best way to do this is to record a macro that does what you want on a single file, and then modify it to work on multiple files. This is **the** best way to learn vba. The links i posted previously are also valid for this question ... http://stackoverflow.com/a/31403366/5040941. – 3-14159265358979323846264 Jul 14 '15 at 12:22
  • Hi thanks but the positioning of columns with asterisks changes per file (if they are even present) so what may work on a single file wouldnt necessarily work on another.... thanks for the links and your previous post though, i have looked through them... – Tom Chambers Jul 14 '15 at 12:42

1 Answers1

3

Write a macro that uses filesystemobjects to loop through directory where the spreadsheets are. Loop through each sheet and analyse the column names.

Here is how you would loop through each sheet.

Private Sub CommandButton7_Click()

Dim ws      As Excel.Worksheet
Dim iCol    As Integer
Dim strName As String
Dim iIndex  As Integer

    'Loop through the sheets.
    For iIndex = 1 To Application.Worksheets.Count
        Set ws = Application.Worksheets(iIndex)

        'Loop through the columns.
        For iCol = 1 To ws.UsedRange.Columns.Count
            'Check row 1 of this column for first char of *
            If Left(ws.Cells(1, iCol).Value, 1) = "*" Then
                'We have found a column with the first char of *
                ws.Columns(iCol).EntireColumn.Delete
            End If
        Next iCol

    Next iIndex
    ActiveWorkbook.SaveAs Filename:="C:\temp\newfiles\" & ActiveWorkbook.Name, FileFormat:=xlWorkbookNormal
End Sub

If you want to look for an * anywhere in the cell you could use instr()

Private Sub CommandButton7_Click()

Dim ws      As Excel.Worksheet
Dim iCol    As Integer
Dim strName As String
Dim iIndex  As Integer

    'Loop through the sheets.
    For iIndex = 1 To Application.Worksheets.Count
        Set ws = Application.Worksheets(iIndex)

        'Loop through the columns.
        For iCol = 1 To ws.UsedRange.Columns.Count
            'Check row 1 of this column for the char of *
            If instr(ws.Cells(1, iCol).Value, "*") > 0 Then
                'We have found a column with the char of *
                ws.Columns(iCol).EntireColumn.Delete
            End If
        Next iCol

    Next iIndex
    ActiveWorkbook.SaveAs Filename:="C:\temp\newfiles\" & ActiveWorkbook.Name, FileFormat:=xlWorkbookNormal
End Sub

Here is a basic loop files in a given directory. Hope this gets you there.

Private Sub CommandButton7_Click()
    Dim wb      As Workbook
    Dim ws      As Excel.Worksheet
    Dim iCol    As Integer
    Dim strName As String
    Dim iIndex  As Integer
    Dim strPath As String
    Dim strFile As String

    strPath = "c:\temp\oldfiles\"
    strFile = Dir(strPath & "*.xlsx")

    Do While strFile <> ""

        Set wb = Workbooks.Open(Filename:=strPath & strFile)

        'Loop through the sheets.
        For iIndex = 1 To Application.Worksheets.Count
            Set ws = Application.Worksheets(iIndex)

            'Loop through the columns.
            For iCol = 1 To ws.UsedRange.Columns.Count
                'Check row 1 of this column for the char of *
                If InStr(ws.Cells(1, iCol).Value, "*") > 0 Then
                    'We have found a column with the char of *
                    ws.Columns(iCol).EntireColumn.Delete
                End If
            Next iCol

        Next iIndex
        wb.SaveAs Filename:="C:\temp\newfiles\" & wb.Name, FileFormat:=xlOpenXMLWorkbook
        wb.Close SaveChanges:=False
        strFile = Dir
    Loop

End Sub
MatthewD
  • 6,719
  • 5
  • 22
  • 41
  • that's excellent - thank you. is there a way to also adapt it to look through subfolders? would this then save the file again and close once columns formatted? – Tom Chambers Jul 14 '15 at 12:45
  • 1
    Added a saveas to retain originals and create new files in a defined location with the columns deleted. Subfolders is possible but a bit of work. if reasonable do one folder at a time or move them into one folder. – MatthewD Jul 14 '15 at 12:58
  • Here is a good example of looping though files in a folder. Let me know if you need help adapting this to the above code. http://stackoverflow.com/questions/5851531/code-for-looping-through-all-excel-files-in-a-specified-folder-and-pulling-data – MatthewD Jul 14 '15 at 13:00
  • Hi MatthewD, I tried running the above code, and after removing Set statements and addings parantheses after SaveAs and Close, i stopped getting errors on the script but instead simply get an output msg of: Value cannot be null. Parameter name: solutionDirectory – Tom Chambers Jul 17 '15 at 17:31
  • At what line are you getting the Value cannot be null error? – MatthewD Jul 17 '15 at 18:03
  • ok, i'm getting totally freaked out and confused. I have never used Visual Basic before. I therefore couldnt work out how to actually run the code you kindle gave me. I therefore downloaded and installed Visual Studio. I've tried setting up a console project but get errors. I then read that there is a set of templates for Excel in Visual Studio.... I just want to run the script once to (i) clean all of the files, and then (ii) to do some further manipulation where i copy and paste some of the date in the new cleaned files and output them to two separate workbooks...how do i start...? – Tom Chambers Jul 21 '15 at 15:38
  • Well Visual Studio is a whole other beast. Visual studio today is what is referred to as .NET. If you go that way you will have to use office.interop references which can be a lot to learn for a beginner. VBA is more like VB from Visual studio 6 from 15 years ago. I would stick with VBA. Start from the basics. Learn how loops work, how to open workbooks, etc. Then some of the above will make more sense. – MatthewD Jul 21 '15 at 15:46
  • Thanks MatthewD. How do i just go about using your code then? I put it in a file a name it whatever.vbs and then navigate to that directory through console, but then how do i actually run it - doesnt it need to be a .exe file?? – Tom Chambers Jul 22 '15 at 07:27
  • No, VBA is meant to be run from within a program. Excel has an integrated development environment (IDE) for development of the code. Then it can be executed from the macro tools on the developers tab of the tool ribbon. To open the IDE, right click on a worksheet tab at the bottom of excel, let's say "sheet1". Select "View Code" from the context menu. That will open the IDE. From here you can create forms or modules or add code to the sheets. You'll have to put in the work to learn a bit. I'll put some links here in a minute. It's not hard if you spend a little time with the basics. – MatthewD Jul 22 '15 at 12:28
  • This one is ten years old but the basics have not changed, only where the buttons are in excel. http://nomish.yolasite.com/resources/EXCEL%20VBA%20programing.pdf – MatthewD Jul 22 '15 at 12:35
  • Turn on the developer tab. https://msdn.microsoft.com/en-us/library/bb608625.aspx – MatthewD Jul 22 '15 at 12:35
  • One piece of advice. People say it's best practice to avoid using .select. It is fine to learn to use it. But after you learn how to do things using it, learn how not to use it. See the first answer on this SO post. That is my general style. http://stackoverflow.com/questions/10714251/how-to-avoid-using-select-in-excel-vba-macros – MatthewD Jul 22 '15 at 12:47