0

I really appreciate the time you're taking to read my message, and I understand that Stack overflow's purpose is more for codes that are failing, however I am new and would like to get some of your tips VBA Question: I've seen similar questions asked about long run time when using the file search object (vs. using the directory function). In my case, my run time is over > 3 hours as I am trying to loop through 1000s of subfolders and 100s of files in each one of those subfolders. I am not sure how to apply the answers I read online to the specific code I am using as I have to loop through different subfolders of a folder. Question Edited: I would like to lower the run time of the macro. I believe that the issue here is that the FSO is looping through a lot of subfolders and files that are not meeting the criteria (filename and date). How can I reduce the runtime to avoid the macro to run through all those folders and files? Code purpose: copy/paste two columns from all the "results" files in all the subfolders from Jan 1,2019 to Jan 1, 2020 to the active workbook. Thank you so much for your help,

Please see below my code :

Sub LoopAllSubFolders(FSOFolder As Object)
Dim R0 As Range, R1 As Range, R2 As Range, R3 As Range, R4 As Range, RN0 As Range, RN1 As Range, R5 As Range, RN2 As Range, RN3 As Range
Dim FSOSubFolder As Object
Dim FSOFile As Object
Dim FSOFilepath As String
Dim wb As Workbook
Dim sspec As String
Dim DateY As Date
Dim DateW As Date

'For each subfolder, macro is called'
For Each FSOSubFolder In FSOFolder.SubFolders
DateY = DateSerial(2019, 1, 1)
DateW = DateSerial(2020, 1, 1)
If FSOSubFolder.DateLastModified > DateY Then
If FSOSubFolder.DateLastModified < DateW Then

    LoopAllSubFolders FSOSubFolder

    End If
    End If
Next


For Each FSOFile In FSOFolder.Files
sspec = "Results"
Application.ScreenUpdating = False
Application.DisplayAlerts = False
FSOFilepath = FSOFile.Path
If Right(FSOFilepath, 3) = "csv" Then
If InStr(FSOFilepath, sspec) > 0 Then
If FSOFile.DateLastModified > DateY Then
If FSOSFile.DateLastModified < DateW Then

Set wb = Workbooks.Open(FSOFile.Path)
Set R0 = wb.Sheets(1).Cells(2, 1)
Set R1 = R0.End(xlDown)
Set R2 = Range(R0, R1)
Set R3 = wb.Sheets(1).Cells(2, 2)
Set R4 = R3.End(xlDown)
Set R5 = Range(R3, R4)


Set RN0 = ThisWorkbook.Sheets(1).Cells(1, 1)
Set RN1 = RN0.End(xlDown)
Set RN2 = ThisWorkbook.Sheets(1).Cells(1, 2)
Set RN3 = RN2.End(xlDown)



wb.Sheets(1).Activate
R2.Select
Selection.Copy
ThisWorkbook.Activate
RN0.Select
RN1.Offset(1, 0).Select
ActiveSheet.Paste

wb.Sheets(1).Activate
R5.Select
Selection.Copy
ThisWorkbook.Activate
RN3.Offset(1, 0).Select
ActiveSheet.Paste

wb.Close
Application.CutCopyMode = False
End If
End If
End If
End If
Next FSOFile
ThisWorkbook.Activate
ThisWorkbook.Save

End Sub

Sub loopAllSubFolderSelectStartDirectory()

Dim FSOLibrary As Object
Dim FSOFolder As Object
Dim folderName As String
Dim fileName As String

Application.ScreenUpdating = False
Application.DisplayAlerts = False
ThisWorkbook.Activate

Range("A1").Value = "ID"
Range("A2").Value = "ID"
Range("B1").Value = "Value"
Range("B2").Value = "Value"


'Set the folder name to a variable
folderName = "\\pah1\path2\"

'Set the reference to the FSO Library
Set FSOLibrary = CreateObject("Scripting.FileSystemObject")


'Another Macro call LoopAllSubFolders Macro to start
LoopAllSubFolders FSOLibrary.GetFolder(folderName)


Application.ScreenUpdating = True

ThisWorkbook.Activate
Rows(2).EntireRow.Delete


End Sub
Tim Williams
  • 154,628
  • 8
  • 97
  • 125
  • 1
    [Do not use `.Select` or `.Activate`](https://stackoverflow.com/questions/10714251/how-to-avoid-using-select-in-excel-vba) – Scott Craner Mar 04 '20 at 23:51
  • Use `Value Transfer` instead of `Copy & Paste`. Your question title sounds like you want to increase the run time lol – urdearboy Mar 04 '20 at 23:54
  • Thank you, do I just not use those and the code will still operate correctly or should I substitute them with something else? –  Mar 04 '20 at 23:55
  • @user13009116 - the link Craner shared shows you how to implement. Your code will run more efficiently and will be utilizing 'best practices'. **"Correct" code can still take hours** but removing spread sheet operations (such as unnecessarily selecting or activating objects will definitely help just like using value transfers instead of copy & paste will help). – urdearboy Mar 04 '20 at 23:56
  • Thank you, I didn't see there was a link. I am going to check this. However I believe that the longest steps in the code are looping through all the folders/subfolders and files and not actually pasting the data –  Mar 04 '20 at 23:57
  • There are two main activities going on here, accessing the file system, and manipulating Excel. To focus your efforts on the slowest area, I'd test it by removing all the excel stuff and run just the fso stuff. That will tell you where most of the time is going. That said there are huge opportunities for improvement in the Excel stuff as others have told you. – chris neilsen Mar 05 '20 at 00:04
  • BTW, are you really saying you have 1,000,000's of files to process? – chris neilsen Mar 05 '20 at 00:05
  • What is the objective of your code - in your own words, plain English. I always find it a bit awkward to be given a piece of code that doesn't do what its owner wants with the comment "that's what I want to do". – Variatus Mar 05 '20 at 00:06
  • Chris Neilsen, yes a lot of data. not all the data are processed though out of this huge number : we're only interested in the ones meeting the date and file name criteria. I've run them separately and I believe that the issue here is that the fso is looping through a lot of subfolders and files that are not meeting the criteria. How can I reduce the runtime to avoid the macro to run through all those folders and files ? –  Mar 05 '20 at 01:10
  • Variatus, the objective of the code is to copy 2 different columns in the active workbook. Those columns are copied from all the "results" files in all the subfolders from Jan1, 2019 to Jan 1,2020 in path2 folder. –  Mar 05 '20 at 01:16
  • Basically I have 1000s of subfolders in a folder. I am only interested in 100s (from Jan 1 2019 to Jan 1 2020. In each folder I have 100 files and I am only interested in the Results file. –  Mar 05 '20 at 01:28
  • So in a given run what would be a typical number of files which actually meet your criteria? – Tim Williams Mar 05 '20 at 01:31
  • Typical number of files would be 100. –  Mar 05 '20 at 01:34
  • Probably quicker to use `Dir(folderPath & "*results*.csv")` to find the files of interest than to use the `Folder.Files` loop and FSO. – Tim Williams Mar 05 '20 at 01:42
  • would that work even though the results file is in another subfolder and therefore does not have exactly the same folder path –  Mar 05 '20 at 01:44
  • For input, why don't you copy or move all your files to one folder? Alternatively, consider copying all file names to one array that you can search. You might sort out files that aren't of interest even in the process of creating a searchable folder or array. For output, consider writing everything to an array that you paste to a worksheet when done. – Variatus Mar 05 '20 at 01:53
  • @Variatus, hi and thank you, I am considering to do that but the time to copy the files to one folder is as long, (very heavy files - copy time would be a few hours). For the array comment, I am not sure to understand how I would do that... Sorry I am a beginner :) –  Mar 05 '20 at 01:59
  • No you still need to navigate the subfolders - just switch out the file loop. – Tim Williams Mar 05 '20 at 04:18

2 Answers2

0

Here is a solution using arrays.

Option Explicit

Const StartDate As Date = #1/1/2019#            ' inclusive
Const EndDate As Date = #12/31/2019#            ' inclusive

Private Sub Test()

    Dim Arr() As String
    Dim i As Long

    Arr = ListOfFiles
    For i = 1 To UBound(Arr)
        Debug.Print i, Arr(i)
    Next i

    With ActiveSheet
        .Cells(1, "B").Resize(UBound(Arr)).Value = Application.Transpose(Arr)
    End With
End Sub

Function ListOfFiles() As String()
    ' code by:
    ' https://stackoverflow.com/questions/14245712/cycle-through-sub-folders-and-files-in-a-user-specified-root-directory
    ' modified for this project:
    ' https://stackoverflow.com/questions/60536325/lower-run-time-currently-3-hours-vba-loop-through-specific-subfolders?noredirect=1#comment107097419_60536325
    ' by Variatus @STO 05 Mar 2020

    ' set the start directory as required
    Const StartDir As String = "F:\AWK PC\Drive E (Archive)\PVT Archive\"

    Dim Fun() As String                     ' function return array
    Dim ArrIdx As Long
    Dim RootDir As String
    Dim Fso As FileSystemObject
    Dim FirstFld As Folder
    Dim Fld As Folder
    Dim Fltr As String

    With Application.FileDialog(msoFileDialogFolderPicker)
        .InitialFileName = StartDir
        .AllowMultiSelect = False
        If Not .Show Then
            MsgBox "No folder selected!" & vbCr & _
                   "Exiting script.", vbInformation, "Cancel action"
            Exit Function
        End If
        RootDir = .SelectedItems(1)
    End With

    ReDim Fun(1 To 10000)   ' allow a number of files larger than expected
                            ' it's important to start at 1
    ArrIdx = 0
    Set Fso = New FileSystemObject
    Set FirstFld = Fso.GetFolder(RootDir)
    Fltr = ".cvs"
    ListFiles FirstFld, Fltr, Fun, ArrIdx

    For Each Fld In FirstFld.SubFolders
        ListFiles Fld, Fltr, Fun, ArrIdx
        ListFolders Fld, Fltr, Fun, ArrIdx
    Next Fld

    ReDim Preserve Fun(1 To ArrIdx)
    ListOfFiles = Fun
    Application.StatusBar = "Done"
End Function


Sub ListFolders(FirstFld As Folder, _
                Fltr As String, _
                Fun() As String, _
                Idx As Long)

    Dim Fld As Folder

    For Each Fld In FirstFld.SubFolders
        ListFiles Fld, Fltr, Fun, Idx
        ListFolders Fld, Fltr, Fun, Idx
    Next Fld
End Sub

Sub ListFiles(Fld As Folder, _
              Fltr As String, _
              Fun() As String, _
              Idx As Long)

    Dim ModDate As Date
    Dim Fil As File

    For Each Fil In Fld.Files
        ' exclude temporary files marked with ~ by the system
        With Fil
            If (Right(.Name, 4) = Fltr) And (Asc(.Name) <> 126) Then
                ModDate = Fil.DateLastModified
                ' skip files not within date range
                If (ModDate >= StartDate) And (ModDate <= EndDate) Then
                    Idx = Idx + 1
                    Fun(Idx) = Fld.Path & "\" & .Name
                    If Idx Mod 50 = 1 Then Application.StatusBar = Idx & " files copied."
                End If
            End If
        End With
    Next Fil
End Sub

There are 3 constants to set, StartDate and EndDate at the top of the code sheet and the StartDir in the procedure ListOfFiles. If you don't set the latter the Folderpicker will start in the directory you last used. I also recommend to change the reference to ActiveSheet in Sub Test to point to a blank sheet that you insert in your workbook for testing purposes.

When you're all set run the Test procedure. It will call the function ListOfFiles which goes through all the specified folders and subfolders and returns an array of qualified file names. This list the Test procedure first prints to the Immediate Window and then to column B of the blank worksheet mentioned above. This will give you an idea of what you have and what might be done with it. Your testing should include a check on whether the first and last qualifying files are included in the array and the lists. It's a very popular programming error to cut them off and my testing was limited to the code not crashing.

I tested with about 300 files, extracting 71 of them and it took something like 3 seconds. By that measure your list should be ready in under 2 minutes. There is a progress indicator in the Status Bar.

I don't understand what you want to do with the files but if your intention is to extract data from them please note that you need not necessarily open them for that purpose. I feel that the best way to extract data from a closed CSV file is not within the scope of your present question.

Variatus
  • 14,293
  • 2
  • 14
  • 30
0

Non-recursive approach using Dir():

Sub Tester()
    Dim f
    For Each f In GetFiles("C:\My\Stuff\Analysis\")
        Debug.Print f
        'extract your data
    Next f
End Sub

Function GetFiles(startPath As String) As Collection 'of file paths
    Dim fso As Object, rv As New Collection, colFolders As New Collection
    Dim fPath As String, subFolder As Object, f, dMin, dMax, dtMod

    Set fso = CreateObject("Scripting.FileSystemObject")

    dMin = DateSerial(2019, 1, 1)
    dMax = DateSerial(2020, 1, 1)

    colFolders.Add startPath

    Do While colFolders.Count > 0
        fPath = colFolders(1)
        colFolders.Remove 1
        'process subfolders
        For Each subFolder In fso.getfolder(fPath).subfolders
            dtMod = subFolder.DateLastModified
            If dtMod > dMin And dtMod < dMax Then
                colFolders.Add subFolder.Path
            End If
        Next subFolder
        'process files
        f = Dir(fso.buildpath(fPath, "*Results*.csv"), vbNormal)
        Do While f <> ""
            f = fso.buildpath(fPath, f)
            dtMod = FileDateTime(f)
            If dtMod > dMin And dtMod < dMax Then
                rv.Add f
            End If
            f = Dir()
        Loop
    Loop
    Set GetFiles = rv
End Function

Looks like you're using a network file share, so it's possible the poor performance could be due in part to working with a non-local drive.

Tim Williams
  • 154,628
  • 8
  • 97
  • 125