0

This is my first post - I hope it is a good one :) A small task for home, is I would like an array of all the file paths in a folder (and it's sub folders) but only for PDF or a file type that I tell it to filter on.

I like arrays better (and it can write to a Range very quickly), I know I can convert my first example of code from a collection to an Array but I would like to learn and understand the logic / syntax of how to implement my example 1 but using Arrays only.

Example 1 works (I left out the other bit of code I use to Debug.Print it) :

Sub GetAllFilePaths(StartFolder As String, Pattern As String, _
             ByRef colFiles As Collection)

    Dim f As String, sf As String, subF As New Collection, S

    If Right(StartFolder, 1) <> "\" Then StartFolder = StartFolder & "\"

    f = Dir(StartFolder & Pattern)
    Do While Len(f) > 0
        colFiles.Add StartFolder & f
        f = Dir()
    Loop

    sf = Dir(StartFolder, vbDirectory)
    Do While Len(sf) > 0
        If sf <> "." And sf <> ".." Then
            If (GetAttr(StartFolder & sf) And vbDirectory) <> 0 Then
                    subF.Add StartFolder & sf
            End If
        End If
        sf = Dir()
    Loop

    For Each S In subF
        GetAllFilePaths CStr(S), Pattern, colFiles
    Next S

End Sub

Example 2 doesn't quite work, it seems to loop in how I want it but overwrites the array not adds to it, so doesn't get all the PDF files I know is in the deep sub folders.

I think it is the way I handled adding to the array, the resizing and at which index I add the new value, I have looked.. everywhere for some help even here Recursive search of file/folder structure, https://excelvirtuoso.net/2017/02/07/multi-dimensional-arrays/, VBA macro that search for file in multiple subfolders,

I know the logic isn't right in the bit but cant seem to figure out it out, any help please..

Example 2 code (I put in how I call it and use Debug.Print to test it):

Option Explicit
Sub GetAllFilePaths(StartFolder As String, Pattern As String, ByRef allFilePaths As Variant, ByRef allFileNames As Variant)
    Dim FNum As Integer
    Dim mainFolder As Object
    Dim pathFile As String
    Dim subFoldersRecurs As New Collection, SubPath
    Dim SubFilePath As String

    If Right(StartFolder, 1) <> "\" Then StartFolder = StartFolder & "\"

    pathFile = Dir(StartFolder & Pattern)
    Do While Len(pathFile) > 0
        FNum = FNum + 1
        ReDim Preserve allFileNames(1 To FNum)
        ReDim Preserve allFilePaths(1 To FNum)
        allFileNames(FNum) = pathFile
        allFilePaths(FNum) = StartFolder & pathFile
        pathFile = Dir()
    Loop


    SubFilePath = Dir(StartFolder, vbDirectory)
    Do While Len(SubFilePath) > 0
        If SubFilePath <> "." And SubFilePath <> ".." Then
            If (GetAttr(StartFolder & SubFilePath) And vbDirectory) <> 0 Then
                subFoldersRecurs.Add StartFolder & SubFilePath
            End If
        End If
        SubFilePath = Dir()
    Loop

    For Each SubPath In subFoldersRecurs
        GetAllFilePaths CStr(SubPath), Pattern, allFilePaths, allFileNames
    Next SubPath

End Sub

Sub PDFfilestoCollall()
Dim pdfFilePaths() As Variant
Dim pdfFileNames() As Variant

Call GetAllFilePaths("C:\Users\adg\Downloads\test folder of files for ingest\", "*.PDF", pdfFilePaths, pdfFileNames)

Dim CollEntry As Variant
For Each CollEntry In pdfFilePaths
    Debug.Print CollEntry

Thanks, ADG

ADG
  • 3
  • 2

1 Answers1

0

I've refactored you code here.

Sub GetAllFilePaths(ByVal StartFolder As String, ByVal Pattern As String, _
    ByRef arrFiles() As String, Optional ByRef AddToArrayAt As Long = -1)

    Dim f As String
    Dim sf As String
    Dim subF As Collection
    Dim S
    Dim AddedFiles As Boolean

    If Right(StartFolder, 1) <> "\" Then StartFolder = StartFolder & "\"
    If AddToArrayAt < 0 Then AddToArrayAt = LBound(arrFiles)

    f = Dir(StartFolder & Pattern)
    Do While Len(f) > 0
        AddedFiles = True
        If AddToArrayAt > UBound(arrFiles) Then ReDim Preserve arrFiles(LBound(arrFiles) To UBound(arrFiles) + 100)
        arrFiles(AddToArrayAt) = StartFolder & f
        AddToArrayAt = AddToArrayAt + 1
        f = Dir()
    Loop
    If AddedFiles Then ReDim Preserve arrFiles(LBound(arrFiles) To AddToArrayAt - 1)

    Set subF = New Collection
    sf = Dir(StartFolder, vbDirectory)
    Do While Len(sf) > 0
        If sf <> "." And sf <> ".." Then
            If (GetAttr(StartFolder & sf) And vbDirectory) <> 0 Then
                subF.Add StartFolder & sf
            End If
        End If
        sf = Dir()
    Loop

    For Each S In subF
        GetAllFilePaths CStr(S), Pattern, arrFiles, AddToArrayAt
    Next S

End Sub

Sub test()
    Dim pdfFileNames() As String

    ReDim pdfFileNames(1 To 100)
    GetAllFilePaths "C:\Data\", "*.PDF", pdfFileNames

    Dim i As Long
    For i = LBound(pdfFileNames) To UBound(pdfFileNames)
        Debug.Print pdfFileNames(i)
    Next

End Sub

Couple of points to note:

  • I'm Redim Preserve'ing the arrFiles array in lots of 100 because this operation is quite slow
  • I've retained a Collection internally for the folders loop, as it's quite convenient and not exposed to the calling routine
  • I've not studied your Dir's so I make no claims on their efficacy or efficiency
chris neilsen
  • 52,446
  • 10
  • 84
  • 123
  • Thanks very much, I had thought I needed to pass by ref the counter so not to overwrite - really glad now see the sorts of checks and balances logic on how to use it in the array. Question: so without changing the code (pdfFileNames(1 to 100)).. If I had 102 PDF files it does in bundles of 100? Or will it continue to increase, it will just pre-making the size of the array. Sorry if the question is a bit silly I hope to be 100 % clear of what you mean. I am going to test it out soon anyway :) – ADG May 08 '18 at 05:43
  • I initialized the array 1 to 100, then in `GetAllFilePaths`expanded it in increments of 100 as needed. So the code will handle any number of files – chris neilsen May 08 '18 at 09:58