0

I have many documents in a folder and a similar but different list in an Excel file. The documents in the folder are not always name correctly, but the value in one of the cells has the accurate name.

END GOAL: what I want to do is have code that runs through that folder, opens each file, looks at the file name in a cell*(code for that part below)* and compare it to Column A in the other Excel file, ACTIVE_FILES.xls. If it is in the list, it will move on to the next file. If it is not in the list, it will delete that file from the folder.

I already have working code which loops though a folder to open files and output information from them. I just do not know how to do a comparisson to a separate Excel worksheet or how to delete a file from a folder if it is not present.

CURRENT CODE:

This is how my current code starts out with looping through the folder (hard coded into MyFolder) to open files:

Option Explicit

Sub Active()


Sub LoopThroughDirectory()

    Dim objFSO As Object, objFolder As Object, objFile As Object, dict As Object
    Dim MyFolder As String
    Dim StartSht As Worksheet, ws As Worksheet
    Dim WB As Workbook
    Dim i As Integer
    Dim hc As Range, hc1 As Range, hc2 As Range, hc3 As Range, hc4 As Range, d As Range, TDS As Range

    Set StartSht = Workbooks("masterfile.xlsm").Sheets("Sheet1")
    'turn screen updating off - makes program faster
    Application.ScreenUpdating = False

    'location of the folder in which the desired TDS files are
    MyFolder = "C:\Users\trembos\Documents\TDS2\progress\"

    'find the header
    Set hc4 = HeaderCell(StartSht.Range("A1"), "TOOLING DATA SHEET (TDS):")

    'create an instance of the FileSystemObject
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    'get the folder object
    Set objFolder = objFSO.GetFolder(MyFolder)
    i = 2
    'loop through directory file and print names
'(1)
    'code for every excel file in the specified folder
    For Each objFile In objFolder.Files
        If LCase(Right(objFile.Name, 3)) = "xls" Or LCase(Left(Right(objFile.Name, 4), 3)) = "xls" Then
'(2)
            'Open folder and file name, do not update links
            Set WB = Workbooks.Open(Filename:=MyFolder & objFile.Name, UpdateLinks:=0)
            Set ws = WB.ActiveSheet

Then, this is how I grab the cell value which contains the file name I am looking for

(searches for header "TOOLING DATA SHEET (TDS):" and then grabs the value of the cell to the right of that header cell. In my previous code, it would then print it to the next available row in column C which is no longer needed but I kept in to show my GetLastRowInColumn function which could help search through column A in the plan I want to execute)

With ws
'Print TDS name by searching for header
    If Not ws.Range("A1:K1").Find(What:="TOOLING DATA SHEET (TDS):", LookAt:=xlWhole, LookIn:=xlValues) Is Nothing Then
        Set TDS = ws.Range("A1:K1").Find(What:="TOOLING DATA SHEET (TDS):", LookAt:=xlWhole, LookIn:=xlValues).Offset(, 1)
        StartSht.Range(StartSht.Cells(i, 1), StartSht.Cells(GetLastRowInColumn(StartSht, "C"), 1)) = TDS
    Else                
    End If
    i = GetLastRowInSheet(StartSht) + 1
End With

And finally, here are my functions which help make it all possible. Numbers designate a new function and there is an explanation next to each one.

'(8)
'Get the Values from columns with specified headers
Function GetValues(ch As Range, Optional vSplit As Variant) As Scripting.Dictionary
    Dim dict As Scripting.Dictionary
    Dim dataRange As Range, cell As Range
    Dim theValue As String
    Dim splitValues As Variant
    Dim counter As Long
Set dict = New Scripting.Dictionary
Set dataRange = ch.Parent.Range(ch, ch.Parent.Cells(Rows.count, ch.Column).End(xlUp)).Cells
' If there are no values in this column then return an empty dictionary
' If there are no values in this column, the dataRange will start at the row
' *above* ch and end at ch
If (dataRange.Row = (ch.Row - 1)) And (dataRange.Rows.count = 2) And (Trim(ch.Value) = "") Then
    GoTo Exit_Function
End If
For Each cell In dataRange.Cells
    counter = counter + 1
    theValue = Trim(cell.Value)
    If Len(theValue) = 0 Then
        theValue = " "
    End If
        'exclude any info after ";"
        If Not IsMissing(vSplit) Then
            splitValues = Split(theValue, ";")
            theValue = splitValues(0)
        End If
        'exclude any info after ","
        If Not IsMissing(vSplit) Then
            splitValues = Split(theValue, ",")
            theValue = splitValues(0)
        End If
        If Not dict.exists(theValue) Then
        dict.Add counter, theValue
        End If
Next cell
Exit_Function:
Set GetValues = dict
 End Function
'(9)
'find a header on a row: returns Nothing if not found
Function HeaderCell(rng As Range, sHeader As String) As Range
    Dim rv As Range, c As Range
    For Each c In rng.Parent.Range(rng, rng.Parent.Cells(rng.Row, Columns.count).End(xlToLeft)).Cells
        'copy cell value if it contains some string "holder" or "cutting tool"
        If Trim(c.Value) = sHeader Then
        'If InStr(c.Value, sHeader) <> 0 Then
            Set rv = c
            Exit For
        End If
    Next c
    Set HeaderCell = rv
End Function
'(10)
'gets the last row in designated column
Function GetLastRowInColumn(theWorksheet As Worksheet, col As String)
    With theWorksheet
        GetLastRowInColumn = .Range(col & .Rows.count).End(xlUp).Row
    End With
End Function
'(11)
'gets the last row in designated sheet
Function GetLastRowInSheet(theWorksheet As Worksheet)
Dim ret
    With theWorksheet
        If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
            ret = .Cells.Find(What:="*", _
                          After:=.Range("A1"), _
                          LookAt:=xlPart, _
                          LookIn:=xlFormulas, _
                          SearchOrder:=xlByRows, _
                          SearchDirection:=xlPrevious, _
                          MatchCase:=False).Row
        Else
            ret = 1
        End If
    End With
    GetLastRowInSheet = ret
End Function

EDIT TO SHOW NEW WORK

POTENTIAL CODE 1: moving unwanted files to another folder - not working, basic outline because I do not know how to compare what I stated above to test a run

Option Explicit
' 33333

Sub Activate()

    Dim objFSO As Object, objFolder As Object, objFile As Object, dict As Object
    Dim MyFolder As String
    Dim StartSht As Worksheet, ws As Worksheet
    Dim WB As Workbook, wbkA As Workbook
    Dim row As Long, col As Long
    Dim LastRow As Long
    Dim TDS1 As Object




    Dim i As Integer
    Dim hc As Range, hc1 As Range, hc2 As Range, hc3 As Range, hc4 As Range, d As Range, TDS As Range

    Set StartSht = Workbooks("Active.xlsm").Sheets("Sheet1")
    'turn screen updating off - makes program faster
'    Application.ScreenUpdating = False

    'location of the folder in which the desired TDS files are
    MyFolder = "C:\Users\trembos\Documents\TDS2\progress_test\"

    'find the headers on the sheet
    Set hc4 = HeaderCell(StartSht.Range("A1"), "TOOLING DATA SHEET (TDS):")

    'create an instance of the FileSystemObject
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    'get the folder object
    Set objFolder = objFSO.GetFolder(MyFolder)
    i = 2
    'loop through directory file and print names
'(1)
    'code for every excel file in the specified folder
    For Each objFile In objFolder.Files
        If LCase(Right(objFile.Name, 3)) = "xls" Or LCase(Left(Right(objFile.Name, 4), 3)) = "xls" Then

        Set wbkA = Workbooks.Open(FileName:="C:\Users\trembos\Documents\TDS2\TDS_ACTIVE_FILES.xls")

        For row = 1 To LastRow
            With WB
                If wbkA.Cells(row, 1).Value <> GetFilenameWithoutExtension(objFile.Name) Then
                ElseIf row = LastRow And wbkA.Cells(row, col) <> TDS.Value Then
                    StartSht.Cells(i, 1) = GetFilenameWithoutExtension(objFile.Name)
                    i = GetLastRowInSheet(StartSht) + 1
                End If
            End With
        Next

    End If
    Next
user4888
  • 101
  • 1
  • 12
  • Once you opened a file you grab that value. If you store that value in an object you can loop through your ACTIVE_FILES list and look for NOT a match. If that is the case you can delete the file, else you close the file and open the next one. – Luuklag Jul 31 '15 at 12:29
  • @Luuklag perfect! that is exactly what I am trying to do. If you look at my code I find the value and set it under the name TDS and then call it as TDS.Value, is that what you mean by store the value? My biggest issue is figuring out how to compare them so I do have two attempts above of trying to set the ACTIVE_FILES workbook under something and then I do not know how to take the store value and search through an entire column of ACTIVE_FILES. any ideas? – user4888 Jul 31 '15 at 12:47

2 Answers2

1

You can set the workbook ACTIVE_FILES is in as a workbook object. So perhaps you call it WBREF, and also name the worksheet ACTIVE_FILES as a worksheet object, like WSREF. Then you can code something like:

For row = 1 to LastRow 
    IF WBREF.WSREF.Cells(row, *# of column in which your data is*). Value = TDS.Value Then 
        * close file* 
        Exit For 
    ElseIf row = LastRow And WBREF.WSREF.Cells(row,col) <> TDS.Value THEN 
        code how to delete file
    End If
Next row

Edit: Let me explain what this code does: For all rows in column 1 (you should code that LastRow, just search for it on this site and you will find how to do that) it checks if the content of the cell matches the value of TDS. If it finds a match it closes the file and stops looking. If the first row is not a match, it moves to the second row etc. etc. If it arrives at the last row (this is the part of code after ElseIf) and this row is also not a match you code here how to delete the file.

So you would need to place this loop of code within the loop that you have that extracts the TDS, right after that it needs to run this, before it moves on to the next TDS.

Luuklag
  • 3,897
  • 11
  • 38
  • 57
  • Just for clarification, the cell in the opening file is TDS. What I am comparing that value to is one worksheet titled ACTIVE_FILES. My goal is to look at the TDS.Value, see if it is present in column 1 of ACTIVE_FILES, if yes, move on, if no, delete the actual file that is opened, not the file name from the ACTIVE_FILES sheet. Is that what this code is doing? Because it seems to be working the opposite way and I just wanted to clarify. Also, would the .Cells(row, 1) look through all of column 1 until the last occupied row to see if there is a match? There are at least 4000 rows @Luuklag – user4888 Jul 31 '15 at 13:25
  • I am also unsure as to why you would save ACTIVE_FILES as both a workbook object and a worksheet object. Could you explain that to me? In my First Attempt section, I have it saved as wbkA, is that what you would advise? – user4888 Jul 31 '15 at 13:27
  • Well not save as both, but save the workbook that contains that worksheets as an object, and that worksheet itself. Then it is easier to refer to. Otherwise you will have to refer to it using: `Workbooks("name of workbook").Worksheets("ACTIVE_FILES")` – Luuklag Jul 31 '15 at 13:29
  • Maybe I am confusing the meaning of workbook and worksheet. ACTIVE_FILES is just an excel document that has a list of files names down the page (with some other superfluous info in the columns next to it) – user4888 Jul 31 '15 at 13:33
  • then your Workbook is called ACTIVE_FILES.xlsx? and then the worksheet is called Sheet1 ? – Luuklag Jul 31 '15 at 13:34
  • I'm sorry, I did not omit that so I have not been explaining it well. I will edit my code. I hope this explanation makes more sense. I'm very sorry for the confusion due to my poor explantion before. Okay so I have one normal Excel file with a list of about 4000 names. I have created a program which loops through a folder and gets the name of each file from a specific cell. What I am trying to do is see if the name it gets from the file in the folder matches any name in that list of 4000. If it does, it can close and next. If it does not match, I want it to delete the file from the folder – user4888 Jul 31 '15 at 13:40
  • I have edited my question so that the last section of code is what I have put together based off of our discussion. Does this look like what you had in mind? (I changed 'delete' to moving files to a new folder) – user4888 Jul 31 '15 at 13:48
  • It's almost there. But you need to replace the else with ElseIf as I have done in my answer, otherwise it will only look at the first row of the list. – Luuklag Jul 31 '15 at 14:10
  • Okay, I tried it out and, when stepping through it, when it gets to the line `For row = 1 To LastRow`, its next move jumps to `End If` of that whole section starting with `If Not ws.Range`. Do you know what I am doing wrong? @Luuklag – user4888 Jul 31 '15 at 14:27
  • Yes, you have next, right after If. So If your if statement returns a True, which will be in most cases seeing how you wrote it, it jumps to the next, and therefore just out of you loop. You should write it like I did. You need to look for that match, and not for non matching values. – Luuklag Jul 31 '15 at 14:36
  • I have edited my current attempt to what I think follows your logic more.. but the same jump from For row keeps happening, it will not enter that If wbkA.Cells line @Luuklag – user4888 Aug 03 '15 at 13:58
  • @user4888 what happens if you remove the With WB part? – Luuklag Aug 03 '15 at 14:01
0

Your question is a bit long, but I think you might use the function GetInfoFromClosedFile() described here on SO.

Community
  • 1
  • 1
iDevlop
  • 24,841
  • 11
  • 90
  • 149
  • I apologize for the long question. I tried to organize it so the first two paragraphs were the pertinent information and the rest was just my code IF you needed it to go further. I appreciate the link as it definitely helps to outline what I am doing; however, my problem is not reading the information it is the actual comparisson across workbooks that I do not know how to do. With each opening file, I am looking for a new name so it is not as simple as looking for the same string every time. That is the part I am confused about. Does that make sense? – user4888 Jul 31 '15 at 12:51