1

I'm trying to read the first few characters in large (>15MB) files in excel. Right now, I'm using the typical:

Set MyObject = New Scripting.FileSystemObject
Set mySource = MyObject.GetFolder(mySourcePath)
For Each myFile In mySource.Files
    With New Scripting.FileSystemObject
        With .OpenTextFile(myFile, ForReading)
            test_str = .ReadLine
            'Do things
        End With
    End With
Next

The issue is with large files, I (believe) you're loading into memory the WHOLE thing only to read the first few characters. Is there a way to just extract the first 6 characters?

user2565953
  • 13
  • 1
  • 3
  • 2
    No, it does not load the whole thing. I've just `ReadLine`d an HD movie and Excel memory consumption / IO counts were thery low. – GSerg Sep 03 '13 at 14:18
  • Why do you think that it is loading the whole file into memory? – RBarryYoung Sep 03 '13 at 14:18
  • My issue is that it will loop through the smaller files very quickly, it gets stuck on the larger ones (or as time goes on, perhaps). BUT knowing this now, perhaps this isn't the issue. I'm also using: "If Right(myFile.Name, 3) = "txt" Then" On my directory search to make sure I am dealing with the files I need (and not the pictures). These files are on a network. If I terminate excel before it closes I get the error: "Method 'Name' of object 'IFile' failed" I'll investigate more and report back... – user2565953 Sep 03 '13 at 15:20

1 Answers1

1

An alternative to the FileSystemObject would be ADO

However, your statement

I (believe) you're loading into memory the WHOLE thing only to read the first few characters.

is wrong.

What I think is misleading you is the fact that you are not exiting the loop after you read the first line. You get what you want by reading line by line but you are not closing the file right away. It's a good programmers practice to always close any objects you initiate in your code. Don't just leave it hanging and don't rely on the environment to kill them.

Consider the below code as an alternative to yours and see if there is any efficiency difference

Option Explicit

' add references to Microsoft Scripting Runtime
' Tools >> References >> Microsoft Scripting Runtime
Sub Main()

    Dim fileName As String
    ' make sure to update your path
    fileName = "C:\Users\FoohBooh\Desktop\Project.txt"

    ReadTxtFile fileName


End Sub

Sub ReadTxtFile(fileName)

    Dim oFSO As New FileSystemObject
    Dim oFS As TextStream

    Set oFS = oFSO.OpenTextFile(fileName)

    Dim content As String
    content = oFS.ReadLine

    With Sheets(1).Range("A1")
        .ClearContents
        .NumberFormat = "@"
        .Value = content
    End With

    oFS.Close
    Set oFS = Nothing

End Sub

The above code reads the first line of a .txt file into cell A1 of the first sheet. Remember to set a fileName variable to a full path.

  • `Don't just leave it hanging and don't rely on the environment to *kill* them` - I used to think this way, too. Then I realized it was a) stupid and b) too tedious to get right. E.g. the code you show is no better than the OP's code, because if an exception happens between `Set oFS = ...` and `Set oFS = Nothing`, your cleanup code will not run and you will thus leave the object for the environment to *kill*. VBA has deterministic reference-counting-based destruction and you should rely on it because it's deterministic. Adding explicit `.Close`s only provides false sense of more correct code. – GSerg Sep 03 '13 at 21:29
  • @GSerg obviously it's a demonstration. If I were to go over error handling I might as well just write a book. I am glad you've added your comment - I agree, it's helpful, although error handling is another wide topic we could write about for hours - specially considering the fact that VBA is not great at it comparing to other languages, no try{}, catch{}, finally{} blocks exist in VBA –  Sep 05 '13 at 07:09