3

I have started to use the fso object in order to overcome the 2GB limit of VBA. Everything looks satisfactory for my purposes, except that I can not find a way to go backwards in the textstream files. For going forward I have used read(no of chars) and skip(no of chars). Is there a way to do so ?

Demetres
  • 127
  • 2
  • 10
  • FSO is a forward only. The best thing that you can do is to store the data in array. – Pankaj Jaju Jan 02 '14 at 17:27
  • @PankajJaju. There is no way to store some GB's of data in arrays. We are speaking for millons of strings of at least 220 chars in length. – Demetres Jan 02 '14 at 17:30
  • To the best of my knowledge, FSO would not allow you to `SEEK` records backwards. As soon as the line is gone, its gone for good. You would have to reopen the file to go back (restart reading the file) unless you store the required data somewhere else. Array was just a suggestion. – Pankaj Jaju Jan 02 '14 at 17:36

2 Answers2

2

I ran into the same frustrating limitation. Here is a class that wraps the native Windows API to perform File IO. As noted, it is based on the example on msdn at http://support.microsoft.com/kb/189981. I haven't finished testing it thoroughly, so if you find any issues, let me know so I can fix them for both our benefit. As a side note, the CanRead, CanWrite stuff is there so I can eventually implement a stream interface, but that's a future project.

Option Compare Database
Option Explicit

'Based on the example on msdn:
'http://support.microsoft.com/kb/189981

'Some of the constants come from Winnt.h

Public Enum FileAccess
'    FILE_READ_DATA = &H1                     ' winnt.h:1801
'    'FILE_LIST_DIRECTORY = &H1                ' winnt.h:1802
'    FILE_WRITE_DATA = &H2                    ' winnt.h:1804
'    'FILE_ADD_FILE = &H2                      ' winnt.h:1805
'    FILE_APPEND_DATA = &H4                   ' winnt.h:1807
'    'FILE_ADD_SUBDIRECTORY = &H4              ' winnt.h:1808
'    'FILE_CREATE_PIPE_INSTANCE = &H4          ' winnt.h:1809
'    FILE_READ_EA = &H8                       ' winnt.h:1811
'    FILE_READ_PROPERTIES = &H8               ' winnt.h:1812
'    FILE_WRITE_EA = &H10                     ' winnt.h:1814
'    FILE_WRITE_PROPERTIES = &H10             ' winnt.h:1815
'    FILE_EXECUTE = &H20                      ' winnt.h:1817
'    'FILE_TRAVERSE = &H20                     ' winnt.h:1818
'    'FILE_DELETE_CHILD = &H40                 ' winnt.h:1820
'    FILE_READ_ATTRIBUTES = &H80              ' winnt.h:1822
'    FILE_WRITE_ATTRIBUTES = &H100            ' winnt.h:1824
    FILE_ALL_ACCESS = &H1F01FF               ' winnt.h:1826
    FILE_GENERIC_READ = &H120089             ' winnt.h:1828
    FILE_GENERIC_WRITE = &H120116            ' winnt.h:1835
'    FILE_GENERIC_EXECUTE = &H1200A0          ' winnt.h:1843
'    FILE_SHARE_READ = &H1                    ' winnt.h:1848
'    FILE_SHARE_WRITE = &H2                   ' winnt.h:1849
'    FILE_NOTIFY_CHANGE_FILE_NAME = &H1       ' winnt.h:1860
'    FILE_NOTIFY_CHANGE_DIR_NAME = &H2        ' winnt.h:1861
'    FILE_NOTIFY_CHANGE_ATTRIBUTES = &H4      ' winnt.h:1862
'    FILE_NOTIFY_CHANGE_SIZE = &H8            ' winnt.h:1863
'    FILE_NOTIFY_CHANGE_LAST_WRITE = &H10     ' winnt.h:1864
'    FILE_NOTIFY_CHANGE_SECURITY = &H100      ' winnt.h:1865
'    'MAILSLOT_NO_MESSAGE = -1                ' winnt.h:1866
'    'MAILSLOT_WAIT_FOREVER = -1              ' winnt.h:1867
'    FILE_CASE_SENSITIVE_SEARCH = &H1         ' winnt.h:1868
'    FILE_CASE_PRESERVED_NAMES = &H2          ' winnt.h:1869
'    FILE_UNICODE_ON_DISK = &H4               ' winnt.h:1870
'    FILE_PERSISTENT_ACLS = &H8               ' winnt.h:1871
'    FILE_FILE_COMPRESSION = &H10             ' winnt.h:1872
'    FILE_VOLUME_IS_COMPRESSED = &H8000       ' winnt.h:1873
'    IO_COMPLETION_MODIFY_STATE = &H2         ' winnt.h:1874
'    IO_COMPLETION_ALL_ACCESS = &H1F0003      ' winnt.h:1875
'    DUPLICATE_CLOSE_SOURCE = &H1             ' winnt.h:1876
'    DUPLICATE_SAME_ACCESS = &H2              ' winnt.h:1877
'    DELETE = &H10000                         ' winnt.h:1935
'    READ_CONTROL = &H20000                   ' winnt.h:1936
'    WRITE_DAC = &H40000                      ' winnt.h:1937
'    WRITE_OWNER = &H80000                    ' winnt.h:1938
'    SYNCHRONIZE = &H100000                   ' winnt.h:1939
'    STANDARD_RIGHTS_REQUIRED = &HF0000       ' winnt.h:1941
'    STANDARD_RIGHTS_READ = &H20000           ' winnt.h:1943
'    STANDARD_RIGHTS_WRITE = &H20000          ' winnt.h:1944
'    STANDARD_RIGHTS_EXECUTE = &H20000        ' winnt.h:1945
'    STANDARD_RIGHTS_ALL = &H1F0000           ' winnt.h:1947
'    SPECIFIC_RIGHTS_ALL = &HFFFF             ' winnt.h:1949
'    ACCESS_SYSTEM_SECURITY = &H1000000
End Enum


Public Enum FileShare
    NONE = &H0
    FILE_SHARE_DELETE = &H4
    FILE_SHARE_READ = &H1
    FILE_SHARE_WRITE = &H2
End Enum


Public Enum FileCreationDisposition
    CREATE_ALWAYS = &H2
    CREATE_NEW = &H1
    OPEN_ALWAYS = &H4
    OPEN_EXISTING = &H3
    TRUNCATE_EXISTING = &H5
End Enum


'Public Enum FileFlagsAndAttributes
'    'Attributes
'    FILE_ATTRIBUTE_ENCRYPTED = &H4000
'    FILE_ATTRIBUTE_READONLY = &H1            ' winnt.h:1850
'    FILE_ATTRIBUTE_HIDDEN = &H2              ' winnt.h:1851
'    FILE_ATTRIBUTE_SYSTEM = &H4              ' winnt.h:1852
'    FILE_ATTRIBUTE_DIRECTORY = &H10          ' winnt.h:1853
'    FILE_ATTRIBUTE_ARCHIVE = &H20            ' winnt.h:1854
'    FILE_ATTRIBUTE_NORMAL = &H80             ' winnt.h:1855
'    FILE_ATTRIBUTE_TEMPORARY = &H100         ' winnt.h:1856
'    FILE_ATTRIBUTE_ATOMIC_WRITE = &H200      ' winnt.h:1857
'    FILE_ATTRIBUTE_XACTION_WRITE = &H400     ' winnt.h:1858
'    FILE_ATTRIBUTE_COMPRESSED = &H800        ' winnt.h:1859
'    'Flags
'    FILE_FLAG_BACKUP_SEMANTICS = &H2000000
'    FILE_FLAG_DELETE_ON_CLOSE = &H4000000
'    FILE_FLAG_NO_BUFFERING = &H20000000
'    FILE_FLAG_OPEN_NO_RECALL = &H100000
'    FILE_FLAG_OPEN_REPARSE_POINT = &H200000
'    FILE_FLAG_OVERLAPPED = &H40000000
'    FILE_FLAG_POSIX_SEMANTICS = &H100000
'End Enum


Private Const INVALID_FILE_HANDLE = -1 '&HFFFFFFFF
Private Const FORMAT_MESSAGE_FROM_SYSTEM = &H1000
Private Const INVALID_FILE_SIZE As Long = -1 '&HFFFFFFFF
Private Const INVALID_SET_FILE_POINTER As Long = -1 '&HFFFFFFFF


Private Declare Function FormatMessage Lib "Kernel32" Alias "FormatMessageA" (ByVal dwFlags As Long, _
                                                                              lpSource As Long, _
                                                                              ByVal dwMessageId As Long, _
                                                                              ByVal dwLanguageId As Long, _
                                                                              ByVal lpBuffer As String, _
                                                                              ByVal nSize As Long, _
                                                                              Arguments As Any) As Long


Private Declare Function CreateFile Lib "Kernel32" Alias "CreateFileA" (ByVal lpFileName As String, _
                                                                        ByVal dwDesiredAccess As Long, _
                                                                        ByVal dwShareMode As Long, _
                                                                        lpSecurityAttributes As Long, _
                                                                        ByVal dwCreationDisposition As Long, _
                                                                        ByVal dwFlagsAndAttributes As Long, _
                                                                        hTemplateFile As Long) As Long


Private Declare Function SetFilePointer Lib "Kernel32" (ByVal hFile As Long, _
                                               ByVal lDistanceToMove As Long, _
                                               lpDistanceToMoveHigh As Long, _
                                               ByVal dwMoveMethod As Long) As Long


Private Declare Function ReadFile Lib "Kernel32" (ByVal hFile As Long, _
                                                  lpBuffer As Any, _
                                                  ByVal nNumberOfBytesToRead As Long, _
                                                  lpNumberOfBytesRead As Long, _
                                                  ByVal lpOverlapped As Long) As Long


Private Declare Function WriteFile Lib "Kernel32" (ByVal hFile As Long, _
                                                   lpBuffer As Any, _
                                                   ByVal nNumberOfBytesToWrite As Long, _
                                                   lpNumberOfBytesWritten As Long, _
                                                   ByVal lpOverlapped As Long) As Long


Private Declare Function FlushFileBuffers Lib "Kernel32" (ByVal hFile As Long) As Long


Private Declare Function GetFileSize Lib "Kernel32" (ByVal hFile As Long, _
                                                     lpFileSizeHigh As Long) As Long


Private Declare Function CloseHandle Lib "Kernel32" (ByVal hObject As Long) As Long

Private m_Handle As Long

Private Sub Class_Terminate()
    If Not m_Handle = 0 Then
        Flush
        CloseFile
    End If
End Sub

Public Sub OpenFile(path As String, Optional access As FileAccess = FileAccess.FILE_GENERIC_READ, Optional share As FileShare = FileShare.NONE, Optional CreationDisposition As FileCreationDisposition = FileCreationDisposition.OPEN_ALWAYS)
    Dim Ret As Long
    Ret = CreateFile(path, access, share, ByVal 0&, CreationDisposition, 0&, ByVal 0&)
    If Ret = INVALID_FILE_HANDLE Then
        Err.Raise vbObjectError + Err.LastDllError, "clsFile.OpenFile", DecodeAPIErrors(Err.LastDllError)
    Else
        m_Handle = Ret
    End If
End Sub

'Properties

Public Property Get Length() As Double
    Dim Ret As Currency
    Dim FileSizeHigh As Long
    Ret = GetFileSize(m_Handle, FileSizeHigh)
    If Not Ret = INVALID_FILE_SIZE Then
        Length = Ret
    Else
        Err.Raise vbObjectError + Err.LastDllError, "clsFile.Length", DecodeAPIErrors(Err.LastDllError)
    End If
End Property

Public Property Get Position() As Long
    Dim Ret As Long
    Dim DistanceToMoveHigh As Long
    Ret = SetFilePointer(m_Handle, 0&, DistanceToMoveHigh, 1&) '1 is FILE_CURRENT
    If DistanceToMoveHigh = 0 Then
        If Ret = -1 Then
            Position = -1 'EOF'
        Else
            Position = Ret
        End If
    Else
        Class_Terminate
        Err.Raise vbObjectError + Err.LastDllError, "clsFile.Position", DecodeAPIErrors(Err.LastDllError)
    End If
End Property

Public Property Get Handle() As Long
    Handle = m_Handle
End Property

'Functions

Public Function ReadBytes(ByRef buffer() As Byte, ByVal buffer_offset As Long, ByVal count As Long) As Long
    Dim Ret As Long
    Dim BytesRead As Long
    Ret = ReadFile(m_Handle, buffer(buffer_offset), count, BytesRead, 0&)
    If Ret = 1 Then
        ReadBytes = BytesRead
    Else
        Class_Terminate
        Err.Raise vbObjectError + Err.LastDllError, "clsFile.ReadBytes", DecodeAPIErrors(Err.LastDllError)
    End If
End Function

Public Function ReadBytesPtr(ByVal ptrBuf As Long, ByVal buffer_offset As Long, ByVal count As Long) As Long
    Dim Ret As Long
    Dim BytesRead As Long
    Ret = ReadFile(m_Handle, ByVal ptrBuf + buffer_offset, count, BytesRead, 0&)
    If Ret = 1 Then
        ReadBytesPtr = BytesRead
    Else
        Class_Terminate
        Err.Raise vbObjectError + Err.LastDllError, "clsFileStream.ReadBytesPtr", DecodeAPIErrors(Err.LastDllError)
    End If
End Function

Public Function WriteBytes(ByRef buffer() As Byte, ByVal buffer_offset As Long, ByVal count As Long) As Long
    Dim Ret As Long
    Dim BytesWritten As Long
    Ret = WriteFile(m_Handle, buffer(buffer_offset), count, BytesWritten, 0&)
    If Ret = 1 Then
        WriteBytes = BytesWritten
    Else
        Class_Terminate
        Err.Raise vbObjectError + Err.LastDllError, "clsFile.WriteBytes", DecodeAPIErrors(Err.LastDllError)
    End If
End Function

Public Function WriteBytesPtr(ByVal ptrBuf As Long, ByVal buffer_offset As Long, ByVal count As Long) As Long
    Dim Ret As Long
    Dim BytesWritten As Long
    Ret = WriteFile(m_Handle, ByVal ptrBuf + buffer_offset, count, BytesWritten, 0&)
    If Ret = 1 Then
        WriteBytesPtr = BytesWritten
    Else
        Class_Terminate
        Err.Raise vbObjectError + Err.LastDllError, "clsFile.WriteBytes", DecodeAPIErrors(Err.LastDllError)
    End If
End Function

Public Function SeekFile(ByVal LoBytesOffset As Long, origin As SeekOrigin) As Long
    Dim Ret As Long
    Dim HiBytesOffset As Long
    Ret = SetFilePointer(m_Handle, LoBytesOffset, HiBytesOffset, origin)
    If Not Ret = INVALID_SET_FILE_POINTER Then
        SeekFile = Ret
    Else
        Err.Raise vbObjectError + Err.LastDllError, "clsFile.SeekFile", DecodeAPIErrors(Err.LastDllError)
    End If
End Function

Public Function SeekFile64bit(ByVal offset As Currency, origin As SeekOrigin) As Currency
'Take care with this function. A Currency variable is an 8-byte (64-bit) scaled (by 10,000) fixed-point number.'
'This means that setting a Currency variable to 0.0001 is the equivalent of a binary value of 1.'
'If you want to set an offset with an immediate value, write it like so:'
'1073741824 Bytes (1 GB) would be 107374.1824@, where @ is the symbol for an immediate Currency value.'
'Refer to http://support.microsoft.com/kb/189862 for hints on how to do 64-bit arithmetic'
    Dim Ret As Long
    Dim curFilePosition As Currency
    Dim LoBytesOffset As Long, HiBytesOffset As Long

    CopyMemory VarPtr(HiBytesOffset), VarPtr(offset) + 4, 4
    CopyMemory VarPtr(LoBytesOffset), VarPtr(offset), 4

    Ret = SetFilePointer(m_Handle, LoBytesOffset, HiBytesOffset, origin)

    CopyMemory VarPtr(curFilePosition) + 4, VarPtr(HiBytesOffset), 4
    CopyMemory VarPtr(curFilePosition), VarPtr(Ret), 4

    SeekFileCurrency = curFilePosition
End Function

Public Sub CloseFile()
    Dim Ret As Long
    Ret = CloseHandle(m_Handle)
    m_Handle = 0
End Sub

Public Sub Flush()
    Dim Ret As Long
    Ret = FlushFileBuffers(m_Handle)
End Sub

 '***********************************************************************************
' Helper function, from Microsoft page as noted at top
Private Function DecodeAPIErrors(ByVal ErrorCode As Long) As String
    Dim sMessage As String, MessageLength As Long
    sMessage = Space$(256)
    MessageLength = FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM, 0&, _
                                  ErrorCode, 0&, sMessage, 256&, 0&)
    If MessageLength > 0 Then
        DecodeAPIErrors = Left(sMessage, MessageLength)
    Else
        DecodeAPIErrors = "Unknown Error."
    End If
End Function

And here's an example of how to use it:

Public Sub Main()
    Dim oFile As clsFile
    Set oFile = New clsFile

    oFile.OpenFile "C:\YourFilePathHere", FILE_GENERIC_READ, NONE, OPEN_EXISTING

    Dim ChunkOfData() As Byte
    Const CHUNKSIZE As Long = 4096
    ReDim ChunkOfData(0 To CHUNKSIZE - 1)

    Dim lngCurrChunk As Long
    Dim lngBytesRead As Double


    'The SeekFile function works for seeks forward or backward in the file from [-2GB to +2GB).'
    'Past that you can use the SeekFile64bit function, but you'll have to be aware of the issues with using Currency to store the 64-bit number'
    Debug.Print oFile.SeekFile(&H40000000, so_Current) 'A 1GB seek

    lngBytesRead = oFile.ReadBytes(ChunkOfData, 0, CHUNKSIZE)
    While lngBytesRead > 0 'As soon as a call to ReadBytes returns 0, we've reached the end of the file.
        'Do something with the 4k chunk of data.  The buffer gets reused in this example.
        'Debug.Print ChunkOfData
        lngCurrChunk = lngCurrChunk + 1
        lngBytesRead = oFile.ReadBytes(ChunkOfData, 0, CHUNKSIZE)
    Wend
    MsgBox "Complete!"
End Sub
Blackhawk
  • 5,984
  • 4
  • 27
  • 56
  • can you supply with an example of the usage of the above ? e.g. Open, position, read/write – Demetres Jan 02 '14 at 19:34
  • @Demetres I'm cleaning it up and adding some examples. Question though, how big of a file are we talking? The SetFilePointer function (lets you move within the file) can accept two 32-bit parameters (high and lo) that together represent a 64-bit address, but that complicates the class a bit since I would have to accept a double or something and then convert to two Longs. 32-bits can address ~4GB of space (2^32 bytes) - is that enough? By comparison, 64-bits can address ~16 Exabytes (2^64 bytes). – Blackhawk Jan 02 '14 at 21:10
  • @Demetres In fact, if you could provide more details on what you are trying to do (large text file? Will you be parsing it? Binary data? Fixed-length records or delimited?) I can make a clearer example. – Blackhawk Jan 02 '14 at 21:23
  • Really many thanks for trying to help. The file can be more than 4GB (9,6 GB). It consists of 0-9, A-F chars and I would like to search it. I have implemented it right now as fixed-length "random" file and I do binary search on it. In order to have it sorted, I do convert it to a delimited one, sort it with windows sort.exe and back again to fixed-length. – Demetres Jan 02 '14 at 21:39
  • @Demetres I tested the code reading through a 12GB file, so it's working fine. There are a couple things you need to be aware of. Seeking is done with either the SeekFile or SeekFile64bit function. In both cases, the number of bytes is signed 2's complement. The first is 32bit with a limit of [-2GB, +2GB) and the 64bit has a limit of something quite rediculous you'll never encounter. For the second, the 64bit value is passed as a Currency, which is a scaled fixed-point number. If you want to display it, you can convert it to a string like so: `Replace(Format(currency, "0.000"), ".", "")` – Blackhawk Jan 03 '14 at 17:05
  • Thank you for your valuable help. – Demetres Jan 04 '14 at 14:32
  • @PankajJaju Yeah, win32 api stuff always is :) If you are actually using this for a project, I can add a "ReadLine" function and an "EOF" property in there if that would be useful. For my purposes I did not because I'm working with files large enough that reading line by line would be too slow. – Blackhawk Jan 10 '14 at 17:52
1

Try ADODB.Stream. Here are a couple of links: MSDN and W3

Dmitry
  • 421
  • 4
  • 14