I have started to use the fso object in order to overcome the 2GB limit of VBA. Everything looks satisfactory for my purposes, except that I can not find a way to go backwards in the textstream files. For going forward I have used read(no of chars) and skip(no of chars). Is there a way to do so ?
Asked
Active
Viewed 2,502 times
3
-
FSO is a forward only. The best thing that you can do is to store the data in array. – Pankaj Jaju Jan 02 '14 at 17:27
-
@PankajJaju. There is no way to store some GB's of data in arrays. We are speaking for millons of strings of at least 220 chars in length. – Demetres Jan 02 '14 at 17:30
-
To the best of my knowledge, FSO would not allow you to `SEEK` records backwards. As soon as the line is gone, its gone for good. You would have to reopen the file to go back (restart reading the file) unless you store the required data somewhere else. Array was just a suggestion. – Pankaj Jaju Jan 02 '14 at 17:36
2 Answers
2
I ran into the same frustrating limitation. Here is a class that wraps the native Windows API to perform File IO. As noted, it is based on the example on msdn at http://support.microsoft.com/kb/189981. I haven't finished testing it thoroughly, so if you find any issues, let me know so I can fix them for both our benefit. As a side note, the CanRead, CanWrite stuff is there so I can eventually implement a stream interface, but that's a future project.
Option Compare Database
Option Explicit
'Based on the example on msdn:
'http://support.microsoft.com/kb/189981
'Some of the constants come from Winnt.h
Public Enum FileAccess
' FILE_READ_DATA = &H1 ' winnt.h:1801
' 'FILE_LIST_DIRECTORY = &H1 ' winnt.h:1802
' FILE_WRITE_DATA = &H2 ' winnt.h:1804
' 'FILE_ADD_FILE = &H2 ' winnt.h:1805
' FILE_APPEND_DATA = &H4 ' winnt.h:1807
' 'FILE_ADD_SUBDIRECTORY = &H4 ' winnt.h:1808
' 'FILE_CREATE_PIPE_INSTANCE = &H4 ' winnt.h:1809
' FILE_READ_EA = &H8 ' winnt.h:1811
' FILE_READ_PROPERTIES = &H8 ' winnt.h:1812
' FILE_WRITE_EA = &H10 ' winnt.h:1814
' FILE_WRITE_PROPERTIES = &H10 ' winnt.h:1815
' FILE_EXECUTE = &H20 ' winnt.h:1817
' 'FILE_TRAVERSE = &H20 ' winnt.h:1818
' 'FILE_DELETE_CHILD = &H40 ' winnt.h:1820
' FILE_READ_ATTRIBUTES = &H80 ' winnt.h:1822
' FILE_WRITE_ATTRIBUTES = &H100 ' winnt.h:1824
FILE_ALL_ACCESS = &H1F01FF ' winnt.h:1826
FILE_GENERIC_READ = &H120089 ' winnt.h:1828
FILE_GENERIC_WRITE = &H120116 ' winnt.h:1835
' FILE_GENERIC_EXECUTE = &H1200A0 ' winnt.h:1843
' FILE_SHARE_READ = &H1 ' winnt.h:1848
' FILE_SHARE_WRITE = &H2 ' winnt.h:1849
' FILE_NOTIFY_CHANGE_FILE_NAME = &H1 ' winnt.h:1860
' FILE_NOTIFY_CHANGE_DIR_NAME = &H2 ' winnt.h:1861
' FILE_NOTIFY_CHANGE_ATTRIBUTES = &H4 ' winnt.h:1862
' FILE_NOTIFY_CHANGE_SIZE = &H8 ' winnt.h:1863
' FILE_NOTIFY_CHANGE_LAST_WRITE = &H10 ' winnt.h:1864
' FILE_NOTIFY_CHANGE_SECURITY = &H100 ' winnt.h:1865
' 'MAILSLOT_NO_MESSAGE = -1 ' winnt.h:1866
' 'MAILSLOT_WAIT_FOREVER = -1 ' winnt.h:1867
' FILE_CASE_SENSITIVE_SEARCH = &H1 ' winnt.h:1868
' FILE_CASE_PRESERVED_NAMES = &H2 ' winnt.h:1869
' FILE_UNICODE_ON_DISK = &H4 ' winnt.h:1870
' FILE_PERSISTENT_ACLS = &H8 ' winnt.h:1871
' FILE_FILE_COMPRESSION = &H10 ' winnt.h:1872
' FILE_VOLUME_IS_COMPRESSED = &H8000 ' winnt.h:1873
' IO_COMPLETION_MODIFY_STATE = &H2 ' winnt.h:1874
' IO_COMPLETION_ALL_ACCESS = &H1F0003 ' winnt.h:1875
' DUPLICATE_CLOSE_SOURCE = &H1 ' winnt.h:1876
' DUPLICATE_SAME_ACCESS = &H2 ' winnt.h:1877
' DELETE = &H10000 ' winnt.h:1935
' READ_CONTROL = &H20000 ' winnt.h:1936
' WRITE_DAC = &H40000 ' winnt.h:1937
' WRITE_OWNER = &H80000 ' winnt.h:1938
' SYNCHRONIZE = &H100000 ' winnt.h:1939
' STANDARD_RIGHTS_REQUIRED = &HF0000 ' winnt.h:1941
' STANDARD_RIGHTS_READ = &H20000 ' winnt.h:1943
' STANDARD_RIGHTS_WRITE = &H20000 ' winnt.h:1944
' STANDARD_RIGHTS_EXECUTE = &H20000 ' winnt.h:1945
' STANDARD_RIGHTS_ALL = &H1F0000 ' winnt.h:1947
' SPECIFIC_RIGHTS_ALL = &HFFFF ' winnt.h:1949
' ACCESS_SYSTEM_SECURITY = &H1000000
End Enum
Public Enum FileShare
NONE = &H0
FILE_SHARE_DELETE = &H4
FILE_SHARE_READ = &H1
FILE_SHARE_WRITE = &H2
End Enum
Public Enum FileCreationDisposition
CREATE_ALWAYS = &H2
CREATE_NEW = &H1
OPEN_ALWAYS = &H4
OPEN_EXISTING = &H3
TRUNCATE_EXISTING = &H5
End Enum
'Public Enum FileFlagsAndAttributes
' 'Attributes
' FILE_ATTRIBUTE_ENCRYPTED = &H4000
' FILE_ATTRIBUTE_READONLY = &H1 ' winnt.h:1850
' FILE_ATTRIBUTE_HIDDEN = &H2 ' winnt.h:1851
' FILE_ATTRIBUTE_SYSTEM = &H4 ' winnt.h:1852
' FILE_ATTRIBUTE_DIRECTORY = &H10 ' winnt.h:1853
' FILE_ATTRIBUTE_ARCHIVE = &H20 ' winnt.h:1854
' FILE_ATTRIBUTE_NORMAL = &H80 ' winnt.h:1855
' FILE_ATTRIBUTE_TEMPORARY = &H100 ' winnt.h:1856
' FILE_ATTRIBUTE_ATOMIC_WRITE = &H200 ' winnt.h:1857
' FILE_ATTRIBUTE_XACTION_WRITE = &H400 ' winnt.h:1858
' FILE_ATTRIBUTE_COMPRESSED = &H800 ' winnt.h:1859
' 'Flags
' FILE_FLAG_BACKUP_SEMANTICS = &H2000000
' FILE_FLAG_DELETE_ON_CLOSE = &H4000000
' FILE_FLAG_NO_BUFFERING = &H20000000
' FILE_FLAG_OPEN_NO_RECALL = &H100000
' FILE_FLAG_OPEN_REPARSE_POINT = &H200000
' FILE_FLAG_OVERLAPPED = &H40000000
' FILE_FLAG_POSIX_SEMANTICS = &H100000
'End Enum
Private Const INVALID_FILE_HANDLE = -1 '&HFFFFFFFF
Private Const FORMAT_MESSAGE_FROM_SYSTEM = &H1000
Private Const INVALID_FILE_SIZE As Long = -1 '&HFFFFFFFF
Private Const INVALID_SET_FILE_POINTER As Long = -1 '&HFFFFFFFF
Private Declare Function FormatMessage Lib "Kernel32" Alias "FormatMessageA" (ByVal dwFlags As Long, _
lpSource As Long, _
ByVal dwMessageId As Long, _
ByVal dwLanguageId As Long, _
ByVal lpBuffer As String, _
ByVal nSize As Long, _
Arguments As Any) As Long
Private Declare Function CreateFile Lib "Kernel32" Alias "CreateFileA" (ByVal lpFileName As String, _
ByVal dwDesiredAccess As Long, _
ByVal dwShareMode As Long, _
lpSecurityAttributes As Long, _
ByVal dwCreationDisposition As Long, _
ByVal dwFlagsAndAttributes As Long, _
hTemplateFile As Long) As Long
Private Declare Function SetFilePointer Lib "Kernel32" (ByVal hFile As Long, _
ByVal lDistanceToMove As Long, _
lpDistanceToMoveHigh As Long, _
ByVal dwMoveMethod As Long) As Long
Private Declare Function ReadFile Lib "Kernel32" (ByVal hFile As Long, _
lpBuffer As Any, _
ByVal nNumberOfBytesToRead As Long, _
lpNumberOfBytesRead As Long, _
ByVal lpOverlapped As Long) As Long
Private Declare Function WriteFile Lib "Kernel32" (ByVal hFile As Long, _
lpBuffer As Any, _
ByVal nNumberOfBytesToWrite As Long, _
lpNumberOfBytesWritten As Long, _
ByVal lpOverlapped As Long) As Long
Private Declare Function FlushFileBuffers Lib "Kernel32" (ByVal hFile As Long) As Long
Private Declare Function GetFileSize Lib "Kernel32" (ByVal hFile As Long, _
lpFileSizeHigh As Long) As Long
Private Declare Function CloseHandle Lib "Kernel32" (ByVal hObject As Long) As Long
Private m_Handle As Long
Private Sub Class_Terminate()
If Not m_Handle = 0 Then
Flush
CloseFile
End If
End Sub
Public Sub OpenFile(path As String, Optional access As FileAccess = FileAccess.FILE_GENERIC_READ, Optional share As FileShare = FileShare.NONE, Optional CreationDisposition As FileCreationDisposition = FileCreationDisposition.OPEN_ALWAYS)
Dim Ret As Long
Ret = CreateFile(path, access, share, ByVal 0&, CreationDisposition, 0&, ByVal 0&)
If Ret = INVALID_FILE_HANDLE Then
Err.Raise vbObjectError + Err.LastDllError, "clsFile.OpenFile", DecodeAPIErrors(Err.LastDllError)
Else
m_Handle = Ret
End If
End Sub
'Properties
Public Property Get Length() As Double
Dim Ret As Currency
Dim FileSizeHigh As Long
Ret = GetFileSize(m_Handle, FileSizeHigh)
If Not Ret = INVALID_FILE_SIZE Then
Length = Ret
Else
Err.Raise vbObjectError + Err.LastDllError, "clsFile.Length", DecodeAPIErrors(Err.LastDllError)
End If
End Property
Public Property Get Position() As Long
Dim Ret As Long
Dim DistanceToMoveHigh As Long
Ret = SetFilePointer(m_Handle, 0&, DistanceToMoveHigh, 1&) '1 is FILE_CURRENT
If DistanceToMoveHigh = 0 Then
If Ret = -1 Then
Position = -1 'EOF'
Else
Position = Ret
End If
Else
Class_Terminate
Err.Raise vbObjectError + Err.LastDllError, "clsFile.Position", DecodeAPIErrors(Err.LastDllError)
End If
End Property
Public Property Get Handle() As Long
Handle = m_Handle
End Property
'Functions
Public Function ReadBytes(ByRef buffer() As Byte, ByVal buffer_offset As Long, ByVal count As Long) As Long
Dim Ret As Long
Dim BytesRead As Long
Ret = ReadFile(m_Handle, buffer(buffer_offset), count, BytesRead, 0&)
If Ret = 1 Then
ReadBytes = BytesRead
Else
Class_Terminate
Err.Raise vbObjectError + Err.LastDllError, "clsFile.ReadBytes", DecodeAPIErrors(Err.LastDllError)
End If
End Function
Public Function ReadBytesPtr(ByVal ptrBuf As Long, ByVal buffer_offset As Long, ByVal count As Long) As Long
Dim Ret As Long
Dim BytesRead As Long
Ret = ReadFile(m_Handle, ByVal ptrBuf + buffer_offset, count, BytesRead, 0&)
If Ret = 1 Then
ReadBytesPtr = BytesRead
Else
Class_Terminate
Err.Raise vbObjectError + Err.LastDllError, "clsFileStream.ReadBytesPtr", DecodeAPIErrors(Err.LastDllError)
End If
End Function
Public Function WriteBytes(ByRef buffer() As Byte, ByVal buffer_offset As Long, ByVal count As Long) As Long
Dim Ret As Long
Dim BytesWritten As Long
Ret = WriteFile(m_Handle, buffer(buffer_offset), count, BytesWritten, 0&)
If Ret = 1 Then
WriteBytes = BytesWritten
Else
Class_Terminate
Err.Raise vbObjectError + Err.LastDllError, "clsFile.WriteBytes", DecodeAPIErrors(Err.LastDllError)
End If
End Function
Public Function WriteBytesPtr(ByVal ptrBuf As Long, ByVal buffer_offset As Long, ByVal count As Long) As Long
Dim Ret As Long
Dim BytesWritten As Long
Ret = WriteFile(m_Handle, ByVal ptrBuf + buffer_offset, count, BytesWritten, 0&)
If Ret = 1 Then
WriteBytesPtr = BytesWritten
Else
Class_Terminate
Err.Raise vbObjectError + Err.LastDllError, "clsFile.WriteBytes", DecodeAPIErrors(Err.LastDllError)
End If
End Function
Public Function SeekFile(ByVal LoBytesOffset As Long, origin As SeekOrigin) As Long
Dim Ret As Long
Dim HiBytesOffset As Long
Ret = SetFilePointer(m_Handle, LoBytesOffset, HiBytesOffset, origin)
If Not Ret = INVALID_SET_FILE_POINTER Then
SeekFile = Ret
Else
Err.Raise vbObjectError + Err.LastDllError, "clsFile.SeekFile", DecodeAPIErrors(Err.LastDllError)
End If
End Function
Public Function SeekFile64bit(ByVal offset As Currency, origin As SeekOrigin) As Currency
'Take care with this function. A Currency variable is an 8-byte (64-bit) scaled (by 10,000) fixed-point number.'
'This means that setting a Currency variable to 0.0001 is the equivalent of a binary value of 1.'
'If you want to set an offset with an immediate value, write it like so:'
'1073741824 Bytes (1 GB) would be 107374.1824@, where @ is the symbol for an immediate Currency value.'
'Refer to http://support.microsoft.com/kb/189862 for hints on how to do 64-bit arithmetic'
Dim Ret As Long
Dim curFilePosition As Currency
Dim LoBytesOffset As Long, HiBytesOffset As Long
CopyMemory VarPtr(HiBytesOffset), VarPtr(offset) + 4, 4
CopyMemory VarPtr(LoBytesOffset), VarPtr(offset), 4
Ret = SetFilePointer(m_Handle, LoBytesOffset, HiBytesOffset, origin)
CopyMemory VarPtr(curFilePosition) + 4, VarPtr(HiBytesOffset), 4
CopyMemory VarPtr(curFilePosition), VarPtr(Ret), 4
SeekFileCurrency = curFilePosition
End Function
Public Sub CloseFile()
Dim Ret As Long
Ret = CloseHandle(m_Handle)
m_Handle = 0
End Sub
Public Sub Flush()
Dim Ret As Long
Ret = FlushFileBuffers(m_Handle)
End Sub
'***********************************************************************************
' Helper function, from Microsoft page as noted at top
Private Function DecodeAPIErrors(ByVal ErrorCode As Long) As String
Dim sMessage As String, MessageLength As Long
sMessage = Space$(256)
MessageLength = FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM, 0&, _
ErrorCode, 0&, sMessage, 256&, 0&)
If MessageLength > 0 Then
DecodeAPIErrors = Left(sMessage, MessageLength)
Else
DecodeAPIErrors = "Unknown Error."
End If
End Function
And here's an example of how to use it:
Public Sub Main()
Dim oFile As clsFile
Set oFile = New clsFile
oFile.OpenFile "C:\YourFilePathHere", FILE_GENERIC_READ, NONE, OPEN_EXISTING
Dim ChunkOfData() As Byte
Const CHUNKSIZE As Long = 4096
ReDim ChunkOfData(0 To CHUNKSIZE - 1)
Dim lngCurrChunk As Long
Dim lngBytesRead As Double
'The SeekFile function works for seeks forward or backward in the file from [-2GB to +2GB).'
'Past that you can use the SeekFile64bit function, but you'll have to be aware of the issues with using Currency to store the 64-bit number'
Debug.Print oFile.SeekFile(&H40000000, so_Current) 'A 1GB seek
lngBytesRead = oFile.ReadBytes(ChunkOfData, 0, CHUNKSIZE)
While lngBytesRead > 0 'As soon as a call to ReadBytes returns 0, we've reached the end of the file.
'Do something with the 4k chunk of data. The buffer gets reused in this example.
'Debug.Print ChunkOfData
lngCurrChunk = lngCurrChunk + 1
lngBytesRead = oFile.ReadBytes(ChunkOfData, 0, CHUNKSIZE)
Wend
MsgBox "Complete!"
End Sub

Blackhawk
- 5,984
- 4
- 27
- 56
-
can you supply with an example of the usage of the above ? e.g. Open, position, read/write – Demetres Jan 02 '14 at 19:34
-
@Demetres I'm cleaning it up and adding some examples. Question though, how big of a file are we talking? The SetFilePointer function (lets you move within the file) can accept two 32-bit parameters (high and lo) that together represent a 64-bit address, but that complicates the class a bit since I would have to accept a double or something and then convert to two Longs. 32-bits can address ~4GB of space (2^32 bytes) - is that enough? By comparison, 64-bits can address ~16 Exabytes (2^64 bytes). – Blackhawk Jan 02 '14 at 21:10
-
@Demetres In fact, if you could provide more details on what you are trying to do (large text file? Will you be parsing it? Binary data? Fixed-length records or delimited?) I can make a clearer example. – Blackhawk Jan 02 '14 at 21:23
-
Really many thanks for trying to help. The file can be more than 4GB (9,6 GB). It consists of 0-9, A-F chars and I would like to search it. I have implemented it right now as fixed-length "random" file and I do binary search on it. In order to have it sorted, I do convert it to a delimited one, sort it with windows sort.exe and back again to fixed-length. – Demetres Jan 02 '14 at 21:39
-
@Demetres I tested the code reading through a 12GB file, so it's working fine. There are a couple things you need to be aware of. Seeking is done with either the SeekFile or SeekFile64bit function. In both cases, the number of bytes is signed 2's complement. The first is 32bit with a limit of [-2GB, +2GB) and the 64bit has a limit of something quite rediculous you'll never encounter. For the second, the 64bit value is passed as a Currency, which is a scaled fixed-point number. If you want to display it, you can convert it to a string like so: `Replace(Format(currency, "0.000"), ".", "")` – Blackhawk Jan 03 '14 at 17:05
-
-
@PankajJaju Yeah, win32 api stuff always is :) If you are actually using this for a project, I can add a "ReadLine" function and an "EOF" property in there if that would be useful. For my purposes I did not because I'm working with files large enough that reading line by line would be too slow. – Blackhawk Jan 10 '14 at 17:52