I'd like to get fresh advice on the infamous issue caused by the Byte Order Mark ( or EF BB BF in hex) when trying to read UTF-8 encoded CSVs with VBA (Excel). Note that I'd like to avoid opening the CSV with Workbooks.Open or FileSystemObject. Actually, I'd rather use an adodb.RecordSet as I need to execute some kind of SQL queries.
After having read many (many!) things, I believe the 4 best solutions to deal with this specific issue are:
- Removing the BOM before reading the CSV with ADODB.Connection / ADODB.RecordSet (for instance, via #iFile or Scripting.FileSystemObject-OpenAsTextStream to efficiently read the first line of the file and remove the BOM).
- Creating a schema.ini file so that ADO properly parses the CSV.
- Using some modules created by wizards (like W. Garcia's class module).
- Using an ADODB.Stream and setting Charset = "UTF-8".
The last solution (using a stream) seems quite fine but doing the following returns a string:
Sub loadCsv()
Const adModeReadWrite As Integer = 3
With CreateObject("ADODB.Stream")
.Charset = "utf-8"
.Mode = adModeReadWrite
.Open
.LoadFromFile ("C:\atestpath\test.csv")
Debug.Print .readtext
End With
End Sub
Do you know any trick that could help using the string returned by .readtext as the Data Source of an ADODB.RecordSet or ADODB.Connection (apart from looping to manually populate the fields of my recordset)?