0

I have been working on a project that involves importing massive amounts of data, and the way I have been doing things is just too slow. I have two column .csv files, and I have been importing them one-by-one onto a new sheet, and then copying the cells into an array using a for loop.

I would like to directly import the .csv into an array without operating on worksheet cells (they are so slow!).

I have seen a previous solution that involves copying a .csv into a two-dimensional array, but I have been just copy and pasting their code, and then writing my own code to turn it into individual arrays. Very messy, and very difficult to debug.

Load csv file into VBA array rather than Excel sheet

Does anyone have a simpler function handy? Something of the format below:

     Dim myXArray(1 TO 10000) as Double
     Dim myYArray(1 TO 10000) as Double

    'myArray = ImportCSV(path, column)
     myXArray = ImportCSV("C:\Users\Desktop\file.csv", 1)
     myYArray = ImportCSV("C:\Users\Desktop\file.csv", 2)

My feeling is that someone has already done this, and by asking, I won't have to reinvent the wheel. It seems like it would be such a universally useful function.

Thanks for the help!

Michael

Community
  • 1
  • 1
Michael Molter
  • 1,296
  • 2
  • 14
  • 37
  • 1
    What is messy about converting to single-dim arrays? I don't think a method that reads a single column from a CSV is going to be any cleaner. It would also require you to read the file twice. Maybe you can share the code you're using to get to 1-dim arrays if you don't like that step. – Byron Wall Apr 12 '15 at 21:33

1 Answers1

1

Regarding performance remember EXCEL is a SPREADSHEET, not a database.

It allows non programmers to do programming type things. Mixing code and the excel program itself is rarely elegent.

All this is vbscript but vbscript works in VBA but can be made faster by not using createobject.

It depends on how you are processing the data. If it's linear then this is easy

Set fso = CreateObject("Scripting.FileSystemObject")
Set srcfile = fso.GetFile(objArgs(0))
If err.number = 0 then Set TS = srcFile.OpenAsTextStream(1, 0)
Src()=Split(Repace(ts.readall, VBCR, ""), VBLF)
For Each Line in Src()
     Field() = Split(Line, ",")
     msgbox Field(0) & Field(1)
Next

But that uses memory if your data is large, this only reads a line at a time.

Do Until srcfile.AtEndOfStream
    Line=srcfile.readline
     Field() = Split(Line, ",")
     msgbox Field(0) & Field(1)
Loop

More powerfull than arrays are Dictionaries and Recordsets. Recordsets can be made in memory, but as CSV are database files ADO can read them and return than as a recordset (which you can save). Here's a sample from MS's web site on how to use ADO to query and get one and another by me on how to make your own.

https://msdn.microsoft.com/en-us/library/ms974559.aspx?f=255&MSPPError=-2147217396

On Error Resume Next
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001

Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

strPathtoTextFile = "C:\Databases\"

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
          "Data Source=" & strPathtoTextFile & ";" & _
          "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT * FROM PhoneList.csv", _
          objConnection, adOpenStatic, adLockOptimistic, adCmdText

Do Until objRecordset.EOF
    Wscript.Echo "Name: " & objRecordset.Fields.Item("Name")
    Wscript.Echo "Department: " & _
        objRecordset.Fields.Item("Department")
    Wscript.Echo "Extension: " & objRecordset.Fields.Item("Extension")   
    objRecordset.MoveNext
Loop

Make your own

Sub Swap
    Dim LineCount
    Set rs = CreateObject("ADODB.Recordset")
    With rs
        .Fields.Append "LineNumber", 4 
        .Fields.Append "Txt", 201, 5000 
        .Open
        LineCount = 0
        Do Until Inp.AtEndOfStream
            LineCount = LineCount + 1
            .AddNew
            .Fields("LineNumber").value = LineCount
            .Fields("Txt").value = Inp.readline
            .UpDate
        Loop
        .Sort = "LineNumber DESC"
        Do While not .EOF
            Outp.writeline .Fields("Txt").Value
            .MoveNext
        Loop
    End With
End Sub
Serenity
  • 311
  • 2
  • 3
  • While I am painfully aware of the limitations of Excel VBA, I have to use it in my current application. I implemented the second solution this afternoon, and it worked perfectly. I use similar StreamReader type code in C# all the time, and was glad to realize it was just as simple in VBA. Thanks! – Michael Molter Apr 13 '15 at 18:46