
I have an MS Access database in the 2002-2003 format (therefore .mdb). I export several fields from it into a temporary table in order to display them on a website. One field contains several comma-separated entries, so I split those up so that each record ends up with only one entry per field.

Imagine a German-English dictionary with the following record:

 en | de
 building | Gebäude,Bauwerk

I want to split it as follows:

 en | de
 building | Gebäude
 building | Bauwerk

The VBA function I am using used to work fine. The database has around 100,000 records. Splitting used to take around 30 minutes. Now it takes the whole day.

This is the function:

    Public Sub commasplitfield4()
        Dim rstObj As DAO.Recordset, dbObj As DAO.Database
        Dim InsertSQL As String
        Dim i As Long
        Set dbObj = CurrentDb()
        Set rstObj = dbObj.OpenRecordset("qry-export")
        DoCmd.SetWarnings False
        Do While Not rstObj.EOF
            Dim memArr() As String
            memArr = Split(rstObj.Fields("field4"), ",")
            For i = 0 To UBound(memArr)
                InsertSQL = "INSERT INTO exporttemp(field1,field2,field3,field4) VALUES (""" & rstObj.Fields("field1") _
                    & """, """ & rstObj.Fields("field2") _
                    & """, """ & rstObj.Fields("field3") & """, """ & memArr(i) & """)"
                DoCmd.RunSQL (InsertSQL)
            Next
            rstObj.MoveNext
        Loop
        DoCmd.SetWarnings True
    End Sub

I cannot say exactly when it started to take so long, but I can say that changing from Windows 7 to Windows 10 made no difference. I have been on Windows 10 for a long time and the function still ran fine. Moving from Access 2007 to 2010 and then to 2019 also made no difference, at least not immediately.

To narrow down where the problem might lie, I went through the following checklist:

  • I compacted the database before starting the function
  • I tried starting Access in Windows 7 compatibility mode
  • I removed unused fields
  • I ran the performance analyser and made the changes it proposed (in two fields I changed the data type)
  • I split the database into a backend containing only the tables and a frontend containing the queries and modules
  • I exported the content of the backend into a text file and re-imported it into a newly created backend
  • I stopped the antivirus while running the function (although the antivirus used very little processor capacity)

None of that made a notable difference.

Any idea?

Martijn Pieters
user9
  • Perhaps you could narrow down the issue by [stepping through your code](http://www.cpearson.com/excel/DebuggingVBA.aspx) and take note of which part of your loop is adding the extra time. – ashleedawg Jul 03 '20 at 08:27
  • Are you running this from the front-end (i.e. across the network) or are you running this directly in the back-end? You may also want to set a variable equal to `UBound(memArr)` and use that in the loop rather than calculating this value each time you loop. – Applecore Jul 03 '20 at 08:30
  • I am running the function from the frontend. But before, it was all one file and it made no difference. I don't know how to set a variable for `UBound(memArr)` to avoid recalculating it each time. – user9 Jul 03 '20 at 09:00
  • Just declare a variable, and then `lngLimit=UBound(memArr) | For i=0 to lngLimit`. – Applecore Jul 03 '20 at 09:04
  • Considering that it worked before and I, therefore, did not change anything in the code, couldn't there be an update of Windows 10 or something that affected the code? Or maybe there is something in the code that is deprecated and should be solved differently. – user9 Jul 03 '20 at 12:12
  • I just tested it with a smaller database replacing `Do While Not rstObj.EOF` `Dim memArr() As String` `memArr = Split(rstObj.Fields("field4"), ",")` `For i = 0 To UBound(memArr)` with `Do While Not rstObj.EOF` `Dim memArr() As String` `Dim lngLimit As String` `memArr = Split(rstObj.Fields("field4"), ",")` `lngLimit = UBound(memArr)` `For i = 0 To lngLimit`. I couldn't notice any difference. I think Access is able to handle that internally. – user9 Jul 03 '20 at 12:14
  • The main issue is going to be running across the network anyway. – Applecore Jul 03 '20 at 12:40
  • I have both, backend and frontend, locally, just as if it was one file. – user9 Jul 03 '20 at 13:01
  • What happens if you remove `SetWarnings False` and use `dbObj.Execute InsertSQL, dbFailOnError` instead of `DoCmd.RunSQL (InsertSQL)`? Does Access provide any clues which could help explain the slowness? – HansUp Jul 03 '20 at 15:37
  • This may be one situation where multi-value field might be a benefit. Or instead of MVF or CSV, build a normal related dependent table. – June7 Jul 03 '20 at 17:08
  • Have a look [here](https://stackoverflow.com/a/33025620/3820271). *I just did a quick test and 2000 SQL INSERTs took 24 seconds while 2000 DAO.Recordset AddNew inserts into the same table took less than 0.2 seconds. – Gord Thompson* – Andre Jul 05 '20 at 08:46
  • André, thanks for the contribution. It is true that inserting values in a loop is slower than a single SQL INSERT. However, I am splitting the field, so I need the loop in order to insert the `memArr(i)` values one by one. Or did I misunderstand something? – user9 Jul 06 '20 at 13:52
  • No, I meant specifically the answer I linked to. Instead of SQL INSERT, try DAO .AddNew, in a loop. – Andre Jul 07 '20 at 12:04
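A comment cannot hold much code, so for reference, the `.AddNew` approach Andre links to would look roughly like this (a hedged sketch only, untested, reusing the `qry-export` and `exporttemp` names from the question):

    Public Sub commasplitfield4_AddNew()
        Dim dbObj As DAO.Database
        Dim rstIn As DAO.Recordset, rstOut As DAO.Recordset
        Dim memArr() As String
        Dim i As Long
        Set dbObj = CurrentDb()
        Set rstIn = dbObj.OpenRecordset("qry-export")
        Set rstOut = dbObj.OpenRecordset("exporttemp", dbOpenTable)
        Do While Not rstIn.EOF
            memArr = Split(rstIn.Fields("field4"), ",")
            For i = 0 To UBound(memArr)
                rstOut.AddNew
                rstOut!field1 = rstIn!field1
                rstOut!field2 = rstIn!field2
                rstOut!field3 = rstIn!field3
                rstOut!field4 = memArr(i)
                rstOut.Update   'no SQL statement is parsed per row, hence the speed-up
            Next
            rstIn.MoveNext
        Loop
        rstOut.Close
        rstIn.Close
    End Sub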

2 Answers


By far the best answer was the one from HansUp: instead of a whole day, it now takes a couple of minutes. I cannot even thank HansUp properly, because he put the solution in a side comment.

Surprisingly, there was actually very little to change in the code. The solution was to modify it as follows:

    Public Sub commasplitfield4()
        Dim rstObj As DAO.Recordset, dbObj As DAO.Database
        Dim InsertSQL As String
        Dim i As Long
        Set dbObj = CurrentDb()
        Set rstObj = dbObj.OpenRecordset("qry-export")
        'DoCmd.SetWarnings False   'no longer needed without DoCmd.RunSQL
        Do While Not rstObj.EOF
            Dim memArr() As String
            memArr = Split(rstObj.Fields("field4"), ",")
            For i = 0 To UBound(memArr)
                InsertSQL = "INSERT INTO exporttemp(field1,field2,field3,field4) VALUES (""" & rstObj.Fields("field1") _
                    & """, """ & rstObj.Fields("field2") _
                    & """, """ & rstObj.Fields("field3") & """, """ & memArr(i) & """)"
                'DoCmd.RunSQL (InsertSQL)
                dbObj.Execute InsertSQL, dbFailOnError 'this line made the difference
            Next
            rstObj.MoveNext
        Loop
        'DoCmd.SetWarnings True
    End Sub
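If it ever needs to be faster still, a common further optimisation (not part of HansUp's fix, just a hedged sketch) is to wrap the whole insert loop in a DAO workspace transaction so the Jet engine commits all rows in one batch instead of once per `Execute`:

    Dim ws As DAO.Workspace
    Set ws = DBEngine.Workspaces(0)
    ws.BeginTrans
    '... the Do While / dbObj.Execute loop from above ...
    ws.CommitTrans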
user9

I can't explain the exact cause of your problem, but I suspect most of the time is spent looping through the recordset and through the array.

Splitting the comma-separated values seems to be faster using VBA in Excel. The test used 1,000,000 source records, each with two comma-separated entries in the split field, so the converted output was 2,000,000 records.

  1. Import the original data from Access into Excel (Sheets(1)). (Example: table2) ~~> 0.7617188 seconds
  2. Convert the data in Sheets(1) by splitting it at the commas. --> Sheets(2) ~~> 21.58594 seconds
  3. Load the data from Sheets(2) back into Access. ~~> 5 minutes

  1. Import the original data from Access

Sub exeSQLgetdata()

    Dim Rs As ADODB.Recordset
    Dim strConn As String
    Dim strSQL As String
    Dim i As Integer
    Dim Fn As String
    Dim Ws As Worksheet
    
    Dim st, et
    
    st = Timer
    
    Set Ws = Sheets(1)
    
    Fn = ThisWorkbook.Path & "\" & "Database9.accdb" '<~~ your database path & name
    
    strConn = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
        "Data Source=" & Fn & ";"
 
    Set Rs = New ADODB.Recordset 'early binding, matching the Dim above
    strSQL = "Select field1,field2, field3, field4  from [table2]" '<~~ your raw data table
    
    Rs.Open strSQL, strConn
    
    If Not Rs.EOF Then
        With Ws
            For i = 0 To Rs.Fields.Count - 1
                .Cells(1, i + 1) = Rs.Fields(i).Name
            Next i
            .UsedRange.Offset(1).Clear
            .Range("a2").CopyFromRecordset Rs
        End With
    End If
    Rs.Close
    Set Rs = Nothing
    et = Timer
    
    Debug.Print "Get Data time : " & et - st & " seconds" '<~~ get data from access database
End Sub

  2. Convert the data in Sheets(1) by splitting it at the commas

Sub splitData()
    Dim vR(1 To 2000000, 1 To 4) 'must hold the converted rows (two per source record in this test)
    Dim vDB As Variant
    Dim i As Long, n As Long
    Dim k As Variant, v As Variant
    Dim Ws As Worksheet
    Dim toWs As Worksheet
    Dim st, et
    
    st = Timer
    
    Set Ws = Sheets(1)
    Set toWs = Sheets(2)
    
    vDB = Ws.Range("a1").CurrentRegion
    
    For i = 1 To UBound(vDB, 1)
        k = Split(vDB(i, 4), ",")
        For Each v In k
            n = n + 1
            vR(n, 1) = vDB(i, 1)
            vR(n, 2) = vDB(i, 2)
            vR(n, 3) = vDB(i, 3)
            vR(n, 4) = v
        Next v
        DoEvents
    Next i

    With toWs
        .UsedRange.Clear
        .Range("a1").Resize(UBound(vR, 1), UBound(vR, 2)) = vR
    End With
    et = Timer
    Debug.Print "Split time : " & et - st & " seconds"
End Sub
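Step 3 (loading Sheets(2) back into Access) can be done from the Access side with `DoCmd.TransferSpreadsheet`; a sketch only, where the workbook name `converted.xlsx` and the target table `exporttemp` are placeholders for your own names:

    Sub importConvertedSheet()
        'acSpreadsheetTypeExcel12Xml = .xlsx; True = first row contains field names
        DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12Xml, _
            "exporttemp", CurrentProject.Path & "\converted.xlsx", True, "Sheet2!"
    End Sub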
Dy.Lee
  • Thanks for the contribution. You are probably right that it would be quicker. However, I need to keep the data in Access because I use the comma-free database for further queries. Therefore, even with an Excel VBA function for this, it would be too much manual work. – user9 Jul 06 '20 at 12:34
  • @user9, after converting the data in Excel, you can import the converted sheet back into Access, so the result is saved in Access. – Dy.Lee Jul 06 '20 at 14:31