1

I need help with an massive loop through a continously expanding Access database consisting of approximately 280.000 rows of data. The procedure adds 3000 rows of data every week, and the macros running time is therefore only increasing. It takes around one hour to complete.

What is the optimal way to complete my procedure? I'm experienced with VBA, but SQL knowledge is limited.

The issue summarized is that the If-statement, located in "Help needed here" runs through 280.000 rows of data for 3000 companies.

The goal is that the fresh weekly scores of the company will be scored in JQHistory, but it has to take the date of running the macro into consideration

Note: Everything outside of "Help needed here", I've optimized in another macro. I've left it to hopefully improve the context of the issue.

Here is the non-optimized macro:

Sub OpdaterKvant()
Dim wb As Workbook
Dim ws As Worksheet
Dim DatoIn As Date
Set db = New ADODB.Connection

Set DbEQ = New ADODB.Connection

'The location of the database is determined outside the macro'
strConn = ConnectionString
db.Open strConn

Set wb = Workbooks.Open("My File Location")
Set ws = wb.Worksheets(1)

n = ws.UsedRange.Rows.Count

DateIn = Right(ws.Cells(1, 1), 2) & "-" & Mid(ws.Cells(1, 1), 5, 2) & "-" & Left(ws.Cells(1, 1), 4)

Dato = Format(DateIn, "mm-dd-yyyy")

db.Execute ("DELETE * FROM JQScores")

For i = 3 To n
    Sedol = Replace(ws.Cells(i, 1), " ", "")
    Company = Left(Replace(ws.Cells(i, 2), "'", ""), Len(Replace(ws.Cells(i, 2), "'", "")) - 1)
    Country = Replace(ws.Cells(i, 3), " ", "")
    Region = Replace(ws.Cells(i, 4), " ", "")
    Sector = Replace(ws.Cells(i, 5), " ", "")
    MarketCap = Replace(Replace(ws.Cells(i, 6), " ", ""), ",", ".")
    JQRank = Replace(ws.Cells(i, 7), " ", "")
    ValueRank = Replace(ws.Cells(i, 8), " ", "")
    QualityRank = Replace(ws.Cells(i, 9), " ", "")
    MomentumRank = Replace(ws.Cells(i, 10), " ", "")
    JQScore = Replace(Replace(ws.Cells(i, 11), " ", ""), ",", ".")

    'Inserts the information into the Access database.'
    Sql = "Insert into JQScores (Sedol, Company, Region, Sector, MarketCapUSD, JQ_Rank, Value_Rank, Quality_Rank, Momentum_Rank, JQ_Score, Country) VALUES ('" & Sedol & "','" & Company & "', '" & Region & "', '" & Sector & "', " & MarketCap & ", '" & JQRank & "', '" & ValueRank & "', '" & QualityRank & "', '" & MomentumRank & "', " & JQScore & ", '" & Country & "')"
    db.Execute (Sql)

'*** HELP NEEDED IN THIS SECTION'

    If db.Execute("Select Count(Id) as NumId from JQHistory where Sedol='" & Sedol & "' and history_date=#" & Dato & "#")("NumId") = 0 Then
    Sql = "Insert into JQHistory (History_date, Sedol, Selskabsnavn, JQScore, JQ_Rank, Value_Rank, Momentum_Rank, Quality_Rank, Marketcap) VALUES (#" & Dato & "#, '" & Sedol & "','" & Company & "'," & JQScore & ", '" & JQRank & "', '" & ValueRank & "', '" & MomentumRank & "', '" & QualityRank & "', " & MarketCap & ")"
    db.Execute (Sql)

    Else
    Sql = "Update JQHistory set MarketCap=" & MarketCap & ", Selskabsnavn='" & Company & "' , JQ_Rank='" & JQRank & "', Value_Rank='" & ValueRank & "', Quality_Rank='" & QualityRank & "', Momentum_Rank='" & MomentumRank & "', JQScore=" & JQScore & " WHERE SEDOL='" & Sedol & "' and History_Date=#" & Dato & "#"
    db.Execute (Sql)
    End If

'***'

Next i

db.Close
wb.Close
braX
  • 11,506
  • 5
  • 20
  • 33
piele
  • 78
  • 6
  • Hmm, doesn't look like you can make multiple insert statements with access: https://stackoverflow.com/questions/62504/is-there-any-way-to-create-multiple-insert-statements-in-a-ms-access-query. It's really necessary to delete 100% of the data in the table and then insert 280k+ rows every day? Is there a better way to structure your information or data that doesn't involve that step? – rvictordelta Jun 13 '19 at 13:10
  • You've misunderstood my try to improve the context. It alternates between a database called "JQScores" and "JQHistory". JQScores consists of 3.000 rows being weekly updated, where JQHistory is a timeseries consisting of 280.000 rows of JQScores, being weekly added to show evolution over time :) – piele Jun 13 '19 at 13:26
  • 1
    Instead of executing SQL, open the Access database using _DAO_, and then use methods _AddNew_ and _Update_ to add the records. It will be way faster. Or turn it upside/down and use Access to import the spreadsheet. – Gustav Jun 13 '19 at 13:45
  • Thanks. Do you know a way to filter a column to a specific date using DAO? – piele Jun 13 '19 at 14:31
  • 1
    An Execute won't work with SELECT, only with SQL actions (DELETE, INSERT, UPDATE). In Access, use domain aggregate such as DCount or DLookup or DSum. In Excel open a recordset object and reference field. – June7 Jun 13 '19 at 19:30
  • @Gustav I've been trying to look into using DAO, but I'm having difficulties and many recommend ADO, calling DAO buggy. Can you explain the difference? – piele Jun 14 '19 at 06:31
  • Just browse for it. DAO is native to Access and by no means buggy. ADO can be used as well, but will not be faster. – Gustav Jun 14 '19 at 08:02

1 Answers1

0

The optimal way ended up using the DAO.Recordset and DAO.Database options, and a lot of tweaks for optimization.

The biggest shortcut was using the 'Recordset.FindFirst' to identify if the data should only be added (takes 22 seconds), or update the data with identical date (takes 12 minutes). Although mainly the scenario taking 22 seconds will happen.

The scenario taking 12 minutes is not optimized since it rarely happens.

Full solution:

Sub OpdaterKvant()

Dim wb As Workbook
Dim wbOp As Workbook
Dim ws As Worksheet
Dim wsOp As Worksheet
Dim i, n As Integer

Dim db As DAO.Database
Dim rsScores As DAO.Recordset
Dim rsHistory As DAO.Recordset

StartTime = Timer

Call PERFORMANCEBOOST(False)

Set PB = CREATEPROGRESSBAR
    With PB
        .SetStepCount (4)
        .Show
        End With

    Set wbOp = ThisWorkbook
    Set wsOp = wbOp.ActiveSheet

'Step 1: Open JQGCLE
    Set wb = Workbooks.Open("Location", ReadOnly:=True)
    Set ws = wb.Worksheets(1)
        ws.Activate

    n = ws.UsedRange.Rows.Count

    DateIn = Right(ws.Cells(1, 1), 2) & "-" & Mid(ws.Cells(1, 1), 5, 2) & "-" & Left(ws.Cells(1, 1), 4)

'Step 2: Optag værdier i Excel
    PB.Update "Data hentes fra JQGLCE-ark"

    ReDim Sedol(3 To n) As String
    ReDim Company(3 To n) As String
    ReDim Country(3 To n) As String
    ReDim Region(3 To n) As String
    ReDim Sector(3 To n) As String
    ReDim MarketCap(3 To n) As String 'Tal
    ReDim MarketCapSQL(3 To n) As String 'Tal
    ReDim JQRank(3 To n) As String
    ReDim ValueRank(3 To n) As String
    ReDim QualityRank(3 To n) As String
    ReDim MomentumRank(3 To n) As String
    ReDim JQScore(3 To n) As String 'Tal
    ReDim JQScoreSQL(3 To n) As String 'Tal

    For i = 3 To n

        Sedol(i) = Trim(ws.Cells(i, 1))
        Company(i) = Left(Replace(ws.Cells(i, 2), "'", ""), Len(Replace(ws.Cells(i, 2), "'", "")) - 0) 'Stod tidligere på minus 1 - Hvorfor?
        Country(i) = Trim(ws.Cells(i, 3))
        Region(i) = Trim(ws.Cells(i, 4))
        Sector(i) = Trim(ws.Cells(i, 5))
        MarketCap(i) = ws.Cells(i, 6) 'Til DAO
        MarketCapSQL(i) = Replace(ws.Cells(i, 6), ",", ".") 'Til SQL
        JQRank(i) = Trim(ws.Cells(i, 7))
        ValueRank(i) = Trim(ws.Cells(i, 8))
        QualityRank(i) = Trim(ws.Cells(i, 9))
        MomentumRank(i) = Trim(ws.Cells(i, 10))
        JQScore(i) = ws.Cells(i, 11) 'Til DAO
        JQScoreSQL(i) = Replace(ws.Cells(i, 11), ",", ".") 'Til SQL

        'DAO og SQL bliver behandlet forskelligt ift. komma

        Next i

'Step 3: Indsæt værdier i Access-database
    Set acc = New Access.Application
    Set db = acc.DBEngine.OpenDatabase("Location", 1, 0)

    'Step 3.1: JQScores
        PB.Update "JQScores indsættes i databasen"

        Set rsScores = db.OpenRecordset(Name:="JQScores", Type:=RecordsetTypeEnum.dbOpenDynaset)
        db.Execute "DELETE * FROM JQScores"

        For i = 3 To n

            With rsScores
                .AddNew
                !Sedol = Sedol(i)
                !Company = Company(i)
                !Region = Region(i)
                !Sector = Sector(i)
                !MarketCapUSD = MarketCap(i)
                !JQ_Rank = JQRank(i)
                !Value_Rank = ValueRank(i)
                !Quality_Rank = QualityRank(i)
                !Momentum_Rank = MomentumRank(i)
                !JQ_Score = JQScore(i)
                !Country = Country(i)
                .Update

                End With

            Next i

            rsScores.Close
        Set rsScores = Nothing

    'Step 3.2: JQHistory
        Set rsHistory = db.OpenRecordset(Name:="JQHistory", Type:=RecordsetTypeEnum.dbOpenDynaset)

        With rsHistory

        If .RecordCount <> 0 Then

        i = 3

        .FindFirst "History_Date = '" & DateIn & "'"
            If .NoMatch = True Then
            'Hvis datoen ikke er i datasættet, bliver dataen tilføjet

                PB.Update "Hurtig: JQHistory indsættes i databasen"

                For i = 3 To n
                    .AddNew
                    !History_Date = DateIn
                    !Sedol = Sedol(i)
                    !Selskabsnavn = Company(i)
                    !MarketCap = MarketCap(i)
                    !JQ_Rank = JQRank(i)
                    !Value_Rank = ValueRank(i)
                    !Quality_Rank = QualityRank(i)
                    !Momentum_Rank = MomentumRank(i)
                    !JQScore = JQScore(i)
                    .Update

                    Next i

                Else
                'Hvis datoen allerede er der, skal den opdateres
                    PB.Update "Langsom: JQHistory indsættes i databasen"

                    For i = 3 To n

                        db.Execute ("UPDATE JQHistory SET MarketCap=" & MarketCapSQL(i) & ", Selskabsnavn='" & Company(i) & "', JQ_Rank='" & JQRank(i) & "', Value_Rank='" & ValueRank(i) & "', Quality_Rank='" & QualityRank(i) & "', Momentum_Rank='" & MomentumRank(i) & "', JQScore=" & JQScoreSQL(i) & " WHERE SEDOL='" & Sedol(i) & "' and History_Date='" & DateIn & "'")

                        Next i

                End If

            End If
            End With

            rsHistory.Close
        Set rsHistory = Nothing

'Step 4: Færdiggørelse

    acc.DoCmd.Quit acQuitSaveAll 'Lukker og gemmer database
    Set db = Nothing

    wsOp.Activate
    wsOp.Range("B7").Value = "Seneste data benyttet: " & DateIn
    wb.Close SaveChanges:=False

    Call PERFORMANCEBOOST(True)

    Unload PB

    MinutesElapsed = Format((Timer - StartTime) / 86400, "hh:mm:ss")

    MsgBox "Opdatering fuldført. Proceduren tog " & MinutesElapsed & "."

End Sub
piele
  • 78
  • 6
  • I would recommend a data structure change. You should insert all Excel-Data including a timestamp, then the max timestamp for each History_Date represents your actual data and you get a history of inserted data as you can also query for the max timestamp that is lower that the date you want to see the data at. – ComputerVersteher Jun 14 '19 at 15:04
  • Besides the structure, you can query Excel Ranges with SQL (e.g.`SELECT * FROM [Sheet1$A1: CV5000]`) and then Join to determ if data is existing or not, what leads to 2 queries (insert where joined field is NULL, update what is not). Or create a Recordset (e.g: [Range to Recordset Without Making Connection](https://usefulgyaan.wordpress.com/2013/07/11/vba-trick-of-the-week-range-to-recordset-without-making-connection/) instead of creating an array to loop throug the excel-data. – ComputerVersteher Jun 14 '19 at 15:15
  • 1. Updating data structure; Impossible in my situation because of people dependent on the data. 2. Joining data; Will that make it possible to reduce the updating to 1 query? Even though the updating-scenario is rare, it would be favorable to reduce runtime to seconds instead of 12 minutes. – piele Jun 15 '19 at 16:33
  • 1: Changing data-structure is a long-term goal, can't be done at once but should be keept in mind. 2: If you fetch the Excel data with a SQL Select like`SELECT * FROM [Sheet1$A1: CV5000]`you can join that on other tables. Just adapt the sheet and range and use it as subquery. – ComputerVersteher Jun 15 '19 at 16:39