1

I am trying to write a SQL query in Excel that joins two tables (Table A and Table B) using a composite key and returns matching records from Table B, as well as any in Table B that do not match.

The two tables have different dimensions and the data is as follows:

Table A

  ID      Date       Timestamp    Time Event Type   Time Event Type Desc.   ...
 ----- ----------- ------------- ----------------- ----------------------- -----
  123   10/5/2020    8:05:00 AM          P10               Clock In         ...
  123   10/5/2020   11:00:00 AM          P15               Meal Start       ...
  123   10/5/2020   11:30:00 PM          P25               Meal End         ...
  123   10/5/2020    6:30:00 PM          P20               Cock Out         ...
  ...      ...          ...              ...                 ...                      

Table B

  ID      Date      Scheduled Start   Scheduled End   ...   
 ----- ----------- ----------------- --------------- -----
  123   10/5/2020      8:00:00 AM      6:00:00 PM     ...
  123   10/6/2020     10:00:00 AM      4:00:00 PM     ...
  123   10/7/2020      9:00:00 AM      4:00:00 PM     ...
  123   10/8/2020     12:00:00 AM     12:00:00 AM     ...
  123   10/9/2020     12:00:00 PM      7:00:00 PM     ...
  124   10/5/2020      9:00:00 AM      4:00:00 PM     ...
  ...      ...            ...             ...       

The composite key is a combination of ID and Date. With the code below I am able to join the two tables and output a table that shows (for each ID on each day they clocked in) the Scheduled Start time, Clock In time, Scheduled End time, Clock Out time, and some other columns.

sql_query = _
        "SELECT [a].[ID], FORMAT([a].[Date], 'mm/dd/yyyy'), " & _
            "FORMAT([b].[Scheduled Start], 'hh:mm:ss AM/PM'), " & _
            "FORMAT(MIN(CDATE([a].[Timestamp])),'hh:mm:ss AM/PM') AS [Clock In Time], " & _
            "FORMAT([b].[Scheduled End], 'hh:mm:ss AM/PM'), " & _
            "FORMAT(MAX(CDATE([a].[Timestamp])),'hh:mm:ss AM/PM') AS [Clock Out Time] " & _
        "FROM [CLOCKINREPORT#csv] AS [a] " & _
        "LEFT JOIN [SCHEDULEREPORT#csv] AS [b] " & _
        "ON [a].[ID] = [b].[ID] AND [b].[Date] = [a].[Date] " & _
        "WHERE ([a].[Date] BETWEEN #" & Format(sWeekPer, "yyyy-mm-dd") & "# AND #" & Format(eWeekPer, "yyyy-mm-dd") & _
            "#) AND ([a].[Time Event Type] = 'P10' OR [a].[Time Event Type] = 'P20')" & _
        "GROUP BY [a].[ID], [a].[Date], [b].[Scheduled Start], [b].[Scheduled End] "

Output Table

  ID      Date      Scheduled Start   Clock In Time   Scheduled End    Clock Out Time   ...   
 ----- ----------- ----------------- --------------- ---------------  ----------------
  123   10/5/2020     8:00:00 AM       8:05:00 AM      6:00:00 PM        6:30:00 PM     ...
  ...      ...            ...             ...             ...               ...

What I would also like to show records on days that the ID was scheduled but did not work (i.e., if '123' was scheduled on '10/6/2020' but no combination of the composite key, ID and Date, are not found in Table A, null values should be displayed in the Clock In Time and Clock Out Time columns). One issue that is present in the data is that on days when an ID did not clock in or clock out, there are no records associated with that date in Table A.

Desired Output

  ID      Date      Scheduled Start   Clock In Time   Scheduled End    Clock Out Time   ...   
 ----- ----------- ----------------- --------------- ---------------  ----------------
  123   10/5/2020     8:00:00 AM       8:05:00 AM      6:00:00 PM        6:30:00 PM     ...
  123   10/6/2020    10:00:00 AM          NULL         4:00:00 PM           NULL        ...
  ...      ...            ...             ...             ...               ...

My understanding is that Microsoft ACE 12.0 does not support a FULL JOIN and requires a UNION to pull off what I would like to do. I must admit I am quite the beginner in SQL and the UNION's I have tried based on a few stackoverflow questions have not worked. I would much appreciate any help solving this problem.

Parfait
  • 104,375
  • 17
  • 94
  • 125
Terrence
  • 33
  • 4
  • 1
    `SELECT .. FROM A LEFT JOIN B ... UNION SELECT .. FROM A RIGHT JOIN B ...`. If the output does not contain full duplicated rows then the output wil be correct. – Akina Dec 04 '20 at 15:47

1 Answers1

1

Consider the workaround query for FULL JOIN in MS Access. However, for readability and maintainability avoid messy concatenation, line breaks, quote handling, and date # enclosures in VBA. Instead, save your SQL in a separate .SQL file or better for efficiency save the query in the corresponding MS Access database and run ADO parameterization on the BETWEEN dates.

SQL (save as MS Access stored query or .SQL file; notice ? for parameter placeholders)

SELECT c.[ID]
     , FORMAT(c.[Date], 'mm/dd/yyyy')
     , FORMAT(s.[Scheduled Start], 'hh:mm:ss AM/PM')
     , FORMAT(MIN(CDATE(c.[Timestamp])),'hh:mm:ss AM/PM') AS [Clock In Time]
     , FORMAT(s.[Scheduled End], 'hh:mm:ss AM/PM')
     , FORMAT(MAX(CDATE(c.[Timestamp])),'hh:mm:ss AM/PM') AS [Clock Out Time]
       
FROM [CLOCKINREPORT#csv] AS c
LEFT JOIN [SCHEDULEREPORT#csv] AS s
   ON c.[ID] = s.[ID] AND s.[Date] = c.[Date]
WHERE (c.[Date] BETWEEN ? AND ?)
  AND (c.[Time Event Type] = 'P10' OR c.[Time Event Type] = 'P20')
GROUP BY c.[ID]
       , c.[Date]
       , s.[Scheduled Start]
       , s.[Scheduled End] 

UNION

SELECT s.[ID]
     , FORMAT(c.[Date], 'mm/dd/yyyy')
     , FORMAT(s.[Scheduled Start], 'hh:mm:ss AM/PM')
     , FORMAT(MIN(CDATE(c.[Timestamp])),'hh:mm:ss AM/PM') AS [Clock In Time]
     , FORMAT(s.[Scheduled End], 'hh:mm:ss AM/PM')
     , FORMAT(MAX(CDATE(c.[Timestamp])),'hh:mm:ss AM/PM') AS [Clock Out Time]
       
FROM [CLOCKINREPORT#csv] AS c
RIGHT JOIN [SCHEDULEREPORT#csv] AS s
   ON c.[ID] = s.[ID] AND s.[Date] = c.[Date]
WHERE (c.[Date] BETWEEN ? AND ?)
  AND (c.[Time Event Type] = 'P10' OR c.[Time Event Type] = 'P20')
GROUP BY s.[ID]
       , c.[Date]
       , s.[Scheduled Start]
       , s.[Scheduled End] 

VBA (using ADO to connect to Access database)

Sub RunSQL()
   ' REFERENCE THE MICROSOFT ACTIVEX DATA OBJECTS XX.X LIBRARAY '
   Dim conn As ADODB.Connection, cmd As New ADODB.Command, rs As ADODB.Recordset
   Dim StrQuery As String

   ' READ SQL QUERY FROM FILE
   With CreateObject("Scripting.FileSystemObject")
         StrQuery = .OpenTextFile("C:\Path\To\myQuery.sql", 1).readall
   End With

   ' OPEN DB CONNECTION
   Set conn = New ADODB.Connection
   conn.Open "DRIVER=Microsoft Access Driver (*.mdb, *.accdb);DBQ=C:\Path\To\Database\File.accdb;"
   
   ' DEFINE COMMAND OBJECT
   Set cmd = New ADODB.Command
   With cmd
       .ActiveConnection = conn
       .CommandType = adCmdText   ' OR BETTER IF USING SAVED QUERY: adCmdStoredProc
       .CommandText = StrQuery    ' OR BETTER IF USING SAVED QUERY: "myStoredAccessQuery"

       ' BIND 4 DATE PARAMETERS FOR FOUR ? IN SQL, ASSUMING sWeekPer AND eWeekPer ARE VBA DATES
       .Parameters.Append .CreateParameter("dtparam1", adDate, adParamInput, , sWeekPer)
       .Parameters.Append .CreateParameter("dtparam2", adDate, adParamInput, , eWeekPer)
       .Parameters.Append .CreateParameter("dtparam3", adDate, adParamInput, , sWeekPer)
       .Parameters.Append .CreateParameter("dtparam4", adDate, adParamInput, , eWeekPer)
   End With

   ' BIND OUTPUT TO RECORDSET
   Set rs = cmd.Execute

   ' ... USE rs AS NEEDED ...

   ' CLOSE AND RELEASE OBJECTS
   rs.Close: conn.Close
   Set rs = Nothing: Set cmd = Nothing: Set conn = Nothing
End Sub
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • Appreciate the response. Unfortunately, the query has to be run soley on Excel. – Terrence Dec 07 '20 at 18:57
  • Actually, Excel is solely running the query. Access just stores the SQL. You store tables in Access, why not queries? And to be clear by saving an Access query you cache the statistics and best execution plan so engine optimizes the UNION + JOIN. – Parfait Dec 07 '20 at 19:13
  • Nonetheless, use the read from .SQL file route or save the SQL string in an Excel cell and reference `StrQuery = Worksheets("mySheet").Range("A1")`. To be clear, this solution shows two ways: 1) read from .sql and 2) retrieve stored Access query. – Parfait Dec 07 '20 at 19:15