1

EXPLANATION

I am working on a project where my client sends documents (contracts) out to their customers. These documents go through several rounds of "events" as they are send back and forth (ex. Prep, Sent to Client, Received Revised From Client, Signed By Client, etc). For a specific company (companyID 123456), I am trying to pull the most recent event for each document. So the example query below might return say 5 different documents, and each might have 4 different events they went through. In the example results I have simplified it down to a single document (documentId 6789).

Given the example results below, I expect to get the third row. All rows are for a single document, and that one has the most recent date. However, I get four as you see below.

If I Group By the documentID, then I get the right date, but my event and event id values are incorrect (transposed). The only success I have had is to wrap each value in the Select statement in a Max() function. When I do that I get the one row I want with the right event. However, things like the document ID are wrong because of course it returns me the max id, not the one that matches.

Could someone please help me adjust this query so I get the result I need? Thanks in advance!

Note: I found this "solution" here, but I don't think it applies directly to what I am doing: Fetch the row which has the Max value for a column

QUERY


    SELECT e.eventID, 
            e.event, 
            de.documentEventID, 
            de.documentID,  
            Max(de.eventDate) AS eventDate, 
            sd.companyID, 
    FROM siteDocuments sd
        LEFT JOIN documents d ON d.documentID = sd.documentID
        LEFT JOIN documentTypes dt ON dt.documentTypeID = d.documentTypeID
        LEFT JOIN documentEvents de ON de.documentID = sd.documentID
        LEFT JOIN events e ON e.eventID = de.eventID

    WHERE sd.companyID = 123456

    GROUP BY e.eventID

EXAMPLE RESULTS
-----------------------
EventId --- Event Doc --- EventId --- documentId --- eventDate
16 -------- FakeEventA -- 135791 ------ 6789 ------- 2012-04-11 08:35:54
32 -------- FakeEventB -- 726351 ------ 6789 ------- 2012-04-11 08:56:02
24 -------- FakeEventC -- 987236 ------ 6789 ------- 2012-05-09 16:48:57 <======
81 -------- FakeEventD -- 982378 ------ 6789 ------- 2012-04-20 14:06:19

(I put the dashes in to enforce formatting)

Community
  • 1
  • 1

1 Answers1

0

Are you sure the answers on the linked question (specifically this answer) aren't doing exactly what you want?

SELECT sd.* 
FROM siteDocuments AS sd
  LEFT OUTER JOIN siteDocuments AS sd2
    ON (sd.documentId = sd2.documentId AND sd.eventDate < sd2.eventDate) 
WHERE 
    sd2.documentId IS NULL
    and sd.documentId = 6789;

I've used that answer before to achieve what you're looking for. (To maintain performance, you have to be careful to set up indexes correctly on large datasets, however--varies based on the DBMS you're using).

Community
  • 1
  • 1
Matt Winckler
  • 2,223
  • 2
  • 23
  • 27