6

I have a history table that captures updates to a certain object and, in addition to other information, captures the time this update happened. What I would like to do is SELECT the MIN(LogDate) corresponding to a certain ActionTaken column.

More specifically, the history table may have other (more recent) rows where ActionTaken = 1, but I want to capture the date ActionTaken became 1.

Example:

SELECT  MIN(LogDate) AS FirstActionDate
FROM    HistoryTable
WHERE   ID = 123
    AND FirstActionTaken = 1

SELECT  MIN(LogDate) AS SecondActionDate
FROM    HistoryTable
WHERE   ID = 123
    AND SecondActionTaken = 1

SELECT  MIN(LogDate) AS ThirdActionDate
FROM    HistoryTable
WHERE   ID = 123
    AND ThirdActionTaken = 1

This works well, and I receive the proper dates without issue. Where I'm running into trouble is then going to select the MAX(LogDate) from this group:

SELECT  MAX(LogDate) AS LastActionDate
FROM    HistoryTable
WHERE   ID = 123
    AND LogDate IN 
    (
            (   SELECT  MIN(LogDate) AS FirstActionDate
                FROM    HistoryTable
                WHERE   ID = 123
                    AND FirstActionTaken = 1    ),

            (   SELECT  MIN(LogDate) AS SecondActionDate
                FROM    HistoryTable
                WHERE   ID = 123
                    AND SecondActionTaken = 1   ),

            (   SELECT  MIN(LogDate) AS ThirdActionDate
                FROM    HistoryTable
                WHERE   ID = 123
                    AND ThirdActionTaken = 1    )
    )

This also works, but I hate doing it this way. I could save out the previous statements into variables and just SELECT MAX() from those; it would certainly be more readable, but what would the JOIN syntax look like for this query?

Is there a way to combine the first three SELECT statements into one that returns all three dates and isn't an unreadable mess?

How can I grab the most recent LogDate (as a separate column) from this result set and without the (seemingly unnecessary) repeating SELECT statements?

EDIT:

Here are a few links I've found in relation to the answers that have been given so far:

Hopefully these will help with others looking for solutions to similar problems!

Community
  • 1
  • 1
levelonehuman
  • 1,465
  • 14
  • 23

5 Answers5

3

This would be easier with a normalized data structure. Here is one method that uses conditional aggregation to calculate the three minimum dates. Then it takes the maximum of those values:

SELECT v.dt
FROM (SELECT MIN(CASE WHEN FirstActionTaken = 1 THEN LogDate END) AS d1,
             MIN(CASE WHEN SecondActionTaken = 1 THEN LogDate END) AS d2,
             MIN(CASE WHEN ThirdActionTaken = 1 THEN LogDate END) AS d3      
     FROM HistoryTable
     WHERE ID = 123
    ) ht OUTER APPLY
    (SELECT MAX(dt) as dt
     FROM (VALUES (d1), (d2), (d3) ) v(dt)
    ) v;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
2

EDIT 2

Based on new information that can be gleaned from OP's own answer (about how to define the latest action date), the query can be further simplified to simply this:

select coalesce(
         min(case when ThirdActionTaken = 1 then LogDate end),
         min(case when SecondActionTaken = 1 then LogDate end),
         min(case when FirstActionTaken = 1 then LogDate end)
       ) as LastActionDate
  from HistoryTable
 where id = 123

Unpivot can also be used:

 select max(ActionDate)
   from (select min(case when FirstActionTaken = 1 then LogDate end) as FirstActionDate,
                min(case when SecondActionTaken = 1 then LogDate end) as SecondActionDate,
                min(case when ThirdActionTaken = 1 then LogDate end) as ThirdActionDate
           from HistoryTable
          where id = 123) t
unpivot (ActionDate for ActionDates in (FirstActionDate, SecondActionDate, ThirdActionDate)) unpvt

EDIT: Short explanation

This answer is very similar to Gordon's in that it uses conditional aggregation to get the 3 minimum dates in one query.

So the following part of the query:

select min(case when FirstActionTaken = 1 then LogDate end) as FirstActionDate,
       min(case when SecondActionTaken = 1 then LogDate end) as SecondActionDate,
       min(case when ThirdActionTaken = 1 then LogDate end) as ThirdActionDate
  from HistoryTable
 where id = 123

...might return something like...

FirstActionDate   SecondActionDate   ThirdActionDate
---------------   ----------------   ---------------
     2015-01-01         2015-12-01            (null)

Then, the unpivot clause is what "unpivots" the 3 columns into a result set with 3 rows but a single column instead:

ActionDate
----------
2015-01-01
2015-12-01
    (null)

Once the results are in this format, then a simple max aggregate function (select max(ActionDate)) can be applied to get the max value of the 3 rows.

sstan
  • 35,425
  • 6
  • 48
  • 66
  • This works great, and feels much nicer than my own attempt at it :) Would you mind adding some explanation about what's happening here? Especially in relation to `UNPIVOT`. Thanks for the answer! – levelonehuman Dec 10 '15 at 16:28
  • Excellent, thanks for the update on the answer - and apologies for not specifying that to begin with! – levelonehuman Dec 14 '15 at 14:03
1

You can use a UNION to join the 3 queries for your IN statement.

Something like

SELECT
    MAX(ht1.LogDate) AS LastActionDate
FROM
    HistoryTable ht1
WHERE
    ht1.ID = 123
    AND ht1.LogDate IN (SELECT
                        MIN(LogDate) AS FirstActionDate
                    FROM
                        HistoryTable ht2
                    WHERE
                        ht2.ID = ht1.ID
                        AND ht2.FirstActionTaken = 1
                    UNION
                    SELECT
                        MIN(LogDate) AS FirstActionDate
                    FROM
                        HistoryTable ht2
                    WHERE
                        ht2.ID = ht1.ID
                        AND ht2.SecondActionTaken = 1
                    UNION
                    SELECT
                        MIN(LogDate) AS FirstActionDate
                    FROM
                        HistoryTable ht2
                    WHERE
                        ht2.ID = ht1.ID
                        AND ht2.ThirdActionTaken = 1)
JamieD77
  • 13,796
  • 1
  • 17
  • 27
0

You can solve this problem without using PIVOT. The following code extends your initial code to store the MIN values into variables and then calculates the max value among them:

DECLARE @FirstActionDate  DATETIME = NULL;
DECLARE @SecondActionDate DATETIME = NULL;
DECLARE @ThirdActionDate  DATETIME = NULL;
DECLARE @LastActionDate   DATETIME = NULL;

SELECT  @FirstActionDate = MIN(LogDate)
FROM    HistoryTable
WHERE   ID = 123
    AND FirstActionTaken = 1

SELECT  @SecondActionDate = MIN(LogDate)
FROM    HistoryTable
WHERE   ID = 123
    AND SecondActionTaken = 1

SELECT  @ThirdActionDate = MIN(LogDate)
FROM    HistoryTable
WHERE   ID = 123
    AND ThirdActionTaken = 1

-- calculate @LastActionDate as the greater from @FirstActionDate, @SecondActionDate and @ThirdActionDate. 
SET @LastActionDate = @FirstActionDate;
IF (@SecondActionDate > @LastActionDate) SET @LastActionDate = @SecondActionDate;
IF (@ThirdActionDate > @LastActionDate)  SET @LastActionDate = @ThirdActionDate;

SELECT @FirstActionDate AS [FirstActionDate]
, @SecondActionDate     AS [SecondActionDate]
, @ThirdActionDate      AS [ThirdActionDate]
, @LastActionDate       AS [LastActionDate]

If you want the absolute last action date, you can change the original code to just a single statement, as follows:

SELECT MAX(LogDate) AS [LastActionDate]
, MIN(CASE WHEN FirstActionTaken = 1  THEN LogDate ELSE NULL END) AS [FirstActionDate]
, MIN(CASE WHEN SecondActionTaken = 1 THEN LogDate ELSE NULL END) AS [SecondActionDate]
, MIN(CASE WHEN ThirdActionTaken = 1  THEN LogDate ELSE NULL END) AS [ThirdActionDate]
FROM    HistoryTable
WHERE   ID = 123
Gerardo Lima
  • 6,467
  • 3
  • 31
  • 47
0

My own attempt at refactoring the final SELECT statement:

SELECT  MIN(ht2.LogDate) AS FirstActionDate,
        MIN(ht3.LogDate) AS SecondActionDate,
        MIN(ht4.LogDate) AS ThirdActionDate,
        COALESCE (
            MIN(ht4.LogDate),
            MIN(ht3.LogDate),
            MIN(ht2.LogDate)
        ) AS LastActionDate
FROM    HistoryTable ht
    INNER JOIN HistoryTable ht2 
        ON ht2.ID = ht.ID AND ht2.FirstActionTaken = 1
    INNER JOIN HistoryTable ht3 
        ON ht3.ID = ht.ID AND ht3.SecondActionTaken = 1
    INNER JOIN HistoryTable ht4 
        ON ht4.ID = ht.ID AND ht4.ThirdActionTaken = 1
WHERE   ht.ID = 123
GROUP BY ht.ID

This JOINS back to HistoryTable for each xActionTaken column and SELECTS the MIN(LogDate) from each. Then, we walk backwards through the results (ThirdAction, SecondAction, FirstAction) and return the first one we find as LastActionTaken.

Admittedly this is a bit messy, but I thought it would be good to show another alternative to retrieving the same data.

Also worth noting for performance:

After running my answer against the UNPIVOT and OUTER APPLY methods, SSMS Execution Plan shows that UNPIVOT and OUTER APPLY are roughly equal (taking approx. 50% of the execution time each).

When comparing my method to either of these, my method takes approx. 88% of the execution time, where UNPIVOT/OUTER APPLY only take 12% - so both UNPIVOT and OUTER APPLY execute much faster (at least in this instance).

The reason that my method takes so much longer is that SQL does a table scan of HistoryTable for each time I join back to it, for a total of 4 scans. With the other two methods, this action is only performed once.

levelonehuman
  • 1,465
  • 14
  • 23
  • Your question never specified that `ThirdActionDate > SecondActionDate > FirstActionDate`, but your answer implies that this is always true. That being the case, you don't need any unpivot, outer apply or joins. The query only needs conditional aggregation with a coalesce. I edited my answer. That should give you the best performance. – sstan Dec 11 '15 at 15:02