0

Scenario: Let's say I have a transactions table and I want details of only the last transaction for every user.

So, in the example below I want this result:

| userId | trxType | trxDate  |
|--------|---------|----------|
| 1      | c       | 20190103 |
| 2      | a       | 20190104 |

I know this is not a good solution, but would I be right to safely assume that using GROUP BY without an aggregate, on an ordered subquery, would return the correct data in this case (ie. pick fields from the first column for each group)?

I tried the following test and it seems to work fine, but I've read some posts that say they are selected at random ¬_¬ (Unions are there to mimic a table to have a standalone query).

SELECT userid, trxType, trxDate
FROM (
        SELECT *
        FROM (
                SELECT 1 AS userid, 'a' AS trxType, '20190101' AS trxDate
                UNION
                SELECT 1 AS userid, 'b' AS trxType, '20190102' AS trxDate
                UNION
                SELECT 1 AS userid, 'c' AS trxType, '20190103' AS trxDate
                UNION
                SELECT 2 AS userid, 'a' AS trxType, '20190104' AS trxDate
                UNION
                SELECT 2 AS userid, 'b' AS trxType, '20190101' AS trxDate
        ) myTable ORDER BY trxDate DESC
    ) mySubQuery
GROUP BY userid;

Thanks!

Update

As I said, I know this is a bad solution and shouldn't be used. I just wanted to know if unaggregated GROUP BY selects the first item it encounters.

Anyway, in case you're looking for the best solution, I found this useful and clean as it allows for non-unique ordering fields, and also top N records per group if you want more than 1 (just edit the final condition):

SELECT x.userid, x.trxType, x.trxDate
FROM (
    SELECT t.userid, t.trxType, t.trxDate,
    @group_order := IF(@current_group = t.userid, @group_order + 1, 1) as group_order,
    @current_group := t.userid as current_group
    FROM transactions t
    ORDER by t.userid, t.trxDate DESC
) x
WHERE x.group_order = 1;

(IMP: always order by the grouping field first)

  • Never ever ever do this, the results are not predictable. If you're going to aggregate and/or group by, every column in your query needs to be aggregated or grouped. – Andrew Sep 17 '19 at 15:36

1 Answers1

1

You can use a correlated subquery for this:

select t.*
from transactions t
where t.trxDate = (select max(t2.trxDate)
                   from transactions t2
                   where t2.userid = t.userid
                  );

For performance, you want an index on transactions(userid, trxDate).

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thanks. The problem with this solution is when the ordering field is not unique. I added a better solution in the question. – Shawn Xuereb Sep 18 '19 at 11:47