0

The table seems

group_id   | time              | data  | others...
1           2020-11-30 12:00:00 13
1           2020-11-30 13:00:00 15
2           2020-11-30 12:30:00 254
3           2021-02-21 18:00:00 25565
...

I wanted to get the "data" of "latest time"(which can be different for each group) for each group specified,

so I wrote down like

SELECT group_id, max(time), data
where group_id between (...)
(and others = ...)
group by group_id

but It's data was not the value of that time (actually seems like the latest record)

How can I get the data of maximum time for each group?

안유빈
  • 33
  • 5
  • Does your table have a primary key column? – Asaph Apr 30 '21 at 03:05
  • How do you plan to break ties in the case where there are 2 or more rows in the same group with the same `time`? – Asaph Apr 30 '21 at 03:06
  • surely the table has primary key on (group_id, time) so It would be okay – 안유빈 Apr 30 '21 at 03:08
  • When not using an aggregate function, MySQL is free to retrieve any value from the non grouped columns (data), which usually is the first or last record, depending on index. You would need to filter the result set by the `MAX(time)` and determine a sorting for the `data` column. This is generally done with an exclusion join, [when looking to obtain distinct values (min/max) with corresponding other values](https://stackoverflow.com/questions/5967130/mysql-select-one-column-distinct-with-corresponding-other-columns/26124759#26124759). – Will B. Apr 30 '21 at 03:19

2 Answers2

1

Since you are using a primary key on group_id, time, you can use a standard exclusion join, as opposed to a GROUP BY, since the records are unambiguous.

In this instance to obtain the group_id and MAX(time) with the corresponding other values you would LEFT JOIN table AS b ON a.group_id = b.group_id AND a.time < b.time Then remove the undesired entries by filtering the left join table results with WHERE b.group_id IS NULL

Exclusion Join

DB-Fiddle

SELECT t1.group_id, t1.`time`, t1.`data`
FROM `table` AS t1
LEFT JOIN `table` AS t2
ON t2.group_id = t1.`group_id` 
AND t1.`time` < t2.`time`
WHERE t1.group_id BETWEEN ...
AND t1.others = ...
AND t2.group_id IS NULL;

Result:

| group_id | time                | data  |
| -------- | ------------------- | ----- |
| 1        | 2020-11-30 13:00:00 | 15    |
| 2        | 2020-11-30 12:30:00 | 254   |
| 3        | 2021-02-21 18:00:00 | 25565 |

Alternatively, since you have group_id, time as your primary key, you can use a compound IN subquery criteria with the MAX(time).

Compound IN Subquery

DB-Fiddle

SELECT t1.group_id, t1.`time`, t1.`data`
FROM `table` AS t1
WHERE t1.others = ....
AND (t1.group_id, t1.`time`) IN(
   SELECT group_id, MAX(`time`)
   FROM `table`
   WHERE group_id BETWEEN ....
   GROUP BY group_id
);

Result:

| group_id | time                | data  |
| -------- | ------------------- | ----- |
| 1        | 2020-11-30 13:00:00 | 15    |
| 2        | 2020-11-30 12:30:00 | 254   |
| 3        | 2021-02-21 18:00:00 | 25565 |
Will B.
  • 17,883
  • 4
  • 67
  • 69
  • Thank you for your answer! I tested these two way and found that LEFT JOIN with nullable huge much cost! so I applied INNER JOIN method which use Compound at ON clause and it did the best performance whatever I did. – 안유빈 Apr 30 '21 at 06:09
0

I myself found answer for this using JOIN

with tmp as (
SELECT group_id, max(time) as time FROM table
WHERE group_id between (...)
(and others = ...)
group by group_id
)
SELECT r.data FROM table as r
INNER JOIN tmp ON r.group_id = tmp.group_id and r.time = tmp.time
where group_id between (...)
(and others = ...)
안유빈
  • 33
  • 5