Retrieving either indexed data or the last record in each group

Question

There is a table routes that contains data as shown below:

Id   GroupID   TickID   Data
-------------------------
1    1         1        A
2    1         2        B
3    1         3        C
4    2         1        D
5    2         2        E
6    3         1        F

I have a PHP file where the user should be able to pass a "TickID", and than the script should return: For every GroupID the Data for the given TickID by the user, but if the user passes a TickID which is bigger than the maximum TickID in a certain GroupID, than the LAST row for that particular GroupID should be returned.

i.e.

UserInput, TickID 1, returns

Id   GroupID   TickID   Data
-------------------------
1    1         1        A
4    2         1        D
6    3         1        F

UserInput, TickID 3, returns

Id   GroupID   TickID   Data
-------------------------
3    1         3        C
5    2         2        E
6    3         1        F

I found how to select the "latest" row, i.e. the biggest TickID per GroupID with a good performance already here: Retrieving the last record in each group - MySQL

But I'm not sure how to combine it with a selection of rows inbetween.

EDIT What I have right now is: Make 3 queries:

1st query: Select biggest TickID for every group SELECT GroupID, MAX(TickID) maxTick From routes GROUP BY GroupID

Than do two queries: One for the GroupsIDs which do have a TickID entry in the database bigger than the userinput -> SELECT GroupID, Data From routes WHERE TickID = userTickID GROUP BY GroupID

And do another query for the Groups which do not have a TickID as big as given by the user: Select the biggest TickIDs for those Groups. Similarly to Retrieving the last record in each group - MySQL

Seems a big non-performant. Any ideas to improve it?

Tim Biegeleisen · Accepted Answer · 2020-07-19T10:41:22.997

1

I would phrase your query as:

SELECT r1.*
FROM routes r1
INNER JOIN
(
    SELECT GroupID, MAX(TickID) AS MaxTickID
    FROM routes
    GROUP BY GroupID
) r2
    ON r1.GroupID = r2.GroupID AND
       (r1.TickID = userTickID OR
        (r1.TickID = r2.MaxTickID AND r2.MaxTickID < userTickID));

The logic here is a slight play on the canonical max per group join approach. Here we take the TickID which is the max in the group only if the user tick ID be greater than the max. Otherwise, we take the record whose TickID matches the input user value.

Adding the following index on routes should dramatically speed up the max subquery:

CREATE INDEX idx ON routes (GroupID, TickID);

edited Jul 19 '20 at 10:41

answered Jul 19 '20 at 10:25

Tim Biegeleisen

502,043
27
286
360

Thanks, I'll check it. `t1` in the two last lines of the query should be `r1` I suppose :) And `t2` likewise :) But it seems to work. How to make it max. performant? Adding an index (GroupID, TickID) on the table? – tim Jul 19 '20 at 10:29
1

@tim Check the updated answer. Great first name, by the way, reminds me of myself `:-P` – Tim Biegeleisen Jul 19 '20 at 10:38
Thanks, yes, I had this index. And the speed is great. Very good query, thanks a lot! – tim Jul 19 '20 at 10:38

score 1 · Answer 2 · answered Jul 19 '20 at 12:22

I think that this is depended by data.

(A) Are ids of TickID always continuous? e.g. TickID=1,2,3,4,5,6
(B) May numbers be skipped in some cases? e.g. TickID=1,4,6

(A) the query is as follows.

select * from routes
  where (GroupID, TickID) in
    (select GroupID, max(TickID) from routes where TickID<=3 group by GroupID);

(B) the query is as follows.

with
MatchUserInput as (
  select GroupID, TickID from routes where TickID=3
),
MaxTickID as (
  select GroupID, max(TickID) from routes
    where GroupID not in (select GroupID from MatchUserInput) group by GroupID
)
select * from routes
where (GroupID, TickID) in (
  select * from MatchUserInput
  union select * from MaxTickID
);

Replace the above "3" with UserInput.

Retrieving either indexed data or the last record in each group

2 Answers2