0

I have the following tables:

Task (id,....)

TaskPlan (id, task_id,.......,end_at)

Note that end_at is a timestamp and that one Task has many TaskPlans. I need to query for the MAX end_at for each Task.

This query works fine, except when you have the same exact timestamp for different TaskPlans. In that case, I would be returned multiple TaskPlans with the MAX end_at for the same Task.

I know this is an unlikely situation, but is there anyway I can limit the number of results for each task_id to 1?

My current code is:

SELECT * FROM Task AS t
INNER JOIN (
SELECT * FROM TaskPlan WHERE end_at in (SELECT MAX(end_at) FROM TaskPlan GROUP BY task_id )
) AS pt 
ON pt.task_id = t.id
WHERE status = 'plan';

This works, except in the above situation, how can this be achieved? Also in the subquery, instad of SELECT MAX(end_at) FROM TaskPlan GROUP BY task_id, is it possible to do something like this so I can use TaskPlan.id for the where in clause?

SELECT id, MAX(end_at) FROM TaskPlan GROUP BY task_id

When I try, it gives the following error:

SQL Error [1055] [42000]: Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'TaskPlan.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

Any explaination and suggestion would be much welcome!

Note on duplicate label: (Now reopened)

I already studied the this question, but it does not provide an answer for my situation where there are multiple max values in the result and it needs to be filtered out to include only one result row per group.

toing_toing
  • 2,334
  • 1
  • 37
  • 79
  • 2
    Use `SELECT taskid, MAX(end_at) FROM TaskPlan GROUP BY task_id` or `SELECT id, MAX(end_at) FROM TaskPlan GROUP BY id` instead of `SELECT id, MAX(end_at) FROM TaskPlan GROUP BY task_id` – Barbaros Özhan May 29 '18 at 20:23

3 Answers3

1

Use the id rather than the timestamp:

SELECT *
FROM Task AS t INNER JOIN
     (SELECT tp.*
      FROM TaskPlan tp
      WHERE tp.id = (SELECT tp2.id FROM TaskPlan tp2 WHERE tp2.task_id = tp.task_id ORDER BY tp2.end_at DESC LIMIT 1)
     ) tp 
     ON tp.task_id = t.id
WHERE status = 'plan';

Or use in with tuples:

SELECT *
FROM Task AS t INNER JOIN
     (SELECT tp.*
      FROM TaskPlan tp
      WHERE (tp.task_id, tp.end_at) in (SELECT tp2.task_id, MAX(tp2.end_at)
                                        FROM TaskPlan tp2 
                                        GROUP BY tp2.task_id
                                       )
     ) tp 
     ON tp.task_id = t.id
WHERE status = 'plan';
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thank you for your answer. However, it still gives me these issues. For the first one, i'm told `This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery`. I understand this could be specific to my version, but the second one also will have my previous issue in the case of multiple Task Plans with the same `task_id` and `end_at`. :) – toing_toing May 29 '18 at 20:37
  • 1
    @toing_toing . . . The first one should be `=`, not `in`. – Gordon Linoff May 30 '18 at 02:26
1

If you want to get a list of task ID's with MAX end_at for each, run the query below:

SELECT t.id, MAX(tp.end_at) FROM Task t JOIN TaskPlan tp on t.id = tp.task_id GROUP BY t.id;

EDIT:

Now, I know what exactly you are going to do. If the TaskPlan table is so big, you can avoid the 'GROUP BY' and run the query below that is very efficient:

SET @first_row := 0;
SET @task_id := 0;

SELECT * FROM Task t JOIN (
   SELECT tp.*
    , IF(@task_id = tp.`task_id`, @first_row := 0, @first_row := 1) AS temp
    , @first_row AS latest_record
    , @task_id := tp.`task_id`

    FROM TaskPlan tp ORDER BY task_id, end_at DESC) a  ON t.task_id = a.task_id AND a.latest_record = 1;
Abbas Hosseini
  • 651
  • 6
  • 10
0

Try this query:

select t.ID , tp1.end_at
from TASK t
left join TASKPLAN tp1 on t.ID = tp1.id 
left join TASKPLAN tp2 on t.ID = tp2.id and tp1.end_at < tp2.end_at
where tp2.end_at is null;