1

enter image description here

My Data Somewhat Looks like above. I want to mind the latest entry having the maximum log_id as per each group of job_id,run_id,start_hour,end_hour.

I am trying to use the Below Query but unfortunately it is returning the minimum log_id record from the group instead of the maximum.

Please help

select * from 
(select * from job_monitor_log order by job_id,log_id)t1
group by  job_id,run_id,start_hour,end_hour having max(log_id);

Note - The Query should be as per MYSQL

Expected Output as Below-

enter image description here

1 Answers1

1

One canonical way to do this is to join to a subquery which finds the latest log_id value for each group as you have defined it:

SELECT j1.*
FROM job_monitor_log j1
INNER JOIN
(
    SELECT job_id, run_id, start_hour, end_hour, MAX(log_id) AS max_log_id
    FROM job_monitor_log
    GROUP BY job_id, run_id, start_hour, end_hour
) j2
    ON j1.job_id     = j2.job_id AND
       j1.run_id     = j2.run_id AND
       j1.start_hour = j2.start_hour AND
       j1.end_hour   = j2.end_hour AND
       j1.log_id     = j2.max_log_id;

If you can use MySQL 8+ or later, then you may use analytic functions here:

SELECT log_id, job_id, run_id, run_Date, start_hour, end_hour, job_status
FROM
(
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY job_id, run_id, start_hour, end_hour
                           ORDER BY log_id DESC) rn
    FROM job_monitor_log
) t
WHERE rn = 1;

If there could be two or more records per group which are tied regarding having the max log_id value, then you may replace ROW_NUMBER with RANK or DENSE_RANK to include all such ties.

To cover all bases, we could also use a correlated subquery approach, which is along the lines of what you were originally trying to do:

SELECT log_id, job_id, run_id, run_Date, start_hour, end_hour, job_status
FROM job_monitor_log j1
WHERE log_id = (SELECT MAX(t2.log_id)
                FROM job_monitor_log j2
                WHERE j1.job_id     = j2.job_id AND
                      j1.run_id     = j2.run_id AND
                      j1.start_hour = j2.start_hour AND
                      j1.end_hour   = j2.end_hour);

This would include all ties for the maximum log_id value per group. But, this is probably the least performant approach of the three queries given. Sometimes though, when using things like ORM frameworks, we might have the need to express the query as shown above.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • cant it be more simplified and made generic? – Samrat Saha Oct 09 '18 at 09:27
  • 1
    We could also use a correlated subquery, but that might not perform so well, and also would not necessarily be less verbose. What version of MySQL are you using? If you're using 8+ I can suggest an alternative which should also perform a bit better. – Tim Biegeleisen Oct 09 '18 at 09:27
  • @TimBiegeleisen you may add the correlated subquery based solution also +1 – Madhur Bhaiya Oct 09 '18 at 09:51