1

I have a table of requests from which I want to select everything, and I have a related table of positionReports where I want to select only the most recent report.

The below SQL does what I want as long as there is only one position report, but as soon as there are more, I understand that the SQL has no idéa which one I want. But I don't know how to specify that.

SELECT
r.id mission_id,
report.Timestamp as reportTimestamp,
report.Latitude as reportLat,
report.Longitude as reportLng,
FROM Requests r
LEFT JOIN PositionReports report ON r.id = report.RequestId

---------------- UPDATE -------------

Guided by this MySQL JOIN the most recent row only? I managed to fetch the most recent one, but then a new problem is that I only get retrieved such missions that actually has position reports.

    WHERE report.Id = (
        SELECT MAX(Id)
        FROM PositionReports
        WHERE RequestId = r.Id
    )";

Is there a way to retrieve all such that don't have an entry in the Reports table as well? Tried outer joins but no difference...

It is a MySQL database (testing environment is MariaDB but production is MySQL (DigitalOcean).

When you suggest pre-aggregating, does that mean that what I aim to accomplish is not possible in 1 SQL query alone?

Sample data -----------------------

So let's say the RequestsTable contains two entries, id's 178 and 179.

id otherdata
178 lorem....
179 ipsum....

Only one of these, 179, has matching rows in the positionReports table.

The PositionReports table contains

id requestId Timestamp Latitude Longitude
2 179 123456700 56.5 11.9
1 179 123456789 57.0 12.0

Desired output (the query should retrieve both). Ideally pick the positionReport with the highest value in Timestamp, but sorting on Id would work too for this implementation as I force chronology in accepting the reports.

requestId Timestamp Latitude Longitude
178
179 123456789 57.0 12.0
GMB
  • 216,147
  • 25
  • 84
  • 135
Matt Welander
  • 8,234
  • 24
  • 88
  • 138
  • 2
    You need to pre-aggregate or filter the rows before joining, probably in a lateral join, assuming your RDBMS supports it. – Stu Jul 04 '23 at 18:39
  • What determines 'most recent' - highest ID in PositionReports or latest date? Sample data and desired output as formatted text table(s) can be helpful. Most likely need a nested subquery. – June7 Jul 04 '23 at 19:05

3 Answers3

1

I think I made it at least work thanks to this Return all data from 1st Table, and only 1 data from 2nd Table if exist

I'm not sure how efficient (or should I say resource-hogging) this solution is.

        SELECT
        r.id as id,
        reports.Timestamp as reportTimestamp,
        reports.Latitude as reportLat,
        reports.Longitude as reportLng,
        FROM Requests r
        LEFT JOIN 
        
        (SELECT 
          c.* 
        FROM
          PositionReports c 
          LEFT JOIN PositionReports d 
            ON c.RequestId = d.RequestId 
            AND c.Timestamp < d.Timestamp 
        WHERE d.RequestId IS NULL) reports
        
        ON r.id = reports.RequestId 
Matt Welander
  • 8,234
  • 24
  • 88
  • 138
1

With no claims about resource consumption - an alternative approach, using the RANK window function:

SELECT
  r.id as id,
  reports.Timestamp as reportTimestamp,
  reports.Latitude as reportLat,
  reports.Longitude as reportLng
FROM RequestsTable r
LEFT JOIN (
  SELECT
    *,
    RANK() OVER (PARTITION BY RequestId ORDER BY Timestamp DESC) as rnk
  FROM PositionReports
  ) reports
  on reports.RequestId = r.id
  and reports.rnk = 1
;

See it in action: DB Fiddle.

Please comment, if and as this requires adjustment / further detail.

Abecee
  • 2,365
  • 2
  • 12
  • 20
0

Is there a way to retrieve all such that don't have an entry in the Reports table as well?

The problem with your code is that you are using the where clause for filtering - so this evicts records that did not match in the left join. You can just move the filtering to the on clause of the join, as in:

select
    r.id mission_id,
    pr.Timestamp as reportTimestamp,
    pr.Latitude as reportLat,
    pr.Longitude as reportLng
from Requests r
left join PositionReports pr 
    on  pr.RequestId = r.id
    and pr.id = ( select max(pr1.id) from PositionReports pr1 where pr1.RequestId = r.Id)

Now we can see that the first join condition seems redondant, since the second condition covers it; this should be good enough:

select
    r.id mission_id,
    pr.Timestamp as reportTimestamp,
    pr.Latitude as reportLat,
    pr.Longitude as reportLng
from Requests r
left join PositionReports pr 
    on pr.id = ( select max(pr1.id) from PositionReports pr1 where pr1.RequestId = r.Id)

Finally: if you are running MySQL 8.0.13 or higher, this looks like a very good spot for a lateral join:

select r.id mission_id,
    pr.Timestamp as reportTimestamp,
    pr.Latitude as reportLat,
    pr.Longitude as reportLng
from Requests r
left join lateral (
    select pr.*
    from PositionReports pr 
    where pr.RequestId = r.Id
    order by id desc limit 1
) pr on 1

The subquery correlates with the outer query, and directly returns the record you are looking for; with an index on PositionReports(RequestId, id desc), this should be an efficient solution.

GMB
  • 216,147
  • 25
  • 84
  • 135