1

I have three tables, which looks like this

forts

|id|lat|lon|

fort_sightings

|id|fort_id|team|

fort_raids

|id|fort_id|raid_level|

I need a query that fetches all the rows from forts and then select the latest information from fort_sightings and fort_raids, if any. There might be several rows where fort_id has the same value, so I need the latest information.

Currently, I have this, which might not be the prettiest

SELECT
    *
FROM
    forts c

LEFT JOIN fort_sightings o ON o.id = (
    SELECT
        id
    FROM
        fort_sightings
    WHERE
        fort_id = c.id
    ORDER BY
        id DESC
    LIMIT 1
)

LEFT JOIN fort_raids r ON r.id = (
    SELECT
        id
    FROM
        fort_raids
    WHERE
        fort_id = c.id
    ORDER BY
        id DESC
    LIMIT 1
)

But it's painfully slow, the query takes over 10 seconds. There's only ~350 rows in forts, so it really shouldn't take this long. I believe it's from all the SELECT queries in the JOIN, but I don't know any alternative.

EXPLAIN explain query

kopa
  • 341
  • 2
  • 13
  • Every question that deals with slow queries contains the output of `EXPLAIN`. Please, take a look around to see how to get output of `EXPLAIN` and post it here. – Mjh Jun 26 '17 at 15:10
  • Sub queries are always slow, and in this case, you're doing a sub query for each o.id and r.id if I'm not mistaken. Always find a way to avoid sub queries. – Goose Jun 26 '17 at 15:11
  • in order to find why it so slow, use `explain` followed by this query (in the mysql command line), that will tell you which indexes are used and how much record are read. you can either change your code to use indexes or add index if it really needed – Tamar Jun 26 '17 at 15:11
  • I've included an image of `EXPLAIN`. Didn't know about that one. At first `fort_raids` didn't use an index, I added one but it didn't seem to help. – kopa Jun 26 '17 at 15:16
  • 1
    If it was me, I'd start here - but then I suppose I would say that, wouldn't I? [Why should I provide an MCVE for what seems to me to be a very simple SQL query?](https://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query) – Strawberry Jun 26 '17 at 15:29

3 Answers3

0

This is your query:

SELECT *
FROM forts c LEFT JOIN
     fort_sightings o
     ON o.id = (SELECT fs2.id
                FROM fort_sightings fs2
                WHERE fs.fort_id = c.id
                ORDER BY fs2.id DESC
                LIMIT 1
               ) LEFT JOIN 
     fort_raids r
     ON r.id = (SELECT fr2.id
                FROM fort_raids fr2
                WHERE fr2.fort_id = c.id
                ORDER BY fr2.id DESC
                LIMIT 1
               );

I find it strangely structured. But, see if this works:

  • fort_sightings(fort_id, id)
  • fort_raids(fort_id, id)

It is important that the fort_id be the first key in the indexes.

If this doesn't work, then you might need to modify the query.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • The first key? I didn't know that mattered. I'll try switching the keys around and then running your query, thanks! – kopa Jun 26 '17 at 15:17
  • I have two primary keys, `id` and `fort_id` where `id` is using `AUTO INCREMENT`. If I try to switch them, so `fort_id` is key 1, then I get this: `1075 - Incorrect table definition; there can be only one auto column and it must be defined as a key` – kopa Jun 26 '17 at 15:26
  • You don't want two primary keys. You want the composite indexes that I describe in the answer. – Gordon Linoff Jun 27 '17 at 02:40
0

Aside from the index recommendation from Gordon, you are doing a correlated subquery which means that for every record in the forts table, you are re-running the query to the sightings and raids. I would change slightly to pull all max() per fort from each THEN join... like

SELECT
      c.id,
      c.lat,
      c.lon,
      fs2.team,
      fr2.raid_level
   FROM
      forts c
         LEFT JOIN 
         ( select fs.fort_id, max( fs.id ) as MaxSightID
              from fort_sightings fs
              group by fs.fort_id ) S
           on c.id = s.fort_id
           LEFT JOIN fort_sightings fs2
              on s.MaxSightID = fs2.id
        LEFT JOIN
        ( select fr.fort_id, max( fr.id ) as MaxRaidID
             from fort_raids fs
             group by fr.fort_id ) R
          on c.ID = r.fort_id
          LEFT JOIN fort_raids fr2
             on r.MaxRaidID = fr2.id

The second part of the left-joins goes back to the original raid and sighting tables to pull the corresponding team and raid level in the final results if any such are found.

DRapp
  • 47,638
  • 12
  • 72
  • 142
0

If you need the most efficient query, you should add columns latest_fort_sighting_id latest_fort_raid_id to table forts. MySQL does not have powerful features such as Materialized Views or Hash Joins like PostgreSQL, we need to handle them manually. Do not forget using transaction for updates.

If you limit range of forts, alternatively, you can run optimized query only using LEFT JOIN.

select - SQL join: selecting the last records in a one-to-many relationship - Stack Overflow

SELECT forts.*, fs1.team, fr1.raid_level FROM forts
LEFT JOIN fort_sightings fs1 ON fs1.fort_id = forts.id
LEFT JOIN fort_sightings fs2 ON fs2.fort_id = forts.id AND fs1.id < fs2.id
LEFT JOIN fort_raids fr1 ON fr1.fort_id = forts.id
LEFT JOIN fort_raids fr2 ON fr2.fort_id = forts.id AND fr1.id < fr2.id
WHERE fs2.id IS NULL AND fr2.id IS NULL AND forts.id > 5 ORDER BY forts.id LIMIT 5;

enter image description here

mpyw
  • 5,526
  • 4
  • 30
  • 36