Here's a query I've written:
SELECT DISTINCT m.*, sm.*, s.*
FROM table_1 m
LEFT JOIN table_2 sm ON m.master_id = sm.master_id
INNER JOIN (
SELECT s1.*, rn.field1, d.field2, m.field3, dt.field4, gl.field5
FROM table_3 s1
LEFT JOIN table_4 rn ON s1.secondary_id = rn.secondary_id
LEFT JOIN table_5 d ON s1.trait_id = d.trait_id
LEFT JOIN table_6 m ON s1.mix_id = m.mix_id
LEFT JOIN table_7 dt ON s1.debit_id = dt.debit_id
LEFT JOIN table_8 gl ON s1.graph_id = gl.graph_id
WHERE s1.secondary_id = 26
AND s1.dimension_id = 24
AND s1.mix_id = 43
) s ON sm.spec_id = s.spec_id
WHERE m.master_id = 1
I'm testing it against a VERY small table (only ~3000-5000 records per table) and getting acceptable results (4.8 ms), from within a VM on my laptop.
My question/concern is what about when the tables get larger, so that the entire DB can't reside in memory?
Obviously, all of the *_id columns have indices on them (whether it's the primary key column or foreign key in the s1 table. I also have a multi column index on s1.secondary_id, s1.dimension_id, s1.mix_id
.
Is this adequate, or would anyone that's more versed than me know if I should employ an different indexes, or use a different strategy altogether, for this sort of query (Query against Table1, with a join to Table2, and another join against a subquery) - it's the last part thats giving me pause about scalability.
Any thoughts are appreciated.
PS - EXPLAIN
states I might be using a temporary table (i assume from the join), but beyond that I"m not sure what I'm looking at, with the types being const
, ref
, eq_ref
, etc.
Again, it works fine right now with a very small amount of test data. I just don't want it to grind to a halt once it gets production levels of data.
Thanks