The real question is: Do these records have a one-to-one relationship or a one-to-many relationship?
TLDR Answer:
If one-to-one, use a JOIN
statement.
If one-to-many, use one (or many) SELECT
statements with server-side code optimization.
Why and How To Use SELECT for Optimization
SELECT
'ing (with multiple queries instead of joins) on large group of records based on a one-to-many relationship produces an optimal efficiency, as JOIN
'ing has an exponential memory leak issue. Grab all of the data, then use a server-side scripting language to sort it out:
SELECT * FROM Address WHERE Personid IN(1,2,3);
Results:
Address.id : 1 // First person and their address
Address.Personid : 1
Address.City : "Boston"
Address.id : 2 // First person's second address
Address.Personid : 1
Address.City : "New York"
Address.id : 3 // Second person's address
Address.Personid : 2
Address.City : "Barcelona"
Here, I am getting all of the records, in one select statement. This is better than JOIN
, which would be getting a small group of these records, one at a time, as a sub-component of another query. Then I parse it with server-side code that looks something like...
<?php
foreach($addresses as $address) {
$persons[$address['Personid']]->Address[] = $address;
}
?>
When Not To Use JOIN for Optimization
JOIN
'ing a large group of records based on a one-to-one relationship with one single record produces an optimal efficiency compared to multiple SELECT
statements, one after the other, which simply get the next record type.
But JOIN
is inefficient when getting records with a one-to-many relationship.
Example: The database Blogs has 3 tables of interest, Blogpost, Tag, and Comment.
SELECT * from BlogPost
LEFT JOIN Tag ON Tag.BlogPostid = BlogPost.id
LEFT JOIN Comment ON Comment.BlogPostid = BlogPost.id;
If there is 1 blogpost, 2 tags, and 2 comments, you will get results like:
Row1: tag1, comment1,
Row2: tag1, comment2,
Row3: tag2, comment1,
Row4: tag2, comment2,
Notice how each record is duplicated. Okay, so, 2 comments and 2 tags is 4 rows. What if we have 4 comments and 4 tags? You don't get 8 rows -- you get 16 rows:
Row1: tag1, comment1,
Row2: tag1, comment2,
Row3: tag1, comment3,
Row4: tag1, comment4,
Row5: tag2, comment1,
Row6: tag2, comment2,
Row7: tag2, comment3,
Row8: tag2, comment4,
Row9: tag3, comment1,
Row10: tag3, comment2,
Row11: tag3, comment3,
Row12: tag3, comment4,
Row13: tag4, comment1,
Row14: tag4, comment2,
Row15: tag4, comment3,
Row16: tag4, comment4,
Add more tables, more records, etc., and the problem will quickly inflate to hundreds of rows that are all full of mostly redundant data.
What do these duplicates cost you? Memory (in the SQL server and the code that tries to remove the duplicates) and networking resources (between SQL server and your code server).
Source: https://dev.mysql.com/doc/refman/8.0/en/nested-join-optimization.html ; https://dev.mysql.com/doc/workbench/en/wb-relationship-tools.html