0

There are some similar questions, but none of them matches my case.

SQL Optimization - Join different tables based on column value

How to JOIN on different tables based on column value

MySQL query to JOIN tables based on column values

MySQL: Use CASE/ELSE value as join parameter

MySQL query where JOIN depends on CASE

https://dba.stackexchange.com/questions/53301/mysql-getting-result-using-3-tables-and-case-statements

I have notifications table with this structure

CREATE TABLE `notifications` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `notificaiton_type_id` int(11) DEFAULT NULL,
  `table1_id` int(11) DEFAULT NULL,
  `table2_id` int(11) DEFAULT NULL,
  `table3_id` int(11) DEFAULT NULL,
  `table4_id` int(11) DEFAULT NULL,
  `table5_id` int(11) DEFAULT NULL,
  `user_id` int(11) DEFAULT NULL,
  `created` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `userIdIndex` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=17 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

and 5 tables, from table1 to table5, with these structure (others are the same: I set this for testing, not sure if it matters, but those tables (1 to 5) in addition to posted fields have other fields as well, just they do not participate in the query, so for simplicity I just skipped them)

CREATE TABLE `table1` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(300) COLLATE utf8_bin DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=34 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

table*_id is foreign key for tables: table1 - table5 with one to many relationship.

I should select notifications based on user_id. Based on notification type, appropriate table*_id has some value, other foreign_keys are null(btw there are notification types that 2 or even 3 table *_id's can be different from null ). The initial thought was to have a query that would join only those tables, if the foreign key has some value different from null via using CASE, WHEN, but as I learnt from the answer of this question,

MySQL query where JOIN depends on CASE

it can not be used in this case.

Tables table1-table5 are going to be relatively big, having kinda millions or dozens of millions records. So I would not prefer to join extra 2-4 tables if foreign keys are null. Also, I do not think it is any better to separate the query into 2 main parts, like - first getting the notifications and then in a loop find associated tables' values.

So, the point is to only join those tables that table*_id is not null if it can be done in mysql.

The main question is what would be the most efficient way to achieve this - get notification info with its related tables data.

general query with joins to all tables is a usual left join, smth like this

EXPLAIN SELECT 
n.`id`,
n.`user_id`,
n.`table1_id`,
n.`table2_id`,
n.`table3_id`,
n.`table4_id`,
n.`table5_id`
//  other fields
    FROM
      notifications AS n 
      LEFT JOIN table1 AS t1 
        ON t1.`id` = n.`table1_id` 
      LEFT JOIN table2 AS t2
        ON t2.`id` = n.`table2_id`
      LEFT JOIN table3 AS t3
        ON t3.`id` = n.`table3_id`
      LEFT JOIN table4 AS t4
        ON t4.`id` = n.`table4_id` 
      LEFT JOIN table5 AS t5 
        ON t5.`id` = n.`table5_id` 
    WHERE user_id = 5

here is sql fiddle with data http://sqlfiddle.com/#!2/3bf8f/1/0

Thanks

Community
  • 1
  • 1
dav
  • 8,931
  • 15
  • 76
  • 140

3 Answers3

1

I think you are worrying over nothing. MySQL will handle your query, as it is, without any more effort from you.

You state:

I would not prefer to join extra 2-4 tables if foreign keys are null.

Good news: MySQL won't.

It will see that the key is null in the notifications table, see that there are no records in the corresponding table you are joining to, and then just move on. I'm not even sure what you imagine it may be trying to do that you are trying to optimize away, but your query is already optimized as it is.

If you are already running this query and have performance problems, you issue is likely elsewhere. Please provide more information in that case. In particular, your // other fields line may actually affect things more than you think, depending on where those other fields are located.

Willem Renzema
  • 5,177
  • 1
  • 17
  • 24
  • in fact I thought as well that maybe I am overthinking ))), but the idea that today doing smth wrong, tmrrow can bring to performance issues, makes me think twice. Initially I was thinking that mysql will not try to join the table because the foreign key is null, but it made me think about it as the explain statement shows that join happens, even though the number of scanned rows is 1. Actually I just benchmarked with 1.7 mln rows (cant go more) (for each 1-5 tables) and exec time was ~ 0.02 sec, so I guess after all ur right. thanks – dav Jan 08 '15 at 15:38
  • the other fields im sure is not a big deal, it can be as simple data as, name - varchar 200-300, status - tinyint, date - datetime. – dav Jan 08 '15 at 15:40
0

Why not use VIEW for this Left join query?

Here's something more about View's performance: Is a view faster than a simple query?

Assuming that your query works fine, you could create view from it:

CREATE VIEW view_myView AS 
SELECT 
n.`id`,
n.`user_id`,
n.`table1_id`,
n.`table2_id`,
n.`table3_id`,
n.`table4_id`,
n.`table5_id`
    FROM
      notifications AS n 
      LEFT JOIN table1 AS t1 
        ON t1.`id` = n.`table1_id` 
      LEFT JOIN table2 AS t2
        ON t2.`id` = n.`table2_id`
      LEFT JOIN table3 AS t3
        ON t3.`id` = n.`table3_id`
      LEFT JOIN table4 AS t4
        ON t4.`id` = n.`table4_id` 
      LEFT JOIN table5 AS t5 
        ON t5.`id` = n.`table5_id` 
    WHERE user_id = 5

Then you access the data from this view simply by:

SELECT * FROM view_myView;

and it should be faster than calling the query everytime.

It's also much shorter to write as you see.

Community
  • 1
  • 1
Mike W
  • 308
  • 3
  • 5
  • could u please bring a little example-hint how that could be used in this case? tks – dav Jan 08 '15 at 13:28
0

Would it not make more sense to use a single ID as the foreign key then a column for which table to query:

CREATE TABLE `notifications` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `notification_type_id` int(11) DEFAULT NULL,
  `table_id` int(11) DEFAULT NULL,
  `table_name` VARCHAR(10) DEFAULT NULL
...

Then you can select which table to query for the actual data you need.

SELECT `table_id`,`table_name` FROM `notifications`;
SELECT * FROM @table_name WHERE `id`=@table_id;

No expensive LEFT JOINs are necessary in this scenario and two queries (or a compound query as a stored procedure) would negate the need for a large index on the foreign key and so simplify the construct. It also has the advantage of being scalable, for example what if you needed a 6th, 7th or 100th partition table?

Dave
  • 386
  • 2
  • 8
  • thanks for the answer, this can work, just I do not like the idea of 2 queries, because if I have to, then perhaps there is no need to save the table's name in the db, after making the first query, based on notification_type_id I can already know that tables should I join. Unless if this can be done in mysql query itself ? – dav Jan 08 '15 at 13:05