0

I'm managing a DB and I'd like to know if there is a cooler way to make this query.

My actual query:

SELECT * FROM mytable 
LEFT JOIN table1 AS m1 ON mytable.idA=m1.K 
LEFT JOIN table1 AS m2 ON mytable.idB=m2.K 
LEFT JOIN table1 AS m3 ON mytable.idC=m3.K 
LEFT JOIN table1 AS m4 ON mytable.idD=m4.K 
LEFT OUTER JOIN table2 AS t1 ON mytable.idK1= t1.idK1 AND mytable.idK2= t1.idK2
LEFT OUTER JOIN table3 ON t1.idT = table3.idT
WHERE m1.K = "my_value" OR m2.K = "my_value" OR m3.K = "my_value" OR m4.K = "my_value"

I have a query that make 4 joins on the same other table because I can have the value in the field idA, idB, idC or idD. After that I make other 2 joins with other tables.

My question is:
Can I improve this code? Is there a smarter way to do it or a better logic to follow? Thanks a lot, even for the theoretical explanations.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
emish89
  • 696
  • 8
  • 25
  • have you proper index on the tables? – ScaisEdge Jun 15 '16 at 19:23
  • I would (1) make sure to index idA etc. and (2) do one join with `ON idA=K OR idB=K OR idC=K OR idD=K`, which is not quite the same login but may do what you want. Think about it. – Andrew Lazarus Jun 15 '16 at 19:24
  • 2
    and call column by names. * is too expensive – NinjaDeveloper Jun 15 '16 at 19:24
  • What are you doing with the values from `idA`,`idB`,`idC`,`idD`? – Hart CO Jun 15 '16 at 19:28
  • This looks like you could have a second table and just a three table join without that many conditions. You could have table1 and a table with two references to table1, then make it a 1-many relation. Could save your db from iterating through the same table recursively many times. – Felype Jun 15 '16 at 19:30
  • value are not indexed. It's better to index them? I don't know the pros and cons. – emish89 Jun 15 '16 at 19:51
  • The call is by columns, I put "*" to short the code :) – emish89 Jun 15 '16 at 19:52
  • What do you think is the difference between `LEFT JOIN` and `LEFT OUTER JOIN`? Why are you using both terms when they mean the same? Consistency is good — use the same term throughout a single query. (Answer: they are two spellings for the same operation.) – Jonathan Leffler Jun 15 '16 at 19:52
  • 4
    You are killing your `LEFT JOIN`s by transforming them into `INNER JOIN`s via your `WHERE` clause. This has nothing to do with your performance, but your query is *not* doing what you think it's doing. – Siyual Jun 15 '16 at 19:56
  • This is a nice demonstration of why denormalized tables are such a pain to write queries against. The repeating `idX` columns should be in a separate table; then there'd be no problem with simple joining. You wouldn't even need to use left outer joins (by any spelling) for that to work. – Jonathan Leffler Jun 15 '16 at 19:57
  • I find this query and I try to improve it, I didn't wrote it :) – emish89 Jun 15 '16 at 20:04
  • So, if I understand your idea, how can I put idA, idB, idC and idD in another table? – emish89 Jun 15 '16 at 20:06
  • @Siyual : How can I solve the problem? the query now is working but it's slow... why you think it's not doing what he does? – emish89 Jun 15 '16 at 20:08
  • @emish89 `OUTER JOIN`s will return either the row (when found) or a `NULL` row (when not found) from the opposite table (from the right table for a `LEFT JOIN`, and from the left table for a `RIGHT JOIN`). The `WHERE` clause is executed *after* those results are returned - thus, your filter on those right-hand tables will filter the *results* of the `OUTER JOIN`ed table (effectively making it an `INNER JOIN`). To remedy this, you need to include those `WHERE` conditions in the `ON` clauses of the `OUTER JOIN`ed tables. E.g.: `LEFT JOIN table1 AS m4 ON mytable.idD=m4.K OR m4.K = 'my_value'`. – Siyual Jun 15 '16 at 20:13
  • I tried removing 'WHERE' and put 'LEFT JOIN table1 AS m4 ON mytable.idD=m4.K OR m4.K = 'my_value'' for all the 4 left join but it doesn't work...I am wrong? – emish89 Jun 15 '16 at 20:24
  • Are all tables on same database? Performance usually suffers when joining tables on different databases. – Shago Jun 15 '16 at 20:42
  • same db :( performance is variable but at least 1 second for query and on a website it is not acceptable – emish89 Jun 15 '16 at 20:44

2 Answers2

2

Consider which of these columns you can index: mytable.idA, mytable.idB, mytable.idC, mytable.idD, table1.K, table2.idT, table2.idT.

Also consider if you can create an index for table2.idK1 and table2.idK2; and another one for mytable.idK1 and mytable.idK2.

Shago
  • 605
  • 4
  • 12
  • 2
    In the long run an index for only one query is not recommended. Depending on the table size you will end up at least doubling the size of the tables (indexes occupy a lot of disk and memory if you constantly access it). When you have 6 tables the engine has to calculate a 6! decision tree and optimize it. It would help if you share the whole tables structure and a table status to see cardinality. – Maximiliano Rios Jun 15 '16 at 19:38
  • Before recommending any index changes, think them thru. It is irresponsible otherwise. – Drew Jun 15 '16 at 19:51
  • the size of the tables is 250k value for mytable and less than 50k for the others. – emish89 Jun 15 '16 at 19:53
  • @Shago how can I use and index? I don't know pros and cons – emish89 Jun 15 '16 at 20:48
  • @emish89 check out this answer http://stackoverflow.com/a/29842947/2576398 – Shago Jun 15 '16 at 20:50
0

I tried doing this and execution time are really better:

SELECT * FROM (
SELECT * FROM mytable WHERE idA = "my_value" OR idB = "my_value" OR idC = "my_value" OR idD = "my_value"
) as p1
LEFT JOIN table1 AS m1 ON p1.idA=m1.K 
LEFT JOIN table1 AS m2 ON p1.idB=m2.K 
LEFT JOIN table1 AS m3 ON p1.idC=m3.K 
LEFT JOIN table1 AS m4 ON p1.idD=m4.K 
LEFT OUTER JOIN table2 AS t1 ON p1.idK1= t1.idK1 AND p1.idK2=  t1.idK2
LEFT OUTER JOIN table3 ON t1.idT = table3.idT

now the execution time is 0.06 seconds, before it was 1.5 - 2 seconds.


Can I have caused some problems? the query seems to work well

emish89
  • 696
  • 8
  • 25