1

I have a simple query that uses a subquery:

SELECT pictures.*
FROM pictures
WHERE pictures.user_id IN
    (SELECT follows.following_id
     FROM follows
     WHERE follows.follower_id = 9)
ORDER BY created_at DESC LIMIT 5;

I am wondering, a) How can I remove the sub query and use JOINS instead and b) will there be a performance benefit in using JOINS instead of sub query?

(follows.following_id, follows.follower_id, pictures.user_id are all indexed)

Thanks

0xSina
  • 20,973
  • 34
  • 136
  • 253
  • 2
    Why do you want to remove the sub-select? A query using a join returns something completely different than a sub-select and you will need to use e.g. `distinct` as in JW's answer to reduce the result of a JOIN to the result from the query with a sub-select. I doubt that this will be faster. –  Jun 01 '13 at 20:56
  • [The sub-select makes clear what you like to express](http://stackoverflow.com/a/2577188/237483). To answer b), you have to test both versions with your dataset. – Christian Ammer Jun 01 '13 at 21:13
  • 1
    @a_horse_with_no_name i used INNER JOIN from jw's answers. It gives me result in 0.5 - 1ms, while sub-query gives me results between 4.5 - 7.5ms. – 0xSina Jun 02 '13 at 08:06
  • 1
    @0xSina: the time is irrelevant if you don't get the correct results. I just wanted to point out that you need to be careful. You cannot always replace a sub-query with a join. –  Jun 02 '13 at 08:18
  • @a_horse_with_no_name ah thanks for pointing that out then! i need to learn more about joins. Can you tell me why distinct is needed? Is it because there might be multiple rows returned that are the exact same with a joins? – 0xSina Jun 02 '13 at 10:00
  • @0xSina: exactly. The join will return multiple rows depending on the number of rows in the joined table. The sub-query will never return more rows than present in the "base" table. Remove the distinct and you'll see the difference (btw: this re-writing for performance is not necessary in other DBMS. MySQL's query optimizer is known to be horrible when dealing with sub-selects) –  Jun 02 '13 at 10:06

1 Answers1

4
SELECT  DISTINCT pictures.*
FROM    pictures
        INNER JOIN follows
            ON pictures.user_ID = follows.following_id
WHERE   follows.follower_id = 9
ORDER   BY pictures.created_at DESC 
LIMIT   5

To further gain more knowledge about joins, kindly visit the link below:

UPDATE

Another way to achieve the same result is by using EXISTS

SELECT  *
FROM    pictures
WHERE   EXISTS
        (
            SELECT  1
            FROM    follows
            WHERE   pictures.user_ID = follows.following_id AND
                    follows.follower_id = 9
        )
ORDER   BY pictures.created_at DESC 
LIMIT   5
John Woo
  • 258,903
  • 69
  • 498
  • 492
  • Thanks! Why INNER JOIN? Why not OUTER LEFT JOIN? – 0xSina Jun 01 '13 at 17:50
  • 1
    `LEFT JOIN` will return all records from table `picture` all non matching records will have `null` values on the column on table `follows`. But since you have filtered based on the columns in table `follows`, the result will be the same in `INNER JOIN` because it will return only records that matches with the condition. `NULL` records are dropped. Try executing `LEFT JOIN` and you will have the same result. – John Woo Jun 01 '13 at 17:54