2

a seemingly generic SQL query really left me clueless. Here's the case. I have 3 generic tables (simplified versions here):

Movie

id | title
-----------------------
1  | Evil Dead
-----------------------
2  | Bohemian Rhapsody
....

Genre

id | title
-----------------------
1  | Horror
-----------------------
2  | Comedy
....


Rating

id | title
-----------------------
1  | PG-13
-----------------------
2  | R
....

And 2 many-to-many tables to connect them:

Movie_Genre

movie_id | genre_id


Movie_Rating

movie_id | rating_id

The initial challenge was to write a query which allows me to fetch movies that belong to multiple genres (e.g. horror comedies or sci-fi action).

Thankfully, I was able to find this solution here MySQL: Select records where joined table matches ALL values

However, what would be the correct option to fetch records that belong to multiple many-to-many tables? E.g. rated R horror comedies. Is there any way to do so without subquery (or a single one only)?

GMB
  • 216,147
  • 25
  • 84
  • 135
Denis
  • 322
  • 1
  • 4
  • 15

2 Answers2

2

One method uses correlated subqueries:

select m.*
from movies m
where (select count(*)
       from movie_genre mg
       where mg.movie_id = m.id
      ) > 1 and
      (select count(*)
       from movie_rating mr
       where mr.movie_id = m.id
      ) > 1 ;

With indexes on movie_genre(movie_id) and movie_rating(movie_id) this probably has quite reasonable performance.

The above is possibly the most efficient method. However, if you wanted to avoid subqueries, one method would be:

select mg.movie_id
from movie_genres mg join
     movie_ratings mr
     on mg.movie_id = mr.movie_id
group by mg.movie_id
having count(distinct mg.genre_id) > 0 and
       count(distinct mr.genre_id) > 0;

More efficient than the above is aggregating before the join:

select mg.movie_id
from (select movie_id
      from mg_genres
      group by movie_id
      having count(*) >= 2
     ) mg join
     (select movie_id
      from mg_ratings
      group by movie_id
      having count(*) >= 2
     ) mr
     on mg.movie_id = mr.movie_id;

Although you state that you want to avoid subqueries, the irony is that the version with no subqueries probably has the worst performance of these three options.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
2

E.g. rated R horror comedies

You can join all the tables together, aggregate by movie and filter with a HAVING clause:

select m.id, m.title
from movies m
inner join movie_genre  mg on mg.movid_id = m.id
inner join genre g on g.id = mg.genre_id
inner join movie_rating mr on mr.movie_id = m.id
inner join rating r on r.id = mr.rating_id
group by m.id, m.title
having 
    max(r.title = 'R') = 1 
    and max(g.title = 'Horror') = 1
    and max(g.title = 'Comedy') = 1

You can also use a couple of exists conditions along with correlated subqueries:

select m.*
from movie m
where
    exists (
        select 1 
        from movie_genre mg
        inner join genre g on g.id = mg.genre_id
        where mg.movie_id = m.id and g.title = 'R')
    and exists (
        select 1
        from movie_rating mr
        inner join rating r on r.id = mr.rating_id
        where mr.movie_id = m.id and r.title = 'Horror'
    )
    and exists (
        select 1
        from movie_rating mr
        inner join rating r on r.id = mr.rating_id
        where mr.movie_id = m.id and r.title = 'Comedy'
    )
GMB
  • 216,147
  • 25
  • 84
  • 135