3

Suppose you had two tables. One called MOVIES:

  • MovieId
  • MovieName

Then another called ACTORS that contains people who appear in those movies:

  • MovieId
  • ActorName

Now, I want to write a query that returns any movie that contains ONE OR MORE of the following actors: "Tom Hanks", "Russell Crowe" or "Arnold Schwarzenegger".

One way to do it would be something like:

SELECT DISTINCT A.MovieId, M.MovieName FROM ACTORS A
INNER JOIN MOVIES M USING (MovieId)
WHERE A.ActorName IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger');

Which is perfectly fine, however in my case I might have several more of these conditions on the WHERE clause so I want to find a way to make the MOVIES table the primary table I select from.

What's the best way to query for this? I'm using Oracle 11g if that matters, but I'm hoping for a standard SQL method.

Mike Christensen
  • 88,082
  • 50
  • 208
  • 326

2 Answers2

13

You can use EXISTS or IN subqueries:

SELECT *
FROM MOVIES m
WHERE EXISTS
(
    SELECT *
    FROM ACTORS a
    WHERE a.MovieId = m.MovieId
    AND a.ActorName IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger')
)

or

SELECT *
FROM MOVIES m
WHERE m.MovieId IN 
(
    SELECT a.MovieId
    FROM ACTORS a
    WHERE a.ActorName IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger')
)
Bogdan Sahlean
  • 19,233
  • 3
  • 42
  • 57
2

First you should have a 3rd table implementing the n:m relationship:

CREATE TABLE movie (
 movie_id int primary key
,moviename text
-- more fields
);

CREATE TABLE actor (
 actor_id int primary key
,actorname text
-- more fields
);

CREATE TABLE movieactor (
 movie_id int references movie(movie_id)
,actor_id int references actor(actor_id)
,CONSTRAINT movieactor_pkey PRIMARY KEY (movie_id, actor_id)
);

Then you select like this:

SELECT DISTINCT m.movie_id, m.moviename
  FROM movie m 
  JOIN movieactor ma USING (movie_id)
  JOIN actor a USING (actor_id)
 WHERE a.actorname IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger');

Note, that text literals are enclose in single quotes!

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    `DISTINCT` is costly. It's better to avoid `DISTINCT` if it's possible. – Bogdan Sahlean Sep 26 '11 at 21:20
  • Here you can chose between `DISTINCT`, `GROUP BY` and `WHERE m.MovieId IN (subselect)`. Run a test to see which is faster. I just ran a test on PostgreSQL v9.0 (not Oracle!) with a couple of thousand rows and all three variants took about the same time. It very much depends on the details. Maybe Bogdan speaks from experience with SQL-Server? Each RDBMS performs differently. – Erwin Brandstetter Oct 02 '11 at 19:38
  • Each RDBMS is different. I agree. Some RDBMS can have a better QueryOptimizer than others. Or not. See item 4 from [here](http://stackoverflow.com/questions/621884/database-development-mistakes-made-by-application-developers/621891#621891). – Bogdan Sahlean Oct 18 '11 at 07:16