0

I have tables called Movie, Genre and Keyword, from which I have created a view called 'genkeyword'. The view 'genkeyword' has a lot of tuples, so it can be accessed at DB Fiddle.

The following query calculates the frequency of genres that the movies in genkeyword have in common with Harry Potter and the frequency of keywords that the movies in genkeyword have in common with Harry Potter. It then combines the results of the two queries using a full outer join.

SELECT
    *
FROM
    (
        SELECT
            title,
            year,
            count(distinct genre) as genre_freq
        FROM
            genkeyword
        where
            (
                genre in (
                    select
                        genre
                    from
                        genkeyword
                    where
                        title = 'Harry Potter and the 
Deathly Hallows'
                )
            )
            AND title <> 'Harry Potter and the Deathly Hallows'
        group by
            title,
            year
    ) a 
    FULL OUTER JOIN (
        select
            title,
            year,
            count(distinct keyword) as keyword_freq
        FROM
            genkeyword
        where
            keyword in (
                select
                    keyword
                from
                    genkeyword
                where
                    title = 'Harry Potter and 
the Deathly Hallows'
            )
            and title <> 'Harry Potter and the Deathly Hallows'
        group by
            title,
            year
    ) b ON b.title = a.title;

The output of the above query is the following:

title               |   year    |   genre_freq   |    title    |    year    | keyword_freq

Cinderella              2015         2              Cinderella      2015        2   

Enchanted               2007         1              Enchanted       2007        3

How to train your dragon  2010       2               null           null       null
The Shape of Water        2017       2        The Shape of Water    2017       1       

I know that for two result sets A and B, a full outer join outputs the rows that match in A and B, as well as the rows in A that don't have matching rows in B and the rows in B that don't have matching rows in A.

However, in the third row of the output (where the movie is 'How to train your dragon'), why are there null values for the title and year attribute? I know that the for the keyword_freq the value would be null because the movie does not have any keywords in common with Harry Potter, but wouldn't the title and year attributes have the values 'How to train your dragon' and 2010 respectively?

Any insights are appreciated.

Michael Buen
  • 38,643
  • 9
  • 94
  • 118
ceno980
  • 2,003
  • 1
  • 19
  • 37
  • Please read the duplicate link which explains the various types of joins using smaller, easier to digest, data sets. Then, apply that learning to your current question. – Tim Biegeleisen Apr 20 '19 at 09:37
  • [Joins](https://external-preview.redd.it/M5QHWsp2vgZ-3QDZ4m-qS58lsOUgDNHau8trSFzS8H0.jpg?width=966&auto=webp&s=8b65d4a6bade772e7632325cfd47dbe1e083960f) – Georgy Apr 20 '19 at 12:23

0 Answers0