0

I have a question about the difference between != and NOT IN in MySQL environment. The original question is as following:

Table: Friendship

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user1_id      | int     |
| user2_id      | int     |
+---------------+---------+

(user1_id, user2_id) is the primary key for this table. Each row of this table indicates that there is a friendship relation between user1_id and user2_id.

Table: Likes

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| page_id     | int     |
+-------------+---------+

(user_id, page_id) is the primary key for this table. Each row of this table indicates that user_id likes page_id.

Write an SQL query to recommend pages to the user with user_id = 1 using the pages that your friends liked. It should not recommend pages you already liked.

Return result table in any order without duplicates.

The query result format is in the following example:

Friendship table:

+----------+----------+
| user1_id | user2_id |
+----------+----------+
| 1        | 2        |
| 1        | 3        |
| 1        | 4        |
| 2        | 3        |
| 2        | 4        |
| 2        | 5        |
| 6        | 1        |
+----------+----------+

Likes table:

+---------+---------+
| user_id | page_id |
+---------+---------+
| 1       | 88      |
| 2       | 23      |
| 3       | 24      |
| 4       | 56      |
| 5       | 11      |
| 6       | 33      |
| 2       | 77      |
| 3       | 77      |
| 6       | 88      |
+---------+---------+

Result table:

+------------------+
| recommended_page |
+------------------+
| 23               |
| 24               |
| 56               |
| 33               |
| 77               |
+------------------+

User one is friend with users 2, 3, 4 and 6. Suggested pages are 23 from user 2, 24 from user 3, 56 from user 3 and 33 from user 6. Page 77 is suggested from both user 2 and user 3. Page 88 is not suggested because user 1 already likes it.

And my approach is:

# Write your MySQL query statement below
select distinct
page_id as 'recommended_page'
from likes 
where user_id in (
    (select 
    user2_id as user_id 
    from friendship 
    where user1_id = 1) 
    union 
    (select 
    user1_id as user_id 
    from friendship 
    where user2_id = 1) 
) and page_id <> (
    select 
    page_id 
    from likes 
    where user_id = 1
)

But I will receive NULL as the result for the following testing case:

{"headers":{"Friendship":["user1_id","user2_id"],
"Likes":["user_id","page_id"]},
"rows":{"Friendship":[[1,3],[1,5],[1,6],[2,3],[3,5],[3,9],[4,6],[5,9],[8,9]],
"Likes":[[6,13],[8,10],[9,14]]}}

If I switch to IN clause, I can obtain the correct results. I am curious about the difference between these two approaches.

Thank you for your help.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
Qiang Super
  • 323
  • 1
  • 11

2 Answers2

0

we can not pass the multiple values in != for ex:

The following script gets all the applications whose application_id is not eeff906835c9bd8f431c33c9b3f5ec6d.

  select * from application where application_id  !='eeff906835c9bd8f431c33c9b3f5ec6d';

NOT IN we can pass the multiple values during filter for ex:

select * from application where application_id  not in ( 'eeff906835c9bd8f431c33c9b3f5ec6d','196ec1876359b2bf0640918648c3c8355');
Vipin Pandey
  • 659
  • 8
  • 17
0

You can try the below - wherein I am just rearranging the column a little bit before performing the actual logic

with crt as(
select
case when
id_2 = id_col then id_q
else id_2 end as id_q_final, id_col
from
(select *,
case when
id_q > id_2 then id_q
else
id_2
end as id_col
from friendship) T)
select distinct(page_id) from
likes
inner join
crt on 
crt.id_col = likes.id
where crt.id_q_final = 1 and page_id not in (select page_id from likes where id=1);

The test Case will result 13

dsk
  • 1,863
  • 2
  • 10
  • 13
  • Thank you @dsk. I would like to check if your code will still perform well once you replace the 'not in' with '!='. – Qiang Super Jul 14 '20 at 18:21
  • Few reference for you https://stackoverflow.com/questions/33825917/which-operator-is-faster-or and https://stackoverflow.com/questions/43817212/what-is-the-difference-between-not-and-operators-in-sql – dsk Jul 15 '20 at 07:17