0

Scenario: Charlie, Amy and Robert have each filled out a survey which asks the user to select all the sports they are interested in. The question is multiple choice. In the table surveyData each row represents one of the answers each user chose. so Charlie(0001) chose basketball, rugby, acrobatics and darts.

I would like to select from the table all surveyDataIds (users) who did not choose exactly basketball (0002) and rugby (0003). I believe what I'm trying to do here is perform a NAND type operation.

Desired result: When querying this table I would expect to return the following surveyDataIds: 0002 and 0004. surveyDataIds would need to be grouped as to not have duplicates. Robert is not returned as he selected basketball (0002).

Here is what I have tried so far, taking advice from the answer in this post SELECT WHERE multiple records don't exist. Unfortunately, it does not work and is returning the wrong results.

select *
FROM surveyData sd
WHERE NOT EXISTS (
    SELECT 1
    FROM surveyData sd2
    WHERE sd.surveyDataId = sd2.surveyDataId AND sd.chosenInterests in (2, 3)
)

0001 = Charlie
0002 = Amy
0003 = Robert
0004 = Lauren

interest options
0 = tennis
1 = football  
2 = basketball
3 = rugby  
4 = snooker  
5 = acrobatics 
6 = bowling 
7 = squash  
8 = cricket 
9 = darts   
10 = javelin

Table name: surveyData

surveyDataId chosenInterests
0001 2
0001 3
0001 5
0001 9
0002 6
0002 7
0002 9
0002 1
0002 4
0002 8
0003 2
0003 7
0004 10
Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
daniel blythe
  • 946
  • 2
  • 16
  • 44
  • I almost all cases, NOT EXISTS wants a correlated subquery. – jarlh Dec 22 '21 at 11:54
  • I should state that I am an SQL novice. Could you please expand on a correlated subquery? EDIT: Ok I have added `sd.surveyDataId = sd2.surveyDataId AND` to the query, I think this makes it correlated? However the query still does not work as expected. – daniel blythe Dec 22 '21 at 12:00
  • Your expected results don't match your description. None of your IDs have chosen *exactly* 2 and 3 so all of your IDs have not chosen 2 and 3; do you mean *at least* ? – Stu Dec 22 '21 at 12:10
  • Hi Stu, no not at least. I mean I want people who did not choose both 2 and 3. If they selected one of them I do not want them either. So that's why Robert isn't included as he chose 2. So it has to be both id are not selected. Thanks. – daniel blythe Dec 22 '21 at 12:18
  • You are missing distinct, no? Otherwise please explain what does it mean 'doesn't work'? – Cetin Basoz Dec 22 '21 at 12:38
  • Sorry, what I found was that my query was still picking up `chosenInterests` outside what was in the `in (2, 3)`. – daniel blythe Dec 22 '21 at 12:43
  • @danielblythe, yes, I too missed you typed sd. alias in there. BTW with EXISTS queries, you don't need to invent something like "select 1", "select *" is just fine. It is a boolean check, not something to return column values. – Cetin Basoz Dec 22 '21 at 13:01

2 Answers2

1

Use NOT IN:

select distinct surveyDataId
FROM surveyData
WHERE surveyDataId NOT IN (
    SELECT surveyDataId
    FROM surveyData
    WHERE chosenInterests in (2,3)
);

BTW your query with EXISTS would work too:

select distinct surveyDataId
FROM surveyData sd1
WHERE NOT EXISTS (
    SELECT *
    FROM surveyData sd2
    WHERE sd1.surveyDataId = sd2.surveyDataId and chosenInterests in (2,3)
);
Cetin Basoz
  • 22,495
  • 3
  • 31
  • 39
1

I believe you just had the wrong alias used:

select distinct surveydataid
from surveydata sd
where not exists (
    select *
    from surveydata sd2
    where sd2.surveydataid=sd.surveydataid 
      and sd2.choseninterests in (2,3)
)
Stu
  • 30,392
  • 6
  • 14
  • 33
  • He was only missing "distinct surveyDataId" I think, but didn't say what it means "doesn't work and returning the wrong results". – Cetin Basoz Dec 22 '21 at 12:37
  • No the example query refers to the outer table not the correlated table like it should do. – Stu Dec 22 '21 at 12:39
  • Oh you are right, I didn't see he had sd. there. Without an alias it would work just fine. – Cetin Basoz Dec 22 '21 at 12:59
  • thank you for your help Stu, I managed to get NOT IN to work in the end. – daniel blythe Dec 22 '21 at 13:15
  • You should use the *not exists* approach, it's almost always the better method and can sometimes perform better, depending on your RDBMS, bot no worse. – Stu Dec 22 '21 at 14:15
  • @CetinBasoz *not* using an alias, although works by default in this context, is often a cause of confusion and the source of a number of Stack Overflow questions itself; aliases should always be used, regardless. – Stu Dec 22 '21 at 14:18
  • @Stu, you may use it, no harm. It is not a confusion to who knows SQL. – Cetin Basoz Dec 22 '21 at 16:00
  • @CetinBasoz I wouldn't agree - anyone can easily overlook a missing alias - like you did on this very simple query here! I would maintain it's always good practice to alias everything :) – Stu Dec 22 '21 at 16:54
  • @Stu, of course we could agree to disagree. Aliasing is not a must everywhere and those who know SQL know it. Here it worked against to OP writing a wrong alias while trying to write an alias, if he left that out, it would be working. – Cetin Basoz Dec 22 '21 at 17:56