1

In the most general of cases, I have a query like below:

SELECT tutor_school.name, count(*), tutor_school.schoolid 

FROM tutor_school, tutor_attends, tutor_tutors_in 

WHERE 
tutor_school.schoolid = tutor_attends.schoolid and 
tutor_school.schoolid in ('1', '2', '3') and 
tutor_attends.userid=tutor_tutors_in.userid

group by tutor_school.schoolid LIMIT 0, 10

In essence, I want:

Name of School, Number of Students attending that school that tutor in any subject, Schoolid

What I'm actually getting is

Name of School, sum of all subjects taught by students at that school, schoolid -- in other words, if student 1 tutors 3 subjects, student 2 tutors 5, then instead of returning 2 I get 8!

I've realized that the issue is with the following statement:

tutor_attends.userid=tutor_tutors_in.userid

This isn't checking the existence of a given foreign key in a remote table, it's giving a result for each instance of that key.

What I'm trying to figure out is how to bind it to limit it to simply the existence of the given key, not the number of times that key occurs. I know I've seen a case similar to this in my SQL class, but I can't remember what the solution was.

onedaywhen
  • 55,269
  • 12
  • 100
  • 138
RonLugge
  • 5,086
  • 5
  • 33
  • 61

3 Answers3

2

Andrew is correct and here's a concrete example of his solution (had to make assumptions on your table design):

SQL> SELECT * FROM tutor_school;

  SCHOOLID NAME
---------- --------------------
         1 School A
         2 School B
         3 School C


SQL> SELECT * FROM tutor_attends;

    USERID   SCHOOLID
---------- ----------
         1          1
         2          1
         3          2


SQL> SELECT * FROM tutor_tutors_in;

    USERID SUBJECT
---------- --------------------
         1 Math
         1 Science
         1 English
         2 English
         3 Math


SQL> SELECT tutor_school.name, COUNT(DISTINCT tutor_tutors_in.userid)
  2  FROM tutor_school, tutor_attends, tutor_tutors_in
  3  WHERE tutor_school.schoolid=tutor_attends.schoolid
  4  AND tutor_attends.userid=tutor_tutors_in.userid
  5  GROUP BY tutor_school.name
  6  /

NAME                 COUNT(DISTINCTTUTOR_TUTORS_IN.USERID)
-------------------- -------------------------------------
School A                                                 2
School B                                                 1
roartechs
  • 1,315
  • 1
  • 12
  • 15
  • No downvote, but you're missing a column in the output the asker wants (SchoolID). "I want: Name of School, Number of Students attending that school that tutor in any subject, Schoolid" – Ken White Oct 04 '11 at 23:59
  • He might be missing the column, but that's easy enough to add. It's a good answer, IMO. I wasn't interested in a fixed query, I was interested in somebody pointing me in the direction of the syntax feature I was forgetting. – RonLugge Oct 05 '11 at 22:41
1

First, you should learn the ANSI syntax for JOINs. Using just the WHERE clause is deprecated. Leaving that aside, I can think of a solution without even using EXISTS.

SELECT tutor_school.name, count(DISTINCT tutor_attends.userid), tutor_school.schoolid 

FROM tutor_school, tutor_attends, tutor_tutors_in 

WHERE 
tutor_school.schoolid = tutor_attends.schoolid and 
tutor_school.schoolid in ('1', '2', '3') and 
tutor_attends.userid=tutor_tutors_in.userid

group by tutor_school.schoolid LIMIT 0, 10

Where I am a little confused about your schema is that I see join tables, but no plain-old tutor table. It might be easier to conceptualize the query with one.

Andrew Lazarus
  • 18,205
  • 3
  • 35
  • 53
  • Using just the WHERE clause is not deprecated. Some people prefer old style joins, deal with it. – onedaywhen Oct 05 '11 at 08:15
  • @onedaywhen: First, the syntax for old-style OUTER JOINs is clumsy, if you use the WHERE clause, because you have to account for NULLs to keep the same semantics as an ON clause. That syntax is officially deprecated for SQL Server [http://msdn.microsoft.com/en-us/library/ms143729.aspx] and was never supported on many other systems. The old-style joins put joining and filtering clauses in the same place, not IMHO optimal. Some people write and even champion inferior code style. Deal with it. – Andrew Lazarus Oct 05 '11 at 19:47
  • As for my schema, the reason you don't see a plain-old tutor table is because none exists. There is a user table (tutor_user), but not all users are tutors. The be a tutor you have to have selected subjects to tutor (tutor_tutors_in), and to use the website you're required to have selected the school you're attending (tutor_attends). Tutor_ is a prefix that shouldn't have been included in the question. As for using joins, the last several occasions I needed a join, I needed the unsupported full join -- which leaves me understandably prejudiced, since they've left a bad taste in my mouth. – RonLugge Oct 05 '11 at 22:37
  • 1
    Outer join syntax using the where clause was never part of the SQL Standard (regardless of whether you consider it to be American or International). This question only has the 'sql' tag. The OP is not using an outer join. The join syntax that the OP is using has not been deprecated in the Standard nor is there any (public) plans to deprecate it in SQL Server. Statements such as "inferior code style" are subjective. The OP has used a style that isn't to your taste (nor mine, as it happens). SQL Server old style outer joins aren't really relevant here, IMO. – onedaywhen Oct 06 '11 at 07:19
0

The relational operator you require is semijoin (rather than join). From your natural language statement of the requirement, I think using EXISTS would be most appropriate

e.g.

SELECT tutor_school.name, count(*), tutor_school.schoolid 
FROM tutor_school, tutor_tutors_in 
WHERE 
tutor_school.schoolid = tutor_attends.schoolid and 
tutor_school.schoolid in ('1', '2', '3') and
exists (
        SELECT *
          FROM tutor_attends
         WHERE tutor_attends.userid = tutor_tutors_in.userid
       )
group by tutor_school.schoolid LIMIT 0, 10
Community
  • 1
  • 1
onedaywhen
  • 55,269
  • 12
  • 100
  • 138