0

this is my sql problem - there are 3 tables:

Names         Lists                ListHasNames
Id Name       Id Desc              ListsId  NamesId
=--------     ------------         ----------------
1  Paul       1  Football          1        1
2  Joe        2  Basketball        1        2
3  Jenny      3  Ping Pong         2        1
4  Tina       4  Breakfast Club    2        3
              5  Midnight Club     3        2
                                   3        3
                                   4        1
                                   4        2
                                   4        3
                                   5        1
                                   5        2
                                   5        3
                                   5        4

Which means that Paul (Id=1) and Joe (Id=2) are in the Football team (Lists.Id=1), Paul and Jenny in the Basketball team, etc...

Now I need a SQL statement which returns the Lists.Id of a specific Name combination: In which lists are Paul, Joe and Jenny the only members of that list ? Answer only Lists.Id=4 (Breakfast Club) - but not 5 (Midnight Club) because Tina is in that list, too.

I've tried it with INNER JOINS and SUB QUERIES:

SELECT Q1.Lists_id FROM

(
SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Paul') and
  (T1.Id=T2.Names_ID) and
   ( (
     SELECT count(*) FROM
      listhasnames as Z1
     where (Z1.lists_id = T2.lists_Id)
    ) = 3)

) AS Q1

INNER JOIN (


SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Joe') and
  (T1.Id=T2.Names_ID) and
  (
    (SELECT count(*) FROM
      listhasnames as Z1
     WHERE (Z1.Lists_id = T2.lists_id)
    ) = 3)

) AS Q2

ON (Q1.Lists_id=Q2.Lists_id)



INNER JOIN (


SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Jenny') and
  (T1.Id=T2.Names_ID) and
  (
    (SELECT count(*) FROM
      listhasnames as Z1
     WHERE (Z1.Lists_id = T2.lists_id)
    ) = 3)

) AS Q3

ON (Q1.Lists_id=Q3.Lists_id)

Looks a little bit complicated, uh? How to optimize that? I need only that Lists.Id in which specific names are in (and only these names and nobody else). Maybe with SELECT IN?

Regards, Dennis

  • (This is a comment, not an answer.) I am kind of curious. Why use plurals for the table names? (We normally use singular names, to name one row of the table.) If you're going to go with plurals, why isn't the third table named (more appropriately) ListsHaveNames ? (I might also suggest that the third table be named Membership or ListMembership.) And Carl Manaster's answer returns the result set you specified. – spencer7593 May 27 '09 at 20:17
  • Plurals or not: http://stackoverflow.com/questions/808992/singular-or-plural-database-table-names/809034 the third table ist called ListHasNames because MySQL Workbench assigned that name by default if you choose a n:m relation. This was just an example.. –  Jun 04 '09 at 11:13

4 Answers4

2
SELECT ListsId
FROM ListHasNames a
WHERE NamesId in (1, 2, 3)
AND NOT EXISTS
(SELECT * from ListHasNames b 
WHERE b.ListsId = a.ListsId 
AND b.NamesId not in (1, 2, 3))
GROUP BY ListsId
HAVING COUNT(*) = 3;

Edit: Corrected thanks to Chris Gow's comment; the subselect is necessary to exclude lists that have other people on them. Edit 2 Corrected the table name thanks to Dennis' comment

Carl Manaster
  • 39,912
  • 17
  • 102
  • 155
  • 1
    This doesn't seem to work in postgreSQL. I tried it and I get both 4 and 5 back. The OP only wants list ids returned that contain only those three people and no one else – Chris Gow May 27 '09 at 20:38
  • You're right - it wouldn't work in any DBMS; I've added a subselect to correct the problem. Thanks. – Carl Manaster May 27 '09 at 21:13
  • 1
    (SELECT * from ListsId b should read (SELECT * from ListHasNames b but i got it anyway! ;-) –  May 28 '09 at 08:10
2

Using Carl Manaster's solution as a starting point I came up with:

SELECT listsid 
FROM listhasnames 
GROUP BY listsid HAVING COUNT(*) = 3
INTERSECT
SELECT x.listsid 
FROM listhasnames x, names n 
WHERE n.name IN('Paul', 'Joe', 'Jenny') 
AND n.id = x.namesid
Chris Gow
  • 7,514
  • 4
  • 23
  • 18
  • That's another nice way to exclude the lists with extra individuals. – Carl Manaster May 27 '09 at 21:14
  • Never heard of INTERSECT, looks like its new to MSSQL2005. However, doesn't this query actually return incorrect results? The first query returns all lists having 3 names and the second query returns all lists that contain either Paul, Joe, or Jenny. If there was a list that contained a total of 3 members but only 1 or 2 of those members is Paul, Joe, or Jenny, it would be returned by this query which is incorrect according to the original question. – David Archer Jun 04 '09 at 00:38
1

Updated:

select a.ListsId from
(
    --lists with three names only
    select lhn.ListsId, count(*) as count
    from ListHasNames  lhn
    inner join Names n on lhn.NamesId = n.Id 
    group by lhn.ListsId
    having count(*) = 3
) a
where a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Paul'))
and a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Joe'))
and a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Jenny'))
D'Arcy Rittich
  • 167,292
  • 40
  • 290
  • 283
  • Maybe you should tell him that Count = 3 is because of his request with 3 names – Scoregraphic May 27 '09 at 18:04
  • Since the list of names is fixed and "Count" will always be 3, I would remove the last two columns from the select list and do "HAVING COUNT(*) = 3" instead. – Tomalak May 27 '09 at 18:20
0

I was just solving a problem recently that may work well for your case as well. It may be overkill.

I took the approach of creating a list of candidate associations that may be the correct solution, and then using a cursor or queue table to go through the likely correct solutions to do full validation.

In my case this was implemented by doing like

select
ParentId
count(*) as ChildCount
checksum_agg(checksum(child.*) as ChildAggCrc
from parent join child on parent.parentId = child.parentId

Then you can compare the count and aggregate checksum against your lookup data (i.e. your 3 names to check for). If no rows match, you are guaranteed to have no matches. If any row matches you can then go through and do a join of that specific ParentId to validate if there are any discrepancies between the row sets.

Clear as mud? :)

ahains
  • 1,912
  • 12
  • 10