T-SQL - Query many-to-many relation table with order

Question

I have a table containing association data. It has three columns SubjectId, StudentId and Rank. It is a many to many relation table. I need to find the SubjectId corresponding to a list of given StudentIds in rank order.

If the order was unimportant, I would do something like

SELECT SubjectId
FROM mytable
WHERE StudentId IN (a,b,c) --This is the 'input'. I want order here to make a difference by matching to 'rank' but in this version it doesn't
GROUP BY SubjectId
HAVING COUNT(1) = 3

This would also return SubjectIds with ranks of students in order (a, c, b), (b, c, a), etc.

How would I do the same if the order of a, b, c is important?

Edit: included sample data and expected result

Sample data

SubjectId | StudentId | Rank
1|a|2
1|c|1
1|b|3
2|c|1
2|b|2
3|c|2
3|b|1
4|a|1
4|b|2
5|b|2
5|a|1

Expected results Input: (c,a,b) Output: 1

Input: (c,a) Output: NULL (no rows)

Input: (c,b) Output: 2

Input: (a,b) Output: 4, 5 --as separate rows

Input: (a,b,c) Output: NULL (no rows)

what's the problem with doing "SELECT SubjectId FROM myTable WHERE studentId IN (a1,b2,c3) ORDER BY Rank" — asemprini87, Apr 11 '17 at 19:41
Can you provide an example dataset and your expected result? Your question is pretty unclear. — fqhv, Apr 11 '17 at 19:42
@asemprini87 it will only look for subset, not exact association match. — Ajay Raghav, Apr 11 '17 at 19:58
First, the strings in the list of the IN clause should be quoted. Second, what to you intend by count(1)? — Sloan Thrasher, Apr 11 '17 at 20:02
What do you mean by **order of a, b, c is important**? From your example output, you expect 1 row returned, so you want to know if the rank of the students is in order (ie. if a,b,c then return the rank of a if a < b < c? — Sloan Thrasher, Apr 11 '17 at 20:07
@SloanThrasher I found that query here: http://stackoverflow.com/questions/1054299/sql-many-to-many-table-and-query I have not tested it. Count(1) here should mean count of first column. — Ajay Raghav, Apr 11 '17 at 20:07
He is concerned about the input order. he is strictly looking for a,b,c order even though c,a,b is the input. — Teja, Apr 11 '17 at 20:08
@SloanThrasher I do not expect 1 row to be returned. If there are multiple subjects for which ranks of students appeared is same, then both subjects should show up. Yes, I want to know the subjects in which students rank in the given order. — Ajay Raghav, Apr 11 '17 at 20:09
I think what @asemprini87 means might be `SELECT SubjectId from MyTable WHERE CONCAT(StudentId,Rank) in ('a1','b2','c3') GROUP BY SubjectId HAVING Count(*)=3` — Steve Lovell, Apr 11 '17 at 20:11
Show the exact response for each of the cases, including which columns you expect to see and their values, and the order you want the rows returned. — Sloan Thrasher, Apr 11 '17 at 20:12
@SloanThrasher expanded sample data and expected result to include more cases. — Ajay Raghav, Apr 11 '17 at 20:18
You still show only one row returned from the query. Also, why do the two return null? Explain your logic. Especially, what is the difference between a,b,c and c,a,b. — Sloan Thrasher, Apr 11 '17 at 20:20
That's actually not NULL, it is empty (no rows). They return no rows because there is no subject for which either ONLY (c and a) appeared and c was ranked higher than a, or ONLY (a, b and c) appeared and a > b > c. Two rows are returned in the case when ONLY (a and b) had appeared and a was ranked higher than b. Subjects 4 and 5. — Ajay Raghav, Apr 11 '17 at 20:23
Steve Lovell's comment seems accurate. The inputs are in order of rank, and the outputs are all the different Subject IDs that have those students at that rank. Looking at `Input: (a,b) Output: 4, 5` Both subjects 4 & 5 have student a at rank 1 and student b at rank 2. — fqhv, Apr 11 '17 at 20:23
@fqhv Yes. I tested it. But it is not generalized. The real data is obviously not as simple as abc, 123. Further, it has to be written in a stored procedure in my requirement that would take a comma separated 'StudentIds' in a parameter. I don't want to write dynamic sql to achieve this. — Ajay Raghav, Apr 11 '17 at 20:29
I've come up with this but it is obviously bad code (it's working though): `DECLARE @CommaSeparatedStudentIds = 'c,a,b' SELECT c.SubjectId FROM mytable c WHERE STUFF((SELECT CAST(',' AS VARCHAR(MAX)) + CAST(d.StudentId AS VARCHAR(MAX)) FROM mytable d WHERE d.SubjectId = c.SubjectId ORDER BY d.Rank FOR XML path('') ), 1, 1, '') = @CommaSeparatedStudentIds` — Ajay Raghav, Apr 11 '17 at 20:37
Others may get to providing a proper answer before me, but what would you expect for input 'c,a'? Do the listed students have to exhaust all the students associated with any subject returned? — Steve Lovell, Apr 11 '17 at 21:25
To clarify my question, if c and a rank first and second does it matter whether there is someone else who ranks third? — Steve Lovell, Apr 11 '17 at 21:29
@SteveLovell yes, there shouldn't be anyone else ranking third. The input shall contain all students who appeared in that subject. — Ajay Raghav, Apr 12 '17 at 01:55

fqhv · Accepted Answer · 2017-04-13T13:32:29.883

2

You will need to start by inserting your input into a table with the Rank and StudentId. I came up with a way you can do this without having to parse csv. It uses the CHARINDEX of the StudentIds in the input. Note that if you have an input value which is not in your table it will be ignored. Of course you could just parse the csv instead, up to you.

After you have the input in a table all you need to do is join the input to your table to get only records that match the input. Then GROUP BY SubjectId and take the SubjectIds that have the same count as the records in your input and the same count as records with that SubjectId in your table.

All together the answer will look like this. (For some reason (*) would not save, so I did (* )).

DECLARE @InputTable AS TABLE (
    [Rank] INT IDENTITY(1,1),
    StudentId VARCHAR(1) )

INSERT INTO @InputTable (StudentId)
SELECT StudentId
FROM (
    SELECT DISTINCT StudentId
    FROM MyTable --Could use your student table
    WHERE CHARINDEX(studentId + ',', @input + ',') > 0) x
ORDER BY CHARINDEX(studentId + ',', @input + ',')

SELECT SubjectId
FROM MyTable t
INNER JOIN @InputTable i ON t.[Rank] = i.[Rank] AND t.[StudentId] = i.[StudentId]
GROUP BY SubjectId
HAVING COUNT(* ) = (SELECT COUNT(* ) FROM @InputTable) 
AND COUNT(* ) = (SELECT COUNT(* ) FROM MyTable WHERE SubjectId = t.SubjectId)

Previous Answer: MySQL and not including requests from comments.

edited Apr 13 '17 at 13:32

answered Apr 11 '17 at 21:29

fqhv

1,191
1
13
25

1

If you take this route, in the `INSERT INTO` statement you should replace `MyTable` with your student table. That should also allow to remove the `DISTINCT` and therefore improve efficiency ... unless your `MyTable` for whatever reason is already filtered to contain far fewer records than your student table. – Steve Lovell Apr 12 '17 at 08:33
I would rather use a table variable than temporary table. – Ajay Raghav Apr 13 '17 at 02:31
Further, I just tested this solution and it fails for second input case (c,a). The output was 1, which is incorrect. It shouldn't have returned any rows. – Ajay Raghav Apr 13 '17 at 02:52
1

Oh, missed that. As it is the code effectively assumes you want any subject where the input matches those who rank top, but there could be extras in the subject not in the input. So effectively the other possible answer to one of my earlier queries. Ive not tested this, but i think you can fix this by adding an additional line in the having clause: `and count(*) = (select count(*) from MyTable t2 where t2.subjectid = t.subjectid)`. – Steve Lovell Apr 13 '17 at 06:00
I updated the answer to be in sql server, use a variable table, and to include the snippet @SteveLovell suggested. – fqhv Apr 13 '17 at 13:33
Please also use SPLIT_STRING. It would make the solution more clean. Plus, I don't need to validate the input as it is coming from a reliable system. I'll test the solution later today. – Ajay Raghav Apr 14 '17 at 06:11
`STRING_SPLIT` is nice, but not always available. The join to Students/MyTable is not so much for validation as a means of getting multiple records from a single row, the "validation" is a side effect. If you search for alternatives to `STRING_SPLIT` you'll see many use a "numbers table" or similar. While I'm here, there may also be mileage in replacing some of `COUNT`s with a variable to avoid the need to recalculate. I'll suggest an edit to that effect and include the `STRING_SPLIT` option. – Steve Lovell Apr 14 '17 at 10:35
@SteveLovell I saw your edit got rejected. Feel free to create a new answer with your changes and notes. – fqhv Apr 14 '17 at 17:54
Thanks @fqhv, having done that I was just now inspired by another thread to suggest a different approach, which I've now included in my answer (along with `STRING_SPLIT`). – Steve Lovell Apr 15 '17 at 11:57

score 1 · Answer 2 · edited May 23 '17 at 11:46

Since my edits to @fqhv's answer were rejected, I'm expanding my own previous answer while borrowing from his:

DECLARE @CommaSeparatedStudentIds varchar(max) = 'c,a';

DECLARE @InputTable AS TABLE (
    [Rank] INT IDENTITY(1,1),
    StudentId VARCHAR(25) --make this match your StudentID field
    )

INSERT INTO @InputTable (StudentId)
SELECT value from STRING_SPLIT(@CommaSeparatedStudentIds,',')

/* If STRING_SPLIT isn't available, you can do this other ways (see after code) */

Select
    t.SubjectId
from
    MyTable t
    FULL OUTER JOIN @InputTable i ON
        t.StudentId = i.StudentId AND
        t.Rank = i.Rank
group by
    t.SubjectId
having
    count(*) = sum(case when t.Rank = i.Rank then 1 else 0 end)

For alternatives to STRING_SPLIT see this post: Split function equivalent in T-SQL?

If we were in MySQL, a modified version of what you've put in your comments could work well:

set @CommaSeparatedStudentIds = 'a,b';

SELECT
t.SubjectId
FROM
MyTable t
group by t.SubjectId
HAVING GROUP_CONCAT(t.StudentId order by t.Rank SEPARATOR ',') = @CommaSeparatedStudentIds

Oh. I've submitted an edit to update the tags on you question to reflect that. — Steve Lovell, Apr 13 '17 at 06:01

T-SQL - Query many-to-many relation table with order

2 Answers2