1

I've provided statements to create the two example tables for my question below.

In this example, the second table contains the same student id multiple times due to having multiple classIDs. However, I need the query results to be displayed horizontally on one row. (The max # of classID's per studentid is 15)

How would you write a select statement that joins the two tables and all classID data repeats within the same column like the example below:

CREATE TABLE Student (
 StudentId int,
 FirstName VarChar (255),
 LastName VarChar (255)
);

CREATE TABLE Classes (
 StudentId int,
 ClassId int,
 ClassName VarChar (255),
 ClassCost int
);

INSERT INTO Student (StudentId, FirstName, LastName)
VALUES 
(123, 'Carol', 'Dwek'),
(456, 'Cal', 'Newport');

INSERT INTO Classes (StudentId, ClassId, ClassName,ClassCost)
VALUES 
(123, 972, 'Psychology',30),
(456, 214, 'Focus',99),
(123, 903, 'Sociology',30),
(456, 851, 'Meditation',99),
(456, 911, 'Reading',20),
(456, 111, 'Deep Work',50),
(456, 117, 'Time Management',25),
(456, 999, 'Goal Setting',50);

Results: enter image description here

June7
  • 19,874
  • 8
  • 24
  • 34
madQuestions
  • 109
  • 1
  • 7
  • Add tag for RDBMS used. Solution depends on that. Data should be posted as formatted text table, not image. Search topic of 'concatenate related records'. – June7 Feb 17 '23 at 21:09
  • Hello, I've added more details. Are you saying to use concatenate as opposed to a pivot solution? – madQuestions Feb 17 '23 at 21:21
  • Depends on what you want to do with the output. Make attempt and edit your question with the SQL statement and issues encountered. I expect pivot will be more difficult in your case because there is no field to serve as source for column headers, it would have to calculated, probably with ROW_NUMBER function. Also, you want pivot on two data points - ClassID and ClassName. – June7 Feb 17 '23 at 21:34
  • The output needs to look like the example. Also, adding my failed attempts doesn't help as I'm not familiar with the Pivot process and that is why I'm asking for assistance. This keeps the question area cleaner. Thanks. – madQuestions Feb 17 '23 at 21:46
  • The SQL language has a very strict rule: it **MUST** know about **ALL** the columns in the results at query compile time, before looking at **ANY** table data. If you don't know this, you will NOT be able to pivot the data in a single query. Instead, you'll need to to this in three steps: `1)` Run a query to find out what your columns are. `2)` Build a new dynamic query based on results from 1. `3)` Run the query from 2. – Joel Coehoorn Feb 17 '23 at 21:56
  • `[cont]` Now, if you have a maximum number of repeated columns you can get around this by JOINing back to the target table(s) multiple times: once for each set of repeated columns. But this tends to be _stupid slow_. It's also antithetical to the relational set theory underpinning modern DBs, such that you're usually MUCH better off doing this work in the client code or reporting tool. – Joel Coehoorn Feb 17 '23 at 21:59
  • What are you using for GUI? Access pivot does not have this limitation. However, this can make it difficult to build stable report based on pivot (aka CROSSTAB). Another difficulty in Access SQL is calculating group row number because it does not support ROW_NUMBER function. https://stackoverflow.com/questions/64400911/pivot-query-in-ms-access – June7 Feb 17 '23 at 23:08
  • If you use Access as GUI, can link to SQLServer table, build a Pass-Through query to construct dataset with Row_Number function then build CROSSTAB in regular Access SQL with the PTQ as source, although still have complication of two pivot points. Access has ways to handle that but not simply. – June7 Feb 17 '23 at 23:38
  • Review: https://stackoverflow.com/questions/7822004/pivots-with-dynamic-columns-in-sql-server and https://stackoverflow.com/questions/70804530/sql-pivot-table-with-multiple-pivots-on-criteria – June7 Feb 17 '23 at 23:52

2 Answers2

2

This is only possible in a single query if you know in advance how many potential classes a student might possibly have. If you don't know and can't guess a reasonable maximum, the SQL language simply will NOT be able to produce the desired output in a single statement.

If you can guess the maximum course load, the query looks like this:

WITH NumberedClasses As (
   SELECT *, row_number() over (partition by StudentID order by ClassID) rn
   FROM Classes
)
SELECT s.*
    ,c1.ClassId, c1.ClassName, c1.ClassCost
    ,c2.ClassID, c2.ClassName, c2.ClassCost
    -- ...
    ,cn.ClassID, cn.ClassName, cn.ClassCost
FROM Student s
LEFT JOIN NumberedClasses c1 ON c1.StudentID = s.StudentID AND c1.rn = 1
LEFT JOIN NumberedClasses c2 ON c2.StudentID = s.StudentID And c2.rn = 2
-- ...
LEFT JOIN NumberedClasses cn ON cn.StudentID = s.StudentID And cn.rn = {n}

Note: this tends to be SLOW — and not just a little slow; we're talking several minutes to finish (or longer) if you have a reasonable amount of data. And, yes, you really do have to repeat yourself in two places for as many times as you have possible class enrollments.

It's also worth mentioning here this kind of PIVOT is antithetical to the formal set theory which underpins all relational databases. For this reason, you're usually MUCH better off doing this work in the client code or reporting tool.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
2

If you are happy to use a hardcoded ceiling on number of classes a student can attend then a way that should perform better than multiple self joins (which will likely re-evaluate the row numbering multiple times) is to use the numbering to pivot on instead.

The same ordering that provides the row numbering (StudentId, ClassId) can also be used to do the grouping by StudentId (provided by primary key (StudentId, ClassId) in the plan below)

enter image description here

The query is still pretty ugly though and best done in the application (if there is an application and you aren't just running adhoc queries in SSMS to view the results there)

With Numbered As
(
SELECT *,
       rn = row_number() over (PARTITION BY StudentID ORDER BY ClassID) 
FROM Classes
), Pivoted As
(
SELECT StudentId, 
       ClassId1 = MAX(CASE WHEN rn = 1 THEN ClassId END), 
       ClassName1 = MAX(CASE WHEN rn = 1 THEN ClassName END), 
       ClassCost1 = MAX(CASE WHEN rn = 1 THEN ClassCost END),
       ClassId2 = MAX(CASE WHEN rn = 2 THEN ClassId END), 
       ClassName2 = MAX(CASE WHEN rn = 2 THEN ClassName END), 
       ClassCost2 = MAX(CASE WHEN rn = 2 THEN ClassCost END),
       ClassId3 = MAX(CASE WHEN rn = 3 THEN ClassId END), 
       ClassName3 = MAX(CASE WHEN rn = 3 THEN ClassName END), 
       ClassCost3 = MAX(CASE WHEN rn = 3 THEN ClassCost END),
       ClassId4 = MAX(CASE WHEN rn = 4 THEN ClassId END), 
       ClassName4 = MAX(CASE WHEN rn = 4 THEN ClassName END), 
       ClassCost4 = MAX(CASE WHEN rn = 4 THEN ClassCost END),
       ClassId5 = MAX(CASE WHEN rn = 5 THEN ClassId END), 
       ClassName5 = MAX(CASE WHEN rn = 5 THEN ClassName END), 
       ClassCost5 = MAX(CASE WHEN rn = 5 THEN ClassCost END),
       ClassId6 = MAX(CASE WHEN rn = 6 THEN ClassId END), 
       ClassName6 = MAX(CASE WHEN rn = 6 THEN ClassName END), 
       ClassCost6 = MAX(CASE WHEN rn = 6 THEN ClassCost END)
FROM Numbered
GROUP BY StudentId
)
SELECT S.FirstName, S.LastName, P.*
FROM   Student S
       JOIN Pivoted P
         ON P.StudentId = S.StudentId 
Martin Smith
  • 438,706
  • 87
  • 741
  • 845