2

I've got three tables: users, courses, and grades, the latter of which joins users and courses with some metadata like the user's score for the course. I've created a SQLFiddle, though the site doesn't appear to be working at the moment. The schema looks like this:

CREATE TABLE users(
  id INT,
  name VARCHAR,
  PRIMARY KEY (ID)
);
INSERT INTO users VALUES
  (1, 'Beth'),
  (2, 'Alice'),
  (3, 'Charles'),
  (4, 'Dave');

CREATE TABLE courses(
  id INT,
  title VARCHAR,
  PRIMARY KEY (ID)
);
INSERT INTO courses VALUES
  (1, 'Biology'),
  (2, 'Algebra'),
  (3, 'Chemistry'),
  (4, 'Data Science');

CREATE TABLE grades(
  id INT,
  user_id INT,
  course_id INT,
  score INT,
  PRIMARY KEY (ID)
);
INSERT INTO grades VALUES
  (1, 2, 2, 89),
  (2, 2, 1, 92),
  (3, 1, 1, 93),
  (4, 1, 3, 88);

I'd like to know how (if possible) to construct a query which specifies some users.id values (1, 2, 3) and courses.id values (1, 2, 3) and returns those users' grades.score values for those courses

| name    | Algebra | Biology | Chemistry |
|---------|---------|---------|-----------|
| Alice   |      89 |      92 |           |
| Beth    |         |      93 |        88 | 
| Charles |         |         |           |

In my application logic, I'll be receiving an array of user_ids and course_ids, so the query needs to select those users and courses dynamically by primary key. (The actual data set contains millions of users and tens of thousands of courses—the examples above are just a sample to work with.)

Ideally, the query would:

  • use the course titles as dynamic attributes/column headers for the users' score data
  • sort the row and column headers alphabetically
  • include empty/NULL cells if the user-course pair has no grades relationship

I suspect I may need some combination of JOINs and Postgresql's crosstab, but I can't quite wrap my head around it.

Update: learning that the terminology for this is "dynamic pivot", I found this SO answer which appears to be trying to solve a related problem in Postgres with crosstab()

Community
  • 1
  • 1
Steve Grossi
  • 2,765
  • 1
  • 22
  • 26

1 Answers1

0

I think a simple pivot query should work here, since you only have 4 courses in your data set to pivot.

SELECT t1.name,
       MAX(CASE WHEN t3.title = 'Biology'      THEN t2.score ELSE NULL END) AS Biology,
       MAX(CASE WHEN t3.title = 'Algebra'      THEN t2.score ELSE NULL END) AS Algebra,
       MAX(CASE WHEN t3.title = 'Chemistry'    THEN t2.score ELSE NULL END) AS Chemistry,
       MAX(CASE WHEN t3.title = 'Data Science' THEN t2.score ELSE NULL END) AS Data_Science
FROM users t1
LEFT JOIN grades t2
    ON t1.id = t2.user_id
LEFT JOIN courses t3
    ON t2.course_id = t3.id
GROUP BY t1.name

Follow the link below for a running demo. I used MySQL because, as you have noticed, SQLFiddle seems to be perpetually busted the other databases.

SQLFiddle

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • I'm sorry, the 4 users and courses was just a sample to work with. I neglected to specify that my actual data contains tens of thousands courses and millions of users (though I expect to `SELECT` only a few dozen courses and thousands of users at a time), so `SELECT`ing data for each course by title within the query is not an option—it will need to be dynamic. I've now tried to clarify that in the question body. – Steve Grossi Sep 22 '16 at 15:50
  • 1
    @SteveGrossi Then you will need to use dynamic SQL. Try Googling for "dynamic sql pivot" – Tim Biegeleisen Sep 22 '16 at 15:50
  • That terminology is helpful in searching for solutions, thanks! – Steve Grossi Sep 22 '16 at 16:07