Using Postgres 9.6+, suppose I have a simple survey model with three tables: questions
, users
, and answers
. Each answer refers to a question and contains text written by a user.
Here's a simplified example:
questions
|----|------------------------------|
| id | text |
|----|------------------------------|
| 1 | Do you like apples? |
| 2 | What is your favorite color? |
| 3 | How are you feeling today? |
-------------------------------------
users
|----|-------|
| id | name |
|----|-------|
| 1 | Alice |
| 2 | Bob |
--------------
answers
|----|---------|-------------|---------|
| id | user_id | question_id | text |
|----|---------------------------------|
| 1 | 1 | 1 | Yes |
| 2 | 2 | 1 | No |
| 3 | 1 | 2 | Yellow |
| 4 | 2 | 2 | Blue |
| 5 | 1 | 3 | Good |
| 6 | 2 | 3 | Fine |
----------------------------------------
My end goal is to be able to feed this into an app that will render a datatable in which each row contains a single user's info alongside their answers to a dynamic set of questions. In other words, it might want to retrieve users' answers for questions 1 and 3 specifically:
|----|-------|---------------------|----------------------------|
| id | name | Do you like apples? | How are you feeling today? |
|----|-------|---------------------|----------------------------|
| 1 | Alice | Yes | Good |
| 2 | Bob | No | Fine |
-----------------------------------------------------------------
Or any arbitrary combination of questions. Realistically these lists of question_ids might be capped at 10 items or so.
At first, I naively tried to dynamically construct the query JOINing on answers
once for each question:
SELECT
u.name name,
a1.text q1,
a2.text q2
FROM
users u
JOIN answers a1
ON a1.user_id = u.id AND question_id = 1
JOIN answers a2
ON a1.user_id = u.id AND question_id = 3
But as the number of JOINs grows, more nested loops seem to be made and performance started taking an increasingly high hit.
An alternative that came up was to try a subquery...
SELECT
u.name name,
sub.qid question_id,
sub.text text
FROM
users u
JOIN (
SELECT
a.user_id uid,
a.question_id qid,
a.text text
FROM
answers a
WHERE
a.question_id IN (1, 3)
GROUP BY
a.user_id
) sub ON sub.uid = u.id
...but since this produces one row for each answer instead of for each user, I ended up resorting to tricks such as aggregating the answers in a JSONB object to keep the format as desired, replacing the subquery with:
SELECT
a.user_id uid,
jsonb_object_agg(a.question_id, a.text) question_answers
FROM
answers a
WHERE
a.question_id IN (1, 3)
GROUP BY
a.user_id
Which works alright for retrieving the whole dataset, but becomes a pain when retrieving ranges for datatable specific operations... In particular pagination and/or sorting by one of the answers' columns (e.g. sort results by answers of "How are you feeling today?" in descending order).
I feel like I might be missing something here or falling into an antipattern, so I decided to reach for help.
What's the best / most sane way to achieve this kind of result? Thanks a bunch!!