-1

I have the following tables inside of a Postgresql database:

ER diagram

The shifts table contains "Morning Shift" and "Evening Shift" as possible types. Each shift can have multiple users and users can have multiple shifts.

I need a list of all shifts, grouped by their dates, and for every shift containing all users attending the shift. So something like this:

{
...
  '2023-01-01': {
    'morning_shift': [
      { 'username': 'Some user', 'role': 'Some role', 'tag': 'Some tag' },
      ...
    ],
    'evening_shift': [
      { 'username': 'Some user', 'role': 'Some role', 'tag': 'Some tag' },
      ...
    ]   
  }
...
}

So far I have come up with the following query:

SELECT shifts.date,
  array(
    SELECT users.username
    FROM shifts_users
    JOIN shifts on shifts_users.shift_id = shifts.id
    JOIN users on users.id = shifts_users.user_id
    WHERE shifts.type = 'Morning Shift') as morning_shift,
  array(
    SELECT users.username
    FROM shifts_users
    JOIN shifts on shifts_users.shift_id = shifts.id
    JOIN users on users.id = shifts_users.user_id
    WHERE shifts.type = 'Evening Shift') as evening_shift
FROM shifts
GROUP BY shifts.date
ORDER BY shifts.date ASC

With that I receive the following result:

{
...
  '2023-01-01': {
    'morning_shift': [
      'Some user',
      ...
    ],
    'evening_shift': [
      'Some user',
      ...
    ]   
  }
...
}

Now I ran into the problem, that I only can get one column with subqueries, which means I currently only get the usernames but not the other values I need.

Is there maybe a different approach I should use or a different table setup to make life easier?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Feliboy
  • 27
  • 9

1 Answers1

2

Your query is more broken than you realize. Both subqueries are uncorrelated to the outer SELECT and gather all users for all days for the given shift, not just the ones for the outer day. Using the same table name shifts without alias in the subquery hides the outer table of the same name.

After untangling that mess, you'll find that you have to aggregate in two steps. Your original query could work like this:

SELECT date
     , min(shift_users) FILTER (WHERE type = 'Morning Shift') AS morning_shift
     , min(shift_users) FILTER (WHERE type = 'Evening Shift') AS evening_shift
FROM  (
   SELECT s.date, s.type
        , ARRAY (
            SELECT u.username
            FROM   shifts_users su         
            JOIN   users        u ON u.id = su.user_id
            WHERE  su.shift_id = s.id
            ) AS shift_users
   FROM   shifts s
   ) sub
GROUP  BY date
ORDER  BY date;

In the outer SELECT I apply a "poor-man's crosstab" technique. See:

If you want whole rows from table users instead of just the username, you could use ARRAY (SELECT u FROM .... But an array of records is hard to read / process.
You may want a JSON representation instead. But there is no min(json) aggregate function (nor min(jsonb) or similar). There are various ways around this. I would rewrite the query as:

WITH su AS (
   SELECT su.shift_id, array_agg(u.username) AS shift_users
   FROM   shifts_users su         
   JOIN   users        u ON u.id = su.user_id
   GROUP  BY su.shift_id
   )
SELECT d.date, m.shift_users AS morning_shift, e.shift_users AS evening_shift
FROM  (
   SELECT date
        , min(id) FILTER (WHERE type = 'Morning Shift') AS m_shift_id
        , min(id) FILTER (WHERE type = 'Evening Shift') AS e_shift_id
   FROM   shifts s
   GROUP  BY date
   ) d
LEFT   JOIN su m ON m.shift_id = d.m_shift_id
LEFT   JOIN su e ON e.shift_id = d.e_shift_id
ORDER  BY d.date;

Aggregate days (d) and shift-users (su) separately - the latter in a CTE, and then join to it twice. Should also be substantially faster while aggregating the whole table.

Now, to get whole user rows:

WITH su AS (
   SELECT su.shift_id, array_agg(u) AS shift_users  -- Postgres array of records
   FROM   ...  -- rest is identical

u being the reference to the whale table row (table alias in this case).
And since we got rid of aggregation in the outer SELECT, we can switch to JSON at will:

WITH su AS (
   SELECT su.shift_id, json_agg(u) AS shift_users  -- JSON array of objects
   FROM   ...  -- rest is identical
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • This works perfectly, thank you so much! I just had to switch `shift_id` to `id` in the second `FROM` query. I learned a lot from your answer! – Feliboy Aug 20 '23 at 09:02