The underlying problem (joining to multiple tables with multiple matches) is explained in this related answer:
To fix, I first simplified & formatted your query:
SELECT pe.year
     , sum(pe.wins)       AS wins
     , sum(pe.losses)     AS losses
     , sum(pe.ties)       AS ties
     , array_agg(po.id)   AS position_id
     , array_agg(po.name) AS position_names
FROM   periods_positions_coaches_linking pp
JOIN   positions po ON po.id = pp.position
JOIN   periods   pe ON pe.id = pp.period
WHERE  pp.coach = 1
GROUP  BY pe.year
ORDER  BY pe.year;
This yields the same incorrect result as your original, but it is simpler, faster, and easier to read.
There is no point in joining the table coach as long as you don't use any of its columns in the SELECT list. I removed it completely and replaced the WHERE condition with pp.coach = 1.
You don't need COALESCE. NULL values are ignored in the aggregate function sum(). No need to substitute 0.
Use table aliases to make it easier to read.
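To see why the simplified query still over-counts, here is a minimal sketch that runs it on a tiny data set. It uses SQLite through Python's sqlite3 module rather than PostgreSQL, so the array_agg() columns are left out. Table and column names are taken from the query; the column types and all sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample data; only table and column names come from the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE periods (id INTEGER PRIMARY KEY, year INT, wins INT, losses INT, ties INT);
    CREATE TABLE positions (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE periods_positions_coaches_linking (period INT, position INT, coach INT);

    INSERT INTO periods VALUES (1, 2010, 5, 3, NULL), (2, 2010, 2, 1, 1), (3, 2011, 4, 4, 0);
    INSERT INTO positions VALUES (1, 'Head Coach'), (2, 'Offensive Coordinator');
    -- Coach 1 holds two positions in period 1 and one position in periods 2 and 3.
    INSERT INTO periods_positions_coaches_linking
    VALUES (1, 1, 1), (1, 2, 1), (2, 1, 1), (3, 2, 1);
""")

# The simplified query: period 1 is joined once per position, so it is summed twice.
naive_rows = conn.execute("""
    SELECT pe.year, sum(pe.wins), sum(pe.losses), sum(pe.ties)
    FROM   periods_positions_coaches_linking pp
    JOIN   positions po ON po.id = pp.position
    JOIN   periods   pe ON pe.id = pp.period
    WHERE  pp.coach = 1
    GROUP  BY pe.year
    ORDER  BY pe.year
""").fetchall()

# Reference totals straight from periods. This shortcut is only valid here
# because every sample period belongs to coach 1.
true_rows = conn.execute("""
    SELECT year, sum(wins), sum(losses), sum(ties)
    FROM   periods
    GROUP  BY year
    ORDER  BY year
""").fetchall()

print(naive_rows)  # [(2010, 12, 7, 1), (2011, 4, 4, 0)]
print(true_rows)   # [(2010, 7, 4, 1), (2011, 4, 4, 0)]
```

The 2010 wins and losses are inflated because period 1 appears twice after the join. The ties column also shows sum() skipping the NULL in period 1 without any COALESCE.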
Next, I solved your problem like this:
SELECT *
FROM  (
   SELECT pe.year
        , array_agg(DISTINCT po.id)   AS position_id
        , array_agg(DISTINCT po.name) AS position_names
   FROM   periods_positions_coaches_linking pp
   JOIN   positions po ON po.id = pp.position
   JOIN   periods   pe ON pe.id = pp.period
   WHERE  pp.coach = 1
   GROUP  BY pe.year
   ) po
LEFT   JOIN (
   SELECT pe.year
        , sum(pe.wins)   AS wins
        , sum(pe.losses) AS losses
        , sum(pe.ties)   AS ties
   FROM  (
      SELECT period
      FROM   periods_positions_coaches_linking
      WHERE  coach = 1
      GROUP  BY period
      ) pp
   JOIN   periods pe ON pe.id = pp.period
   GROUP  BY pe.year
   ) pe USING (year)
ORDER  BY year;
Aggregate positions and periods separately before joining them.
In the first sub-query po, list positions only once with array_agg(DISTINCT ...).
In the second sub-query pe ...
GROUP BY period, because a coach can have multiple positions per period.
JOIN to periods-data after that, and then aggregate to get sums.
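The two-step aggregation can be checked on a small data set. This is a minimal sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than PostgreSQL, so group_concat(DISTINCT ...) stands in for array_agg(DISTINCT ...), and the column types and sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample data; only table and column names come from the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE periods (id INTEGER PRIMARY KEY, year INT, wins INT, losses INT, ties INT);
    CREATE TABLE positions (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE periods_positions_coaches_linking (period INT, position INT, coach INT);

    INSERT INTO periods VALUES (1, 2010, 5, 3, NULL), (2, 2010, 2, 1, 1), (3, 2011, 4, 4, 0);
    INSERT INTO positions VALUES (1, 'Head Coach'), (2, 'Offensive Coordinator');
    -- Coach 1 holds two positions in period 1 and one position in periods 2 and 3.
    INSERT INTO periods_positions_coaches_linking
    VALUES (1, 1, 1), (1, 2, 1), (2, 1, 1), (3, 2, 1);
""")

fixed_rows = conn.execute("""
    SELECT year, position_names, wins, losses, ties
    FROM  (
       -- Positions per year, each name listed once.
       -- group_concat is the SQLite stand-in for array_agg (an assumption).
       SELECT pe.year, group_concat(DISTINCT po.name) AS position_names
       FROM   periods_positions_coaches_linking pp
       JOIN   positions po ON po.id = pp.position
       JOIN   periods   pe ON pe.id = pp.period
       WHERE  pp.coach = 1
       GROUP  BY pe.year
       ) po
    LEFT   JOIN (
       SELECT pe.year, sum(pe.wins) AS wins, sum(pe.losses) AS losses, sum(pe.ties) AS ties
       FROM  (
          -- Reduce to one row per period before touching the numbers.
          SELECT period
          FROM   periods_positions_coaches_linking
          WHERE  coach = 1
          GROUP  BY period
          ) pp
       JOIN   periods pe ON pe.id = pp.period
       GROUP  BY pe.year
       ) pe USING (year)
    ORDER  BY year
""").fetchall()

# Print the numeric columns only; the order of names inside
# group_concat is not guaranteed.
for year, names, wins, losses, ties in fixed_rows:
    print(year, wins, losses, ties)
# 2010 7 4 1
# 2011 4 4 0
```

Each period now contributes to the sums exactly once, even though coach 1 held two positions in period 1.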
db<>fiddle here
Old sqlfiddle