Counts from another table don't exist yet to filter in WHERE

Question

I'm getting the following error:

[42703] ERROR: column "nonprimarycount" does not exist

With this query:

SELECT
  d.id,
  COUNT(1) FILTER (WHERE s.is_primary = True) AS primarycount,
  COUNT(1) FILTER (WHERE s.is_primary = False) AS nonprimarycount
FROM Dino d
LEFT JOIN Senator s ON d.id = s.id
WHERE nonprimarycount < 10
GROUP BY d.id
ORDER BY nonprimarycount DESC, 
            primarycount DESC;

It's clear to me that I can't do this because the nonprimarycount I made doesn't exist yet, but it's not clear to me how to do this correctly. I'd like to select only records with fewer than 10 nonprimary counts, and I need to construct that count in the query.

Unrelated, but: if you assumed that `count(1)´ is faster than `count(*)` you were wrong. — , Feb 01 '21 at 22:28
Is it /slower/ or just no difference and a silly way to write that? — Mittenchops, Feb 02 '21 at 00:10
Oh, I see this is contentious https://stackoverflow.com/questions/1221559/count-vs-count1-sql-server — Mittenchops, Feb 02 '21 at 00:11
https://blog.jooq.org/2019/09/19/whats-faster-count-or-count1/ — , Feb 02 '21 at 06:25

Erwin Brandstetter · Answer 1 · 2021-02-01T22:00:26.987

You error is WHERE instead of HAVING, like Gordon pointed out.

But I suggest to rewrite your query like this:

SELECT id
     , COALESCE(s.primarycount   , 0) AS    primarycount
     , COALESCE(s.nonprimarycount, 0) AS nonprimarycount
FROM   dino d
LEFT   JOIN (
   SELECT id
        , count(*) FILTER (WHERE is_primary) AS primarycount
        , count(*) FILTER (WHERE is_primary = false) AS nonprimarycount
   FROM   senator
   GROUP  BY id
   ) s USING (id)
WHERE  (s.nonprimarycount > 9) IS NOT TRUE
ORDER  BY nonprimarycount DESC, primarycount DESC;

Should be faster.

It's typically (much) faster to aggregate before the join.
Since we pushed the aggregation down into a subquery, we can now use WHERE in the outer SELECT instead of HAVING, removing your cause of error. Same performance, but simpler.
count(*) is faster and equivalent for the case.
is_primary = true is a noisy way of saying is_primary.
WHERE (s.nonprimarycount > 9) IS NOT TRUE is the same as WHERE (s.nonprimarycount < 10 OR s.nonprimarycount IS NULL).
We can now get NULL for "no row in table senator". I default to 0 like your original does. If you want to display the difference instead, remove COALESCE and use ORDER BY ... DESC NULLS LAST to sort NULL values last.
Notably, the (unqualified!) output column name will work in ORDER BY, even though it's the same as the input column name in the outer SELECT because, quoting the manual:

If an ORDER BY expression is a simple name that matches both an output column name and an input column name, ORDER BY will interpret it as the output column name. This is the opposite of the choice that GROUP BY will make in the same situation. This inconsistency is made to be compatible with the SQL standard.

If we can assume that every dino has at least one row in senator, we can simplify:

SELECT id
     , s.primarycount
     , s.nonprimarycount
FROM   dino d
JOIN  (
   SELECT id
        , count(*) FILTER (WHERE is_primary) AS primarycount
        , count(*) FILTER (WHERE is_primary = false) AS nonprimarycount
   FROM   senator
   GROUP  BY id
   ) s USING (id)
WHERE  s.nonprimarycount < 10
ORDER  BY s.nonprimarycount DESC, s.primarycount DESC;

Now, the counts cannot be NULL.

score 1 · Accepted Answer · answered Feb 01 '21 at 19:15

Use a having clause:

SELECT d.id,
       COUNT(1) FILTER (s.is_primary) AS primarycount,
       COUNT(1) FILTER (NOT s.is_primary) AS nonprimarycount
FROM Dino d LEFT JOIN
     Senator s ON d.id = s.id
GROUP BY d.id
HAVING COUNT(1) FILTER (NOT s.is_primary) < 10 
ORDER BY nonprimarycount desc, 
            primarycount desc;

I think Postgres requires that you repeat the expression, rather than using the column alias.

Counts from another table don't exist yet to filter in WHERE

2 Answers2