55

I have following query in PostgreSQL:

SELECT 
    COUNT(a.log_id) AS overall_count
FROM 
    "Log" as a, 
    "License" as b 
WHERE 
    a.license_id=7 
AND 
    a.license_id=b.license_id 
AND
    b.limit_call > overall_count
GROUP BY 
    a.license_id;

Why do I get this error:

ERROR: column "overall_count" does not exist

My table structure:

License(license_id, license_name, limit_call, create_date, expire_date)
Log(log_id, license_id, log, call_date)

I want to check if a license has reached the limit for calls in a specific month.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Hadi
  • 613
  • 1
  • 6
  • 7

3 Answers3

64
SELECT a.license_id, a.limit_call
     , count(b.license_id) AS overall_count
FROM   "License"  a
LEFT   JOIN "Log" b USING (license_id)
WHERE  a.license_id = 7 
GROUP  BY a.license_id  -- , a.limit_call  -- add in old versions
HAVING a.limit_call > count(b.license_id)

Since Postgres 9.1 the primary key covers all columns of a table in the GROUP BY clause. In older versions you'd have to add a.limit_call to the GROUP BY list. The release notes for 9.1:

Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause

Further reading:

The condition you had in the WHERE clause has to move to the HAVING clause since it refers to the result of an aggregate function (after WHERE has been applied). And you cannot refer to output columns (column aliases) in the HAVING clause, where you can only reference input columns. So you have to repeat the expression. The manual:

An output column's name can be used to refer to the column's value in ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses; there you must write out the expression instead.

I reversed the order of tables in the FROM clause and cleaned up the syntax a bit to make it less confusing. USING is just a notational convenience here.

I used LEFT JOIN instead of JOIN, so you do not exclude licenses without any logs at all.

Only non-null values are counted by count(). Since you want to count related entries in table "Log" it is safer and slightly cheaper to use count(b.license_id). This column is used in the join, so we don't have to bother whether the column can be null or not.
count(*) is even shorter and slightly faster, yet. If you don't mind to get a count of 1 for 0 rows in the left table, use that.

Aside: I would advise not to use mixed case identifiers in Postgres if possible. Very error prone.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Hi Erwin, Thanks. It work for me, but i don't want return b.limit_call in the result. btw what different with 9.1 version ? – Hadi Nov 14 '11 at 09:57
  • @Hadi: you can remove any column from the SELECT list in this query, no problem. See my amended answer for what's new in 9.1 – Erwin Brandstetter Nov 14 '11 at 10:10
  • @ErwinBrandstetter, how does the "new in 9.1" work if/since "you" join the primary key with, generally, many "b" records? – epox Aug 26 '20 at 20:46
  • 1
    @epox: Once the PK (implicitly `UNIQUE NOT NULL`) is in the `GROUP BY` list, adding any other column of the same table to the same `GROUP BY` list won't change the result. (Those are "functionally dependent".) That's true all the same after joining to many rows. – Erwin Brandstetter Aug 27 '20 at 03:45
  • I see, So `a` groups are UNIQUE (1 row) groups, and any `a`s column can be used in HAVING since 9.1. But if "you" needed `b` column in HAVING, you would not be able to use it without COUNT(b.x) or including into GROUP BY. Thanks – epox Aug 27 '20 at 10:14
  • @epox: Yes - only implemented for the PK, though. See: https://stackoverflow.com/q/62095987/939860 – Erwin Brandstetter Aug 27 '20 at 11:45
12

Pure conditional count(*):

  SELECT COUNT(*) FILTER(where a.myfield > 0) AS my_count
    FROM "Log" as a 

GROUP BY a.license_id

so you:

  • get 0 for groups where the condition never meets
  • can add as many count(*) columns as you need

Filter-out the groups, having condition mismatch:

NOTE: you cannot use HAVING b.limit_call > ..., unless you group by limit_call. But you can use an agregate function to map many "limit_calls" in the group into a single value. For example, in your case, you can use MAX:

  SELECT COUNT(a.log_id) AS overall_count
    FROM "Log" as a 
    JOIN "License" b ON(a.license_id=b.license_id)

GROUP BY a.license_id
  HAVING MAX(b.limit_call) > COUNT(a.log_id)

And don't care about duplicating COUNT(a.log_id) expression in the first and in the last lines. Postgres will optimize it.

epox
  • 9,236
  • 1
  • 55
  • 38
10

The where query doesn't recognize your column alias, and furthermore, you're trying to filter out rows after aggregation. Try:

SELECT 
COUNT(a.log_id) AS overall_count
FROM 
"Log" as a, 
"License" as b 
WHERE 
a.license_id=7 
AND 
a.license_id=b.license_id 
GROUP BY 
a.license_id
having b.limit_call > count(a.log_id);

The having clause is similar to the where clause, except that it deals with columns after an aggregation, whereas the where clause works on columns before an aggregation.

Also, is there a reason why your table names are enclosed in double quotes?

  • i got error "column "b.limit_call" must appear in the GROUP BY clause or be used in an aggregate function" when use having – Hadi Nov 14 '11 at 09:34
  • Then you'll need to group by that column, as well. Since I don't know the structure of your data, you'll have to figure out the appropriate list of columns for the `group by` clause. –  Nov 14 '11 at 09:36