PostgreSQL - GROUP BY clause

Question

I want to search by tags, and then list all articles with that tag, and also how many of given tags they match. So for example I might have:

 Page1 - 2 (has css and php tag)
 Page2 - 1 (has only css tag)

Query:

SELECT COUNT(t.tag)
FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id 
JOIN w_article a ON a.id = a2t.article 
WHERE t.tag = 'css' OR t.tag = 'php'
GROUP BY t.tag
LIMIT 9

When I only put COUNT(t.tag) the query works, and I get okay results. But if I append e.g. ID of my article I get following error:

ERROR: column "a.title" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT COUNT(t.tag), a.title FROM a_tags t

How to add said columns to this query?

score 8 · Answer 1 · answered Sep 24 '13 at 20:44

8

When you use a "GROUP BY" clause, you need to enclose all columns that are not grouped in an aggregate function. Try adding title to the GROUP BY list, or selecting "min(a.title)" instead.

SELECT COUNT(t.tag), a.title FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id 
JOIN w_article a ON a.id = a2t.article 
WHERE t.tag = 'css' OR t.tag = 'php' GROUP BY t.tag, a.title LIMIT 9

answered Sep 24 '13 at 20:44

Michael Krebs

1,212
10
11

yeah I tried to add MAX(a.title) and then at the end order by tags_count DESC.. that solved it, but I'm not sure why you need to put eg. in your case MIN(a.title) .. as I transfered from MySQL, i think that's not the case there so that's why it's a bit weird :O – Adrian Sep 24 '13 at 20:52
3

MySQL's `group by` is non standard and ambiguous http://stackoverflow.com/a/1777448/375874 – Falmarri Sep 24 '13 at 21:10
1

To be explicit, mysql does not require an aggregate function (like MIN or MAX) on columns in the select list which do not appear in the group by. This behavior can lead to unexpected results in some cases, so postgres does have this requirement, which you have noticed is different from mysql. – Michael Krebs Sep 24 '13 at 22:25
1

@Adrian: you might want to read this: http://rpbouman.blogspot.de/2007/05/debunking-group-by-myths.html and this: http://www.mysqlperformanceblog.com/2006/09/06/wrong-group-by-makes-your-queries-fragile/ to understand the possible pitfalls of MySQL's "loose" (not say: incorrect) `group by` handling. – Sep 25 '13 at 00:41

Erwin Brandstetter · Accepted Answer · 2021-10-23T22:13:33.063

8

Postgres 9.1 or later, quoting the release notes of 9.1 ...

Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause (Peter Eisentraut)

The SQL standard allows this behavior, and because of the primary key, the result is unambiguous.

Return a grouped list with occurrences using Rails and PostgreSQL

The queries in the question and in @Michael's answer have the logic backwards. We want to count how many tags match per article, not how many articles have a certain tag. So we need to GROUP BY w_article.id, not by a_tags.id.

list all articles with that tag, and also how many of given tags they match

To fix this:

SELECT count(t.tag) AS ct, a.*  -- any column from table a allowed ...
FROM   a_tags         t
JOIN   w_articles2tag a2t ON a2t.tag = t.id
JOIN   w_article      a   ON a.id = a2t.article
WHERE  t.tag IN ('css', 'php')
GROUP  BY a.id                  -- ... since PK is in GROUP BY
LIMIT  9;

Assuming id is the primary key of w_article.
However, this form will be faster while doing the same:

SELECT a.*, ct
FROM  (
   SELECT a2t.article AS id, count(*) AS ct
   FROM   a_tags         t
   JOIN   w_articles2tag a2t ON a2t.tag = t.id 
   GROUP  BY 1
   LIMIT  9      -- LIMIT early - cheaper
   ) sub
JOIN   w_article a USING (id);  -- attached alias to article in the sub

Closely related answer from just yesterday:

Why does the following join increase the query time significantly?

edited Oct 23 '21 at 22:13

answered Sep 24 '13 at 22:48

Erwin Brandstetter

605,456
145
1,078
1,228

Erwin, I invite you to be more encouraging when referring to the other contributors. The question and answer were not incorrect on principal, they were merely correct based on a different principal. The question asked was one of syntax, not semantics. Adrian provided code to support the syntax question. The answer provided the same level of detail as the question, and was restricted to the question asked. In reply, Adrian commented, "that solved it." You chose to answer with more detail about GROUP BY, and to improve the semantics also. I'm all for that. The other principal can be useful, too. – Michael Krebs Sep 26 '13 at 14:55
1

@MichaelKrebs: There was nothing even remotely offending or ad hominem. Did not mean to, either. Sorry if I stepped on your toe. Removed the bold and some noise. Just pointing out what's wrong, *which is what we do here on SO*. You got two upvotes for a half correct answer. Not that bad. And welcome to Stackoverflow. – Erwin Brandstetter Sep 26 '13 at 21:30
Hello Erwin. Apologies for reviving this ancient (excellent) answer. I'm facing a problem with this feature and wonder if you may just happen to know the answer. I recently had to reinstall postgres and found out that a SQL query that uses this "group by feature" works in my old postgres, but doesn't in my new postgres. Both are the same postgres version (!). Do you know if this "feature" is maybe a postgres server ini config of some sorts that has to be enabled? Thank you. – d-ph Mar 02 '23 at 16:24
Right, classic "stackoverflow" moment: found my answer 5 minutes after asking (even though spending an hour on this problem already). Solution: my db dump import somehow did not recreate primary indexes on the db table in question, so this "group by feature" silently didn't "kick in". I need to find out why my db export didn't reinstantiate primary keys, but otherwise: creating the missing primary keys solved the problem. Thanks. – d-ph Mar 02 '23 at 16:31

PostgreSQL - GROUP BY clause

2 Answers2

Linked