0

I have a query which works fine on SQLite, but when I run it on the same data in Postgresql I get:

column "role.id" must appear in the GROUP BY clause or be used in an aggregate function

I have three tables, for people, exhibitions, and a table that links the two: "One person in one exhibition performing a particular role" (such as "Artist" or "Curator"):

CREATE TABLE "person" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
                       "name" varchar(255));

CREATE TABLE "exhibition" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
                           "name" varchar(255));

CREATE TABLE "role" (`id` integer NOT NULL PRIMARY KEY AUTOINCREMENT,
                     `name` varchar(30) NOT NULL,
                     `exhibition_id` integer NOT NULL,
                     `person_id` integer NOT NULL, 
                     FOREIGN KEY(`exhibition_id`) REFERENCES `exhibition`(`id`),
                     FOREIGN KEY(`person_id`) REFERENCES `person`(`id`));

I want to display the people involved in an exhibition ordered by how many things they've done. So, I get the IDs of the people in an exhibition (1,2,3,4) and then do this:

SELECT
    *,
    COUNT(person.id) AS role_count
FROM person
INNER JOIN role
ON person.id = role.person_id
WHERE person.id IN ( 1, 2, 3, 4 )
GROUP BY person.id
ORDER BY role_count DESC

That orders the people by role_count, which is the number of roles they've had across all exhibitions

It works fine on SQLite, but not in Postgresql. I've tried putting role.id into the GROUP BY (instead of, and as well as, person.id) but that changes the results.

Phil Gyford
  • 13,432
  • 14
  • 81
  • 143
  • It's actually the other way round: SQLite is broken. This kind of `group by` is invalid and wrong. –  Nov 30 '16 at 19:58

2 Answers2

1

You know when you struggle for ages, post an SO question, and then immediately stumble on the answer?

From this answer I realised that I couldn't select role.id (which the SELECT * is implicitly doing) as it wasn't in the GROUP BY.

I couldn't add it to the GROUP BY (because that changes the results) so the solution was to not select it.

So I changed the SELECT part to:

SELECT
    person.*,
    COUNT(person.id) AS role_count
FROM person
...

Now role.id is not being selected. And that works.

If I needed any other fields from the role table, like name, I could add those explicitly too:

SELECT
    person.*,
    role.name,
    COUNT(person.id) AS role_count
FROM person
...
Community
  • 1
  • 1
Phil Gyford
  • 13,432
  • 14
  • 81
  • 143
1

Just like the error says, Standard SQL doesn't let you SELECT anything other than one of the GROUP BY columns or a call to an aggregate function. (For a logical reason: How would the RDBMS know which role.id to select when there are multiple rows to select from within a group?) PostgreSQL actually enforces this rule; SQLite ignores it and just returns data from an arbitrary row in the group.

As you discovered, omitting role.id from the SELECT fixes your error. But if you do want SQLite's behavior of selecting the ID from an arbitrary row, you can simply wrap it in an aggregate function, e.g., SELECT MAX(role.id) instead of just SELECT role.id).

dan04
  • 87,747
  • 23
  • 163
  • 198