9

I am trying to load the latest 10 Arts grouped by the user_id and ordered by created_at. This works fine with SqlLite and MySQL, but gives an error on my new PostgreSQL database.

Art.all(:order => "created_at desc", :limit => 10, :group => "user_id")

ActiveRecord error:

Art Load (18.4ms)  SELECT "arts".* FROM "arts" GROUP BY user_id ORDER BY created_at desc LIMIT 10
ActiveRecord::StatementInvalid: PGError: ERROR:  column "arts.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT  "arts".* FROM "arts"  GROUP BY user_id ORDER BY crea...

Any ideas?

atmorell
  • 3,052
  • 6
  • 30
  • 44

4 Answers4

9

The sql generated by the expression is not a valid query, you are grouping by user_id and selecting lot of other fields based on that but not telling the DB how it should aggregate the other fileds. For example, if your data looks like this:

a  | b
---|---
1  | 1
1  | 2
2  | 3

Now when you ask db to group by a and also return b, it doesn't know how to aggregate values 1,2. You need to tell if it needs to select min, max, average, sum or something else. Just as I was writing the answer there have been two answers which might explain all this better.

In your use case though, I think you don't want a group by on db level. As there are only 10 arts, you can group them in your application. Don't use this method with thousands of arts though:

 arts = Art.all(:order => "created_at desc", :limit => 10)
 grouped_arts = arts.group_by {|art| art.user_id}
 # now you have a hash with following structure in grouped_arts
 # { 
 #    user_id1 => [art1, art4],
 #    user_id2 => [art3],
 #    user_id3 => [art5],
 #    ....
 # }

EDIT: Select latest_arts, but only one art per user

Just to give you the idea of sql(have not tested it as I don't have RDBMS installed on my system)

SELECT arts.* FROM arts
WHERE (arts.user_id, arts.created_at) IN 
  (SELECT user_id, MAX(created_at) FROM arts
     GROUP BY user_id
     ORDER BY MAX(created_at) DESC
     LIMIT 10)
ORDER BY created_at DESC
LIMIT 10

This solution is based on the practical assumption, that no two arts for same user can have same highest created_at, but it may well be wrong if you are importing or programitically creating bulk of arts. If assumption doesn't hold true, the sql might get more contrieved.

EDIT: Attempt to change the query to Arel:

Art.where("(arts.user_id, arts.created_at) IN 
             (SELECT user_id, MAX(created_at) FROM arts
                GROUP BY user_id
                ORDER BY MAX(created_at) DESC
                LIMIT 10)").
    order("created_at DESC").
    page(params[:page]).
    per(params[:per])
rubish
  • 10,887
  • 3
  • 43
  • 57
  • Well I have 6000 records, so this might become a performance problem. I see that you have two records for user_id1. I am trying to load the latest 10 Art's - only one pr. user. – atmorell Aug 05 '11 at 09:14
  • This will not be a performance problem, until you have a limit clause of 10, but the second issue still remains. I guess there can be raw sql to handle what you want, but converting it to Arel might be difficult. – rubish Aug 05 '11 at 09:23
  • Updated the answer, but not sure if its syntactically correct. Let me know if this works. – rubish Aug 05 '11 at 09:35
  • Your example works. I am having a lot of trouble getting this to play along with pagination. Could this be done with Active Record, so that you can call method like .page(params[:page]).per(20) etc. – atmorell Aug 07 '11 at 21:59
  • Attempted to convert it into arel syntax, but this query is difficult to convert. – rubish Aug 08 '11 at 06:08
  • You updated example works perfectly. I was wondering if it would be possible to fetch a random record from each user instead of always the latest one, but still sort the Array by the latest Arts. – atmorell Aug 08 '11 at 14:05
  • The `ORDER BY MAX(created_at) DESC` part of the query here doesn't seem to be needed, BTW. – seanlinsley Mar 18 '14 at 00:18
6

You need to select the specific columns you need

Art.select(:user_id).group(:user_id).limit(10)

It will raise error when you try to select title in the query, for example

Art.select(:user_id, :title).group(:user_id).limit(10)

column "arts.title" must appear in the GROUP BY clause or be used in an aggregate function

That is because when you try to group by user_id, the query has no idea how to handle the title in the group, because the group contains several titles.

so the exception already mention you need to appear in group by

Art.select(:user_id, :title).group(:user_id, :title).limit(10)

or be used in an aggregate function

Art.select("user_id, array_agg(title) as titles").group(:user_id).limit(10)

Community
  • 1
  • 1
Ilake Chang
  • 1,542
  • 1
  • 14
  • 19
2

Take a look at this post SQLite to Postgres (Heroku) GROUP BY

PostGres is actually following the SQL standard here whilst sqlite and mysql break from the standard.

Community
  • 1
  • 1
John Beynon
  • 37,398
  • 8
  • 88
  • 97
0

Have at look at this question - Converting MySQL select to PostgreSQL. Postgres won't allow a column to be listed in the select statement that isn't in the group by clause.

Community
  • 1
  • 1
ipr101
  • 24,096
  • 8
  • 59
  • 61