0

I have two columns in a table:

rating_total integer
rating_count integer

rating_total is a sum of total rates, and rating count is a number of total rating. I want to order by average rate:

ORDER BY (rating_total/rating_count) DESC, id ASC

Should I create separate column for average_rate? Or it is possible to create some index to speed this up?


I've managed to create index like this:

CREATE INDEX "posts-avg-rating-index" 
ON "Posts" (("rating_total"::float/(CASE "rating_count" WHEN 0 THEN NULL ELSE "rating_count" END)) DESC NULLS LAST, id DESC);'

Which works for ordering like this:

ORDER BY (("rating_total"::float/(CASE "rating_count" WHEN 0 THEN NULL ELSE "rating_count" END)) DESC NULLS LAST, id DESC)
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
user606521
  • 14,486
  • 30
  • 113
  • 204
  • Why not just `avg(rating)` (assuming you have a rating column)? –  Feb 23 '15 at 09:22
  • 1
    you should use separate `column` or take a [look](http://stackoverflow.com/a/4068083/2749470) on this answer – Bhargav Modi Feb 23 '15 at 09:22
  • @JackManey I am not performing any grouping, each row has already calculated rating_total & rating_count. – user606521 Feb 23 '15 at 09:25
  • @BhargavModi so it is not possible to set up index for such query? – user606521 Feb 23 '15 at 09:26
  • @user606521 take look on my previous edited comment – Bhargav Modi Feb 23 '15 at 09:27
  • 5
    Did you try to create an index on `( (rating_total/rating_count), id )`? You should also read this: http://stackoverflow.com/tags/postgresql-performance/info and add the missing information. –  Feb 23 '15 at 09:28
  • @a_horse_with_no_name Will such index work with `rating_total / (CASE rating_count WHEN 0 THEN NULL ELSE rating_count END) DESC NULLS LAST, id ASC`? – user606521 Feb 23 '15 at 09:34
  • 1
    most likely no, you probably should use same expression at index definition as you have in order by clause then it will work for sure. – alexius Feb 23 '15 at 11:42
  • there is not much you can do here apart from materializing the query or parts of it. you cannot create an index on the order clause as it is a dynamically generated value. – Hans-Jürgen Schönig Feb 23 '15 at 12:27
  • Well, it seems to be working for me (see edit) - I can see "Index Scan" in EXPLAIN ANALYZE when using proper ordering. – user606521 Feb 23 '15 at 14:27
  • 2
    You should really avoid quoted identifiers. They are much more trouble in the long run than they are worth it. –  Feb 23 '15 at 14:39
  • @a_horse_with_no_name Hmm, most of my column names are camel-cased so they have to be quoted. Do you suggest to use underscored column names instead? To be honest I am using postgres for fetching data returned by REST API in JSON format, where camel-casing attributes in objects is a common convention. – user606521 Feb 23 '15 at 15:10
  • 2
    Column names in PostgreSQL should always be lowercased with underscores used as word/token separators. Anything else is just asking for pain. And just because your data source is JSON from a REST API, that doesn't mean that the column names in your database have to follow the same naming conventions as the keys in the JSON responses. –  Feb 23 '15 at 15:23
  • As almost always, a table definition (`\d tbl` in psql) would have clarified a lot to start with. – Erwin Brandstetter Feb 23 '15 at 16:25

1 Answers1

2

Your question created some confusion, since an "average rating" would normally be a value calculated across multiple rows, which cannot be indexed.

In your case you obviously want to order by the quotient of two columns in the same rows, which happen to be a "total" and a "count". For this case you can have an expression index, of course - you found one yourself already.

You can simplify with NULLIF:

CREATE INDEX posts_avg_rating_index
ON "Posts" (cast(rating_total AS float)/NULLIF(rating_count, 0) DESC NULLS LAST, id DESC);

Use the same expression in your query.

About cast() shorthand syntax name::type and expression indexes:

And rather use legal names. A dash (-) in identifiers is legal while double-quoted, but just asking for trouble.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Is there a compelling reason to use a expression index over a materialized view and index or a [trigger updated] computed value column? It seems like the expression index would need to store the result internally anyway.. – user2864740 Feb 23 '15 at 16:31
  • 1
    @user2864740: It's a good question. Both solutions have their own costs, benefits and side effects. Please ask a new *question* for this. Comments are not the place. You can always link to this question for context. – Erwin Brandstetter Feb 23 '15 at 16:53