2

I have the next problem in Postgres.

I have products with reviews, each review has votes.

So here are two tables, a "votes" table and a "reviews" table:

Table "votes"

 Column   |  Type   |                     KEYS                      
-----------+---------+----------------------------------------------------
 id        | integer |
 user_id   | integer | FK to users table
 review_id | integer | FK to reviews table

Table "reviews"

   Column   |            Type             |                      KEYS                       
------------+-----------------------------+--------------------------------------
 id         | integer                     |  
 product_id | integer                     |  FK to products
 user_id    | integer                     |  FK to users

I would like to get the reviews with most votes for each product. In other words, I need to get the review_id that has the maximun number of votes for each product.

Here is the query I am using:

SELECT max(votes_per_review.votes), reviews.product_id
     FROM (SELECT count(*) AS votes,review_id FROM votes GROUP BY review_id) AS votes_per_review, reviews
     WHERE votes_per_review.review_id = reviews.id
     GROUP BY reviews.product_id

I get the number of votes for the most popular review for each product and the product id, but I am not getting the review_id.

I know that you can think that this question is resolved in StackOverflow like here. However I can't resolve this problem with that post or trying to use OVER PARTITION.

Thanks and sorry in advance if the question is resolved in StackOverflow.

Community
  • 1
  • 1
Ekaitz Hernandez Troyas
  • 1,147
  • 2
  • 11
  • 18

1 Answers1

1

This should return the review with the most products using distinct on:

select distinct on (r.product) rv.*
from (select r.product, r.review_id, count(*) as numvotes
      from reviews r join
           votes v
           on r.id = v.review_id
      group by r.product, r.review_id
     ) rv
order by r.product, numvotes desc;

This can also be accomplished using window functions:

select rv.*
from (select r.product, r.review_id, count(*) as numvotes,
             row_number() over (partition by r.product order by count(*) desc) as seqnum
      from reviews r join
           votes v
           on r.id = v.review_id
      group by r.product, r.review_id
     ) rv
where seqnum = 1;

This uses ANSI standard functionality, that should be available in almost any database.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • `distinct on` with `order by` will select the first product? – Juan Carlos Oropeza Nov 13 '15 at 15:27
  • @JuanCarlosOropeza . . . This is Postgres-only functionality and quite useful. In many ways, it does what people think `select distinct` should do. The documentation is here: http://www.postgresql.org/docs/9.3/static/sql-select.html#SQL-DISTINCT. – Gordon Linoff Nov 13 '15 at 15:29
  • what a great function, some day db should join and get the best function on one single place. – Juan Carlos Oropeza Nov 13 '15 at 15:32
  • 1
    @JuanCarlosOropeza . . . Postgres comes pretty close to being that database. If only it supported hints for the optimizer . . . ;) – Gordon Linoff Nov 13 '15 at 15:36