0

In my database users can add a vote from 1 to 5 stars to every groups. Then I have to display a leaderboard by those votes. What I was doing until now is to order them by votes average without a weight. This is not so nice because a group having 5.0 with 20 votes is before of a group having 4.9 avg and 10000 votes.

This is my votes table:

CREATE TABLE IF NOT EXISTS votes(
user_id BIGINT,
group_id BIGINT,
vote SMALLINT,
vote_date timestamp,
PRIMARY KEY (user_id, group_id)

This is how I sort them now:

SELECT 
      group_id,
      COUNT(vote) AS amount,
      ROUND(AVG(vote), 1) AS average, 
      RANK() OVER(PARTITION BY s.lang ORDER BY ROUND(AVG(VOTE), 1)DESC, COUNT(VOTE)DESC)
FROM votes
LEFT OUTER JOIN supergroups AS s 
USING (group_id)
GROUP BY group_id, s.lang, s.banned_until, s.bot_inside
HAVING 
      (s.banned_until IS NULL OR s.banned_until < now()) 
      AND COUNT(vote) >= %s 
      AND s.bot_inside IS TRUE

How could I add a sort of weight to solve the problem I said before?

I read about bayesan approach here but I am not sure if it's the right thing because I read it's about to sort the top 'n' elements, while I have to do a leaderboard including anyone of them.

double-beep
  • 5,031
  • 17
  • 33
  • 41
91DarioDev
  • 1,612
  • 1
  • 14
  • 31
  • This post is going to be opinion based I'm guessing. There are lots of options and each has its merits. You could do a weighting system, you could not display it in ranks until is has 20/50/100 votes, etc. For example, I went to Amazon and searched Computer and the 5th result has under 20 reviews but has 5 stars. – dfundako Mar 29 '18 at 17:20
  • @dfundako in fact as you can read in the query i used "AND COUNT(vote) >= %s" to avoid groups with few votes but i think i need something to weight them. i read about bayesan approach and wondering if can be good in my case – 91DarioDev Mar 29 '18 at 17:23
  • Ok, then set it up, dump a bunch of test data in there, and see if you like the ordering that results from it. – dfundako Mar 29 '18 at 17:33
  • Why not just order by COUNT(VOTE)DESC instead? Then u will have the group with 4.9 come first. – Daniel Marcus Mar 29 '18 at 19:32
  • @DanielMarcus i have to take into consideration also the average otherwise a group with 2.0 avg and 1000 will be on top and that's bad – 91DarioDev Mar 29 '18 at 19:39
  • ok so maybe you can assign a weight based on the number of digits in the votecount? so a votecount with 2 digits is worth .5 and a votecount with 3 digits is worth .7 and a vote count with 4 is worth .9 something like that – Daniel Marcus Mar 29 '18 at 19:44

1 Answers1

0

you're going to have to fudge it somehow, perhaps this way.

order by (0.0+sum(vote))/(count(vote)+log(count(vote)))

Or sqrt might work better than log, it depends how much weight you want the population size to have.

order by (0.0+sum(vote))/(count(vote)+sqrt(count(vote)))

basically the fudge needs to be a function that increases at a slower rate than it input. you could even try a constant.

Jasen
  • 11,837
  • 2
  • 30
  • 48