1

Consider the following:

user_id weight
      1   2
      2   5

Is there a query that could select a user ID at random while using the weights to affect the odds of which ID gets chosen?

In other words, in such a scenario, is there a query that would look at this table and then choose 1 user ID at random from the following group:

[UserID_001, UserID_001, UserID_002, UserID_002, UserID_002, UserID_002,  UserID_002]

So that UserID_002 has a 5/7 chance of being selected, and UserId_001 has a 2/7 chance of being selected.

This would be like each user having a number of tickets in a raffle, User_001 has 2 tickets, and User_002 has 5 tickets. Is it possible to do such a thing with an SQL query? Or do I need to use some other method?

Thanks in advance!

Strawberry
  • 33,750
  • 13
  • 40
  • 57
Emily
  • 1,151
  • 4
  • 21
  • 42
  • 2
    it would be a lot easier to do this in the application level - otherwise, look [here](http://stackoverflow.com/questions/2417621/mysql-select-random-entry-but-weight-towards-certain-entries) – pala_ May 07 '15 at 16:03
  • Hi @pala_ thanks for your comment, I read the link you gave and I'm now testing the following query on my database: ``SELECT * FROM mytable ORDER BY (RAND( ) * weight) LIMIT 1;`` this appears to be giving a random ID each time, I was wondering if I could ask, does it look correct to you? – Emily May 07 '15 at 16:33
  • @pala_ actually, after doing some testing, it appears this query works: ``SELECT * FROM mytable ORDER BY -LOG(RAND()) / weight LIMIT 1;`` – Emily May 07 '15 at 16:53

2 Answers2

1

You want a weighted sampling. This is actually pretty easy, once you have a cumulative sum of the all the weights.

select t.*
from (select t.*,
             (@w := @w + weight) as cumew
      from table t cross join
           (select @w := 0) init
     ) t
where @w * rand() <= cumew
order by weight
limit 1;

Note in the outer query that @w is the sum of all the weights. So, it multiplies this by a random number and finds the first cumulative weight that is greater than or equal to this value.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Here is solution for your lottery :-)

If you need see how it works, just comment last line of query WHERE ...

http://sqlfiddle.com/#!9/54a44/17

SELECT *
FROM (SELECT *, 
IF(@total IS NULL,@first:=1, @first:=@total+1) from_number,
IF(@total IS NULL,@total:=tickets, @total:=@total+tickets) to_number,
IF(@winner IS NULL,@winner:= FLOOR( 1 + RAND( ) * t_sum.t_all), @winner) win 
FROM tickets
LEFT JOIN (SELECT SUM(tickets) t_all FROM tickets) t_sum
ON 1) t
WHERE win >= from_number AND win <= to_number
Alex
  • 16,739
  • 1
  • 28
  • 51