I have a table keywords
with columns keyword
and weight
. My goal is to randomly select one keyword
, but to regard its weight
(probability). I found two ways to solve this, where the latter one is more elegant (and consumes less ressources) - but i dont get it to run. See yourself.
The table and records:
CREATE TABLE IF NOT EXISTS `keywords` (
`keyword` varchar(100) COLLATE utf8_bin NOT NULL,
`weight` int(11) NOT NULL,
UNIQUE KEY `keywords` (`keyword`),
KEY `rate` (`weight`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
INSERT INTO `keywords` (`keyword`, `weight`) VALUES
('google', 50),
('microsoft', 20),
('apple', 10),
('yahoo', 5),
('bing', 5),
('xing', 5),
('cool', 5);
Query 1
Consumes more ressources, i work on 5k+ records. Source is Why would this MySQL query using rand() return no results about a third of the time?:
SELECT * FROM `keywords` ORDER BY -LOG(1.0 - RAND()) / weight LIMIT 1
Query 2
Sums up weights to @weight_sum
. Sets @weight_point
to RAND()
number from within that range. Loops through all records, substracting weight
from @weight_pos
and setting @keyword
to the current keywords.keyword
. Until @weight_pos < 0
. Then it keeps that keyword
. Source is Random Weighted Choice in T-SQL
SET @keyword = 0;
SET @weight_sum = (SELECT SUM(weight) FROM keywords);
SET @rand = RAND();
SET @weight_point = ROUND(((@weight_sum - 1) * @rand + 1), 0);
SET @weight_pos = @weight_point;
SELECT
keyword,
weight,
@keyword:=CASE
WHEN @weight_pos < 0 THEN @keyword
ELSE keyword
END AS test,
(@weight_pos:=(@weight_pos - weight)) AS curr_weight,
@weight_point,
@keyword,
@weight_pos,
@rand,
@weight_sum
FROM
keywords;
See phpmyadmin results here http://postimg.org/image/stgpd776f/
My Question
How do i get the value in @keyword
, or what the test
column holds in the end? Adding a SELECT @keyword
afterwards doesn't change anything.