0

I have a table keywords with columns keyword and weight. My goal is to randomly select one keyword, but to regard its weight (probability). I found two ways to solve this, where the latter one is more elegant (and consumes less ressources) - but i dont get it to run. See yourself.

The table and records:

CREATE TABLE IF NOT EXISTS `keywords` (
  `keyword` varchar(100) COLLATE utf8_bin NOT NULL,
  `weight` int(11) NOT NULL,
  UNIQUE KEY `keywords` (`keyword`),
  KEY `rate` (`weight`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

INSERT INTO `keywords` (`keyword`, `weight`) VALUES
('google', 50),
('microsoft', 20),
('apple', 10),
('yahoo', 5),
('bing', 5),
('xing', 5),
('cool', 5);

Query 1

Consumes more ressources, i work on 5k+ records. Source is Why would this MySQL query using rand() return no results about a third of the time?:

SELECT *  FROM `keywords` ORDER BY -LOG(1.0 - RAND()) / weight LIMIT 1

Query 2

Sums up weights to @weight_sum. Sets @weight_point to RAND() number from within that range. Loops through all records, substracting weight from @weight_pos and setting @keyword to the current keywords.keyword. Until @weight_pos < 0. Then it keeps that keyword. Source is Random Weighted Choice in T-SQL

SET @keyword = 0;
SET @weight_sum = (SELECT SUM(weight) FROM keywords);
SET @rand = RAND();
SET @weight_point = ROUND(((@weight_sum - 1) * @rand + 1), 0);
SET @weight_pos = @weight_point;

SELECT
    keyword,
    weight,

    @keyword:=CASE
        WHEN @weight_pos < 0 THEN @keyword
        ELSE keyword
    END AS test,

    (@weight_pos:=(@weight_pos - weight)) AS curr_weight,
    @weight_point,
    @keyword,
    @weight_pos,
    @rand,
    @weight_sum
FROM
    keywords;

See phpmyadmin results here http://postimg.org/image/stgpd776f/

phpmyadmin

My Question

How do i get the value in @keyword, or what the test column holds in the end? Adding a SELECT @keyword afterwards doesn't change anything.

Community
  • 1
  • 1
spaxxUnited
  • 625
  • 1
  • 7
  • 16

1 Answers1

0

Ok, i think my question was more or less a basic mysql-question. I achieved what i wanted by encapsulating the above SELECT statement into another SELECT, that then filtered the first one's result for what i searched. Sorry for bothering you. See the query:

SET @keyword = 0;
SET @weight_sum = (SELECT SUM(weight) FROM keywords);
SET @rand = RAND();
SET @weight_point = ROUND(((@weight_sum - 1) * @rand + 1), 0);
SET @weight_pos = @weight_point;
SELECT t.test FROM (
SELECT
    keyword,
    weight,
    @keyword:=CASE
        WHEN @weight_pos < 0 THEN @keyword
        ELSE keyword
    END AS test,
    (@weight_pos:=(@weight_pos - weight)) AS curr_weight,
    @weight_point,
    #@keyword,
    @weight_pos,
    @rand,
    @weight_sum
FROM
    keywords

) AS t

WHERE
    t.curr_weight < 0
LIMIT
    1;
spaxxUnited
  • 625
  • 1
  • 7
  • 16